MSP432E401Y: Software Crash due to Mutex in CCS/TIRTOS

Part Number: MSP432E401Y
Other Parts Discussed in Thread: SYSBIOS

Hi,

Microcontroller: MSP432E401Y

Tool: CCS

The software runs smoothly when it is built in the debug configuration.

But when we build it in the release configuration, it hangs after a few minutes.

The issue appears to come from a mutex.

While this was happening, I was observing the ROV window. I am attaching the error that popped up.

Can anyone please help me resolve this unexpected behavior?


Thanks & Regards

Ashwini Kumar

  • These can be a bit tricky to debug when they just stop. Are you able to:

    1) Suspend debug and see where the code is running - is it stuck somewhere?

    2) Can you see the Task view (Detailed option) in ROV?

    Best regards

    Jim

  • The software runs smoothly when it is built in the debug configuration.

    But when we build it in the release configuration, it hangs after a few minutes.

    Please carefully check the differences between the build options of the two configurations. Typically the difference is mostly related to optimization (the "release" configuration usually has more optimization enabled, though this can vary by project), but something else that is required may also be missing.

    Thanks

    ki

  • Yes, the code got stuck in queue.c. I am attaching an image showing where it got stuck. I was monitoring the tasks, so I could tell which task it got stuck in.

    I then commented out the mutex in that task and it started running smoothly, but I don't understand why this unexpected behaviour occurs.

  • Hi,

    I checked the difference and found that the release build has optimization enabled, while the debug build uses no optimization.

    When I remove the optimization, the release build works smoothly.

    The code used to get stuck in queue.c. I attached the image in my previous reply.

    This is how I am defining the mutex in the code:

    MutexP_Handle CommunicationProtocol::GetMutex(MutexP_Handle handle) //!< [in] handle - Handle for mutex
    {
        MutexP_Params params;
        MutexP_Params_init(&params);
        handle = MutexP_create(&params);
        return handle;
    }

    mChargerPortLock = GetMutex(mChargerPortLock);

    bool Send_1_byte_to_slave_over_I2C(uint16_t slaveID) //!< [in] slaveID - Device slave id
    {
        /* check communication link between MCU and slave device */
        I2C_Transaction i2cTransaction_test = {0};
        uint8_t writeBuffer[1];
        bool status;
        bool addressAcknowledgement;
        uint32_t key;

        writeBuffer[0] = 0x01;

        i2cTransaction_test.slaveAddress = slaveID;
        i2cTransaction_test.writeBuf = writeBuffer;
        i2cTransaction_test.writeCount = 1;
        i2cTransaction_test.readBuf = NULL;
        i2cTransaction_test.readCount = 0;

        key = MutexP_lock(mChargerPortLock);
        int_fast16_t i2c_status = I2C_transferTimeout(mpi2chandle, &i2cTransaction_test, 100);

        if ( (I2C_STATUS_BUS_BUSY == i2c_status) || (I2C_STATUS_INCOMPLETE == i2c_status) || (I2C_STATUS_TIMEOUT == i2c_status) || (I2C_STATUS_ADDR_NACK == i2c_status))
        {
            addressAcknowledgement = false;
        }
        else if ( I2C_STATUS_SUCCESS == i2c_status )
        {
            addressAcknowledgement = true;
        }
        MutexP_unlock(mChargerPortLock, key);

        return addressAcknowledgement;
    }
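
    One detail worth noting in the function above: addressAcknowledgement is assigned only for the status values that are listed, so any other return code leaves it uninitialized, and reading an uninitialized variable can behave differently between optimized and unoptimized builds. Nothing in this thread confirms that as the cause of the hang. A minimal sketch of a mapping that always returns a defined result, using hypothetical stand-in constants (the kStatus names and values are illustrative only and are not the definitions from I2C.h):

    ```cpp
    #include <cstdint>

    namespace demo {

    // Hypothetical stand-ins for the driver's status codes. In a real project
    // the values come from the driver header and may differ from these.
    constexpr std::int_fast16_t kStatusSuccess = 0;
    constexpr std::int_fast16_t kStatusError   = -1;
    constexpr std::int_fast16_t kStatusTimeout = -3;

    // Returns true only when the transfer reported success. Every other
    // status, including ones not listed explicitly, maps to false, so the
    // result is defined on every path.
    inline bool addressAcknowledged(std::int_fast16_t status)
    {
        return status == kStatusSuccess;
    }

    } // namespace demo
    ```

    With this shape the caller never returns an indeterminate value, whatever status the driver reports.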

  • Hi, 

    I am attaching an image with more detail on where it gets stuck. The right side shows the disassembly.

    I hope it gives better clarity on what can go wrong in the release build. I set the optimization to level 0 and the issue appears after a few minutes.

  • The processor can't just get stuck on that line - it is just another op-code. If you single-step, it should just move on!
    The most likely possibilities are:
    This is a semaphore post so, somewhere around there, it is indicating that another thread can run. You might expect to see it step into the task scheduler then return to a different task but this can be a bit tricky to follow!
    Or
    An interrupt has occurred, then the interrupt handler has allowed another process to pre-empt.
    Or
    It has branched to an exception handler - perhaps one of those pointers is null or out of range.

    The ROV task view shows which task is running - which should usually be the Idle task. You might catch another task running, but if you keep refreshing, it should usually be idle running. Use the Detailed view option to check no task has exceeded its stack size. Can you post an image of the ROV task view?

    If you suspend execution from the debug menu, you will see if the processor has jumped to the exception handler - it will be on a line with the comment "loop here forever". This is usually because of task stack overflow, or some other assert or null pointer de-reference.
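
    To narrow down a fault like this on a Cortex-M4 device, the Configurable Fault Status Register (CFSR, at address 0xE000ED28) can be read in the debugger once execution is suspended in the exception handler; bits 8 to 15 describe bus faults. A minimal sketch of a decoder, assuming the CFSR value has been copied by hand from the debugger (decodeBusFault is a hypothetical helper written for this illustration, not part of any TI library):

    ```cpp
    #include <cstdint>
    #include <string>
    #include <vector>

    // Returns the names of the bus-fault flags that are set in a CFSR value.
    // Bit positions follow the ARMv7-M BusFault Status Register layout.
    std::vector<std::string> decodeBusFault(std::uint32_t cfsr)
    {
        struct Flag { std::uint32_t mask; const char *name; };
        static const Flag kFlags[] = {
            {1u << 8,  "IBUSERR"},     // instruction bus error
            {1u << 9,  "PRECISERR"},   // precise data bus error
            {1u << 10, "IMPRECISERR"}, // imprecise data bus error
            {1u << 11, "UNSTKERR"},    // fault while unstacking on exception return
            {1u << 12, "STKERR"},      // fault while stacking on exception entry
            {1u << 13, "LSPERR"},      // fault during lazy FP state preservation
            {1u << 15, "BFARVALID"},   // BFAR holds the faulting address
        };

        std::vector<std::string> names;
        for (const Flag &flag : kFlags) {
            if ((cfsr & flag.mask) != 0u) {
                names.push_back(flag.name);
            }
        }
        return names;
    }
    ```

    If BFARVALID is reported, the Bus Fault Address Register (0xE000ED38) holds the address whose access caused the fault, which helps tell a null or stray pointer from a stack problem.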

  • Hi Jim,

    I tried all the steps you mentioned.

    I ran the code and switched the ROV task view to the detailed view. Please find the attached image below.

    The code got stuck, and when I suspended execution I could see that the processor had jumped to the exception handler. It was a bus fault. Please find the attached image below.

    I then pressed step-in and the code jumped to the queue code. Please find the attached image below.

    I pressed step-in again. I think the code is looping forever, because it jumps to the exception handler again and then back to the queue code.

  • It is a little hard to see what is happening without all of the code but we can see that it appears to get stuck in the BatteryManagement_task which the ROV shows as running.

    It appears to be having a bus fault from within the Mutex code - perhaps a pointer is not set correctly so trying to address thin air!

    Since this code is pretty well tried and tested, I would guess that it is somehow not being used quite correctly!

    Are you sure that mChargerPortLock = GetMutex(mChargerPortLock); has been called before trying to use the mutex?
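
    One way to make that kind of mistake visible is to route every lock through a small guard that checks the handle first. The sketch below is only an illustration under stated assumptions: DemoMutex, demoLock and demoUnlock are hypothetical stand-ins for the real RTOS mutex calls, and ScopedLock is not part of any TI library.

    ```cpp
    #include <cstdint>

    // Hypothetical stand-ins for an RTOS mutex, used only for illustration.
    struct DemoMutex { int lockDepth = 0; };
    using DemoMutexHandle = DemoMutex *;

    inline std::uintptr_t demoLock(DemoMutexHandle h) { ++h->lockDepth; return 0; }
    inline void demoUnlock(DemoMutexHandle h, std::uintptr_t) { --h->lockDepth; }

    // Locks in the constructor and unlocks in the destructor, but only when
    // the handle is non-null, so a missing initialization is detectable
    // instead of being dereferenced.
    class ScopedLock {
    public:
        explicit ScopedLock(DemoMutexHandle handle) : handle_(handle), key_(0)
        {
            if (handle_ != nullptr) {
                key_ = demoLock(handle_);
            }
        }

        ~ScopedLock()
        {
            if (handle_ != nullptr) {
                demoUnlock(handle_, key_);
            }
        }

        ScopedLock(const ScopedLock &) = delete;
        ScopedLock &operator=(const ScopedLock &) = delete;

        // False when the handle was never created.
        bool locked() const { return handle_ != nullptr; }

    private:
        DemoMutexHandle handle_;
        std::uintptr_t key_;
    };
    ```

    A caller can test locked() and report an error rather than continuing with an unprotected bus access; the unlock also happens on every return path.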

  • I checked the difference and found that the release build has optimization enabled, while the debug build uses no optimization.

    When I remove the optimization, the release build works smoothly.

    Thanks. It looks like an issue with the optimized code.

  • Yes, I call "mChargerPortLock = GetMutex(mChargerPortLock);" in the constructor.

    The code works fine in the release build when optimization is off.

    But even if I enable only the most basic optimization, i.e. level O0, I get the same error from the mutex.

    Can you suggest what could be going wrong?

    release.cfg contains this declaration: "var GateMutexPri = xdc.useModule('ti.sysbios.gates.GateMutexPri');"

  • Yes. Can you suggest what could be going wrong?

    release.cfg contains this declaration: "var GateMutexPri = xdc.useModule('ti.sysbios.gates.GateMutexPri');"

  • Does it fail on the first pass? If so, can you set a breakpoint just before and step in? It would be useful to know exactly what is causing the bus fault. I assume that it is some pointer that is null or pointing somewhere illegal.

    It may well be that the optimiser is not directly the problem, but just changing the order that threads try and access something. It looks like the bus fault happens in the mutex so, unless you are changing between release and debug builds of the library, this is probably identical code!

  • No, it fails after approximately 20 minutes. That is why I am not able to trace it with a breakpoint.

  • Hi,

    I don't know the root cause yet. Can you increase the heap memory as well as the stack? Does it make a difference? Is this problem occurring on your own custom board? Can you replicate the same issue on a different custom board or on a LaunchPad?

  • Hi Ashwini,

    I have not heard back from you. You have created a new post for a different question at https://e2e.ti.com/support/microcontrollers/arm-based-microcontrollers-group/arm-based-microcontrollers/f/arm-based-microcontrollers-forum/1215000/msp432e401y-how-to-calculate-spent-time-in-idle-task/4609666?tisearch=e2e-sitesearch&keymatch=%20user%3A547112#4609666. Therefore, I assume you have moved past the issue in this thread. I will close the thread for now. If you have any update, you can write back to this post and its status will change to open. I will then be notified.

  • Hi, 

    I was trying to understand more about this problem, so it took some time.

    The I2C_transferTimeout API is a blocking call. In blocking mode, the I2C driver performs the I2C transaction synchronously and blocks the application until the transaction completes or an error occurs. This mode simplifies the application code, but it causes the system to hang if the I2C transaction takes too long.

    This is exactly what is happening in my case. The software hangs, so the mutex is basically not the cause.

    This function is called internally (refer to the image attached above) when I use the I2C_transferTimeout API.

    It sometimes gets stuck here (refer to the image attached below).

    Can you suggest anything to solve this issue?

  • Hi,

    What is your timeout value?

    Can you increase the timeout value?

    Can you use a logic analyzer to see what is going on on the bus?

    Are you sending the correct signals on the SCL and SDA lines?

    Is the slave acknowledging the command?

    Is the slave stretching the SCL to create a wait time?

    What if you call I2C_transfer instead of I2C_transferTimeout? Does it make a difference?

    There are some examples in the I2C.h File Reference guide. file:///C:/ti/simplelink_msp432e4_sdk_4_20_00_12/docs/tidrivers/doxygen/html/_i2_c_8h.html. Can you take a look?

    There are 2 more examples at the locations below. Can you take a look?