CC3200MOD: FaultISR called by FreeRTOS scheduler when using MQTT

Part Number: CC3200MOD


Hello everyone,
I'm working on a custom board that mounts the CC3200MOD with the latest service pack, 1.0.1.11-2.9.0.0.
I'm using SDK 1.3.0 and compiling with gcc. The system uses FreeRTOS and has 4 tasks: main application, connectivity, MQTT broker and MQTT client. On startup the application binary is loaded through the application_bootloader, so it is able to perform OTA updates. The peripherals in use are GPIO, PWM, analog input, I2C, SPI and UART (for debug). The MQTT broker accepts unsecured connections, while the MQTT client opens a TLS v1.2 socket to a remote server.

My problem is that after several hours of normal operation the system enters the FaultISR trap and stays stuck there.

After many tries I found that when the MQTT client is heavily stressed, sending and receiving around 10 messages per second, the fault shows up within minutes. It never occurs after the same amount of time (sometimes after mere seconds, sometimes after more than 40 minutes) and, judging by the debug messages I added to the mqtt_client functions, it doesn't seem to stop in the same function each time.

I also see the problem when limiting execution to only 2 tasks: connectivity and MQTT client (with only the UART peripheral). After initialization (WLAN connection and MQTT connection), the only running job is to check whether an incoming message has arrived and to send several messages in reply. If another incoming message arrives while the reply messages are being sent, it is saved by the receiving task but later discarded by the sending task.

If I only send dummy messages without receiving requests, or only receive messages without replying to them, the fault doesn't show up.

I thought the fault might be caused by memory being allocated for an incoming message while another one was being sent, so I replaced dynamic memory management with static allocation throughout the system, but no luck.
I have already triple-checked the code for memory leaks, divisions by zero and bad array handling.

Most of the time execution stops after printing these lines from the MQTT client functions:
C: Msg w/ ID 0x0000, processing status: Good
C: Alloc for 0 20030de4
C: Alloc for 3 20030dc0

So my questions are:
1- Has anyone experienced such a problem, and how was it fixed?
2- Is it possible that I'm facing a bug in the service pack or in the mqtt SDK?
3- Is there a way to find exactly where a fault is generated?
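
On question 3, one common Cortex-M technique (not something confirmed in this thread) is to read the register frame that the core pushes onto the stack when the exception is taken: the stacked PC normally points at the instruction that was executing when the fault hit, although for imprecise bus faults it can be a few instructions late. The sketch below is a minimal, PC-compilable version of the decoding step. `fault_frame_t`, `fault_frame_read()`, `fault_frame_print()` and `fault_handler_c` are hypothetical names; the assembly shim for FaultISR is shown only as a comment and is an untested assumption for the CC3200.

```c
#include <stdint.h>
#include <stdio.h>

/* The eight words a Cortex-M core pushes on exception entry, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12;
    uint32_t lr;    /* link register of the interrupted code */
    uint32_t pc;    /* address of the interrupted (faulting) instruction */
    uint32_t xpsr;  /* program status register at the time of the fault */
} fault_frame_t;

/* Hypothetical helper: copy the stacked frame found at sp into out. */
void fault_frame_read(const uint32_t *sp, fault_frame_t *out)
{
    out->r0   = sp[0];
    out->r1   = sp[1];
    out->r2   = sp[2];
    out->r3   = sp[3];
    out->r12  = sp[4];
    out->lr   = sp[5];
    out->pc   = sp[6];
    out->xpsr = sp[7];
}

/* Print the values that usually locate the fault (e.g. over the debug UART). */
void fault_frame_print(const fault_frame_t *f)
{
    printf("fault pc=0x%08lx lr=0x%08lx xpsr=0x%08lx\n",
           (unsigned long)f->pc, (unsigned long)f->lr, (unsigned long)f->xpsr);
}

/*
 * Target-side shim (assumption, GCC syntax, not compiled here):
 *
 * __attribute__((naked)) void FaultISR(void)
 * {
 *     __asm volatile(
 *         "tst   lr, #4          \n"  // bit 2 of EXC_RETURN: 0 = MSP, 1 = PSP
 *         "ite   eq              \n"
 *         "mrseq r0, msp         \n"
 *         "mrsne r0, psp         \n"
 *         "b     fault_handler_c \n"); // void fault_handler_c(const uint32_t *sp)
 * }
 */
```

With the stacked PC in hand, gdb's `info symbol <address>` or `list *<address>` maps it back to a function and source line.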

  • Hi Rosario,

    This is not an issue we've seen caused by the MQTT client library. You can take a look at the mqtt_client example in the SDK, which uses FreeRTOS, as a reference implementation. Unfortunately, there is no good way to find where the fault is generated in your code besides checking print statements. A memory issue or buffer overflow can cause this kind of unexpected behavior.

    Also try:
    1. Increasing the stack size
    2. Checking that your malloc is not returning NULL pointers

    This may not be relevant to your issue, but you can also take a look at this E2E thread: e2e.ti.com/.../402010

    Best regards,
    Sarah
  • Hi Sarah,
    after commenting out most of the code except the sending and receiving of MQTT messages, the only operations still in use were the sl_mqtt_client ones.
    The received messages were saved in a static buffer much larger than the messages, so I don't think it is an allocation problem, although the fault shows up after many loops and never after the same number of loops.

    My FreeRTOS implementation is derived from the SDK example, as are the mqtt_server and the timers.

    I increased the stack to 3K but the fault still shows up.
    Just for testing, I reduced the stack to 1K and, after the first message arrived, the system got caught in the StackOverflow hook's infinite loop. Can you tell me if this hook always works?
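
On whether the hook always fires: FreeRTOS runs its stack overflow check when a task is switched out, so an overflow that corrupts memory and faults before the next context switch can get past it. A complementary check is to measure how much of each stack was ever used. The sketch below shows the fill-pattern idea on a plain buffer so it can be tried on a PC; `stack_paint()` and `stack_unused_bytes()` are hypothetical names, and the 0xA5 value mirrors the byte FreeRTOS uses to fill task stacks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_BYTE 0xA5u  /* same pattern FreeRTOS paints task stacks with */

/* Fill a stack buffer with the known pattern before the task starts. */
void stack_paint(uint8_t *stack, size_t size)
{
    memset(stack, STACK_FILL_BYTE, size);
}

/*
 * Count bytes that still hold the pattern, starting from the low end.
 * Cortex-M stacks grow downwards, so the low end is touched last and
 * the result is the smallest amount of free stack seen so far.
 */
size_t stack_unused_bytes(const uint8_t *stack, size_t size)
{
    size_t n = 0;
    while (n < size && stack[n] == STACK_FILL_BYTE) {
        n++;
    }
    return n;
}
```

On the target, FreeRTOS offers the same measurement through uxTaskGetStackHighWaterMark() (enabled with INCLUDE_uxTaskGetStackHighWaterMark), which reports the minimum free stack in words rather than bytes. A value close to zero under the stress test would point at the stack size again.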

    Regarding memory allocation, I use the FreeRTOS wrapper and I rechecked my code: whenever mem_Malloc() returns a NULL pointer, I print a message and return from the function.

    As far as I understand from the thread you linked, Vinu's FaultISR was caused by system timer interrupts interfering with the RTOS, but I'm already using general-purpose timers for my events.

    Now I'm using the LAUNCHXL JTAG interface to run gdb from the PC, but (I'm new to gdb) it only recognises the standard registers and cannot show, for example, the contents of the "HardFault Status Register".
    If I check the backtrace, it only says that FaultISR was called by prvPortStartFirstTask(), which executes assembler code.
    I also tried to copy the contents of the "UsageFault Status Register" into the R4 register before the infinite loop with this line: __asm volatile(" ldr r4, =0xE000ED2A \n"); and checked it with the "info registers R4" gdb command. I'm not sure it works correctly; if it does, the bits DIVBYZERO, INVPC and INVSTATE are asserted (I'm using infocenter.arm.com/.../Cihcfefj.html as the Cortex-M4 reference).
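
A note on the inline assembly above: `ldr r4, =0xE000ED2A` places the address constant itself in R4, so a second load from that address is needed to get the register contents. In gdb the fault status registers can also be read as ordinary memory, for example `x/1xh 0xE000ED2A` for the UsageFault Status Register or `x/1xw 0xE000ED2C` for the HardFault Status Register. The sketch below decodes a UFSR value into flag names using the bit positions defined by the ARMv7-M architecture; `ufsr_decode()` and the flag table are hypothetical names, and the code is written so it can be run on a PC.

```c
#include <stdint.h>
#include <string.h>

/* UsageFault Status Register: upper halfword of CFSR (0xE000ED28). */
#define UFSR_ADDR 0xE000ED2Au

typedef struct {
    uint16_t    mask;
    const char *name;
} ufsr_flag_t;

static const ufsr_flag_t ufsr_flags[] = {
    { 1u << 0, "UNDEFINSTR" },  /* undefined instruction */
    { 1u << 1, "INVSTATE"   },  /* invalid EPSR state, e.g. branch to an even address */
    { 1u << 2, "INVPC"      },  /* invalid exception return value */
    { 1u << 3, "NOCP"       },  /* coprocessor access not permitted */
    { 1u << 8, "UNALIGNED"  },  /* unaligned access trapped */
    { 1u << 9, "DIVBYZERO"  },  /* divide by zero trapped */
};

/* Write the names of all set flags into buf, space separated; return their count. */
size_t ufsr_decode(uint16_t ufsr, char *buf, size_t buf_len)
{
    size_t count = 0;

    if (buf_len == 0) {
        return 0;
    }
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof ufsr_flags / sizeof ufsr_flags[0]; i++) {
        if (ufsr & ufsr_flags[i].mask) {
            if (count > 0) {
                strncat(buf, " ", buf_len - strlen(buf) - 1);
            }
            strncat(buf, ufsr_flags[i].name, buf_len - strlen(buf) - 1);
            count++;
        }
    }
    return count;
}
```

On the target, the value would be read with something like `uint16_t ufsr = *(volatile uint16_t *)UFSR_ADDR;` before entering the infinite loop, and then printed or inspected from gdb.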

    I tried adding several debug prints on entry to and exit from the MQTT functions, but execution never stops on the same call.

    Do you have any idea why sending and receiving mqtt packets through an encrypted channel causes an ISR fault?
    Do you know a smarter way to use gdb for debugging the execution of the different tasks within FreeRTOS?

    Regards,
    Rosario
  • Hi Sarah,
    I also tried stressing the mqtt_broker channel by sending and receiving many messages, as I did with the mqtt_client, but the fault didn't show up. In my code the incoming messages and the replies are handled in the same way for the broker and the client.
    Another thing that is not completely clear to me is the HEAP_SIZE that I declare in the compiler properties: is it only the heap FreeRTOS needs to allocate the task stacks, or is it also the region FreeRTOS itself uses for allocation? I declare 2K more than the tasks need; is that enough?

    If a malloc needs more space than the available heap, will the system end up in the malloc-failed hook loop, or could it overflow into the heap region of another task? As I said, I also tried without any malloc, so this shouldn't be my problem, but it is worth asking to be sure how these mechanisms work.