CC3235SF: Bus faults during interrupt handling problem

Part Number: CC3235SF
Other Parts Discussed in Thread: SYSBIOS

Hello,

I am seeing a bus fault during interrupt handling. It isn't easy to reproduce on demand, but I usually see two or three a day, and the timing appears random. In the example below the system hadn't even completed initialization and its initial connections. I have also seen it when the system has been sitting idle with no messaging traffic other than a periodic MQTT message from our host, and when I am heavily exercising MQTT messaging. There doesn't appear to be any single way to trigger the problem. I am posting because I've become stuck debugging this, so I am looking for suggestions on additional things to look at, or thoughts on what is happening.

Below is an example stack trace of what I see when the problem occurs. It should be noted that the interrupt being handled varies, but the last two functions in the trace before the fault are always the same. Trace back:

In this particular case the SimpleLink interrupt was being handled. The PC indicated for the fault was 0x2002e2a0, which is outside the code space and in RAM. There are some interesting things to note in the CPU registers. First, that address appears in R12. Second, the comment associated with the ti_sysbios_knl_Task_unblockI__E function indicates that interrupts should be disabled. However, PRIMASK is not set, so I don't believe interrupts are disabled. In fact, the SCB ICSR register shows a SYSTICK interrupt pending while the bus fault is active. I suppose it is possible the SYSTICK interrupt fired before the bus fault.
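For anyone cross-checking the same registers, here is a minimal C sketch of how the ICSR and CFSR values captured in the debugger could be decoded. The bit positions follow the ARMv7-M architecture; the helper names (`busfault_active`, `systick_pending`, `some_irqs_masked`, `describe_fault`) are hypothetical and not part of any TI API. It also checks BASEPRI alongside PRIMASK, on the assumption (worth verifying for this SDK version) that the SYS/BIOS Cortex-M port may mask interrupts through BASEPRI rather than PRIMASK.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* ICSR (0xE000ED04) fields, per the ARMv7-M architecture. */
#define ICSR_VECTACTIVE(icsr)   ((icsr) & 0x1FFu)          /* exception being handled */
#define ICSR_VECTPENDING(icsr)  (((icsr) >> 12) & 0x1FFu)  /* highest-priority pending */
#define ICSR_PENDSTSET          (1u << 26)                 /* SysTick pending */

/* BusFault status bits inside CFSR (0xE000ED28). */
#define CFSR_IBUSERR     (1u << 8)   /* fault on instruction fetch */
#define CFSR_PRECISERR   (1u << 9)   /* precise data access fault */
#define CFSR_IMPRECISERR (1u << 10)  /* imprecise data access fault */
#define CFSR_UNSTKERR    (1u << 11)  /* fault while unstacking on exception return */
#define CFSR_STKERR      (1u << 12)  /* fault while stacking on exception entry */
#define CFSR_BFARVALID   (1u << 15)  /* BFAR holds the faulting address */

#define EXC_BUSFAULT 5u   /* architectural exception numbers */
#define EXC_SYSTICK  15u

bool busfault_active(uint32_t icsr)
{
    return ICSR_VECTACTIVE(icsr) == EXC_BUSFAULT;
}

bool systick_pending(uint32_t icsr)
{
    return (icsr & ICSR_PENDSTSET) != 0u;
}

/* PRIMASK bit 0 masks all configurable-priority exceptions; a non-zero
 * BASEPRI masks only those at or below that priority level. */
bool some_irqs_masked(uint32_t primask, uint32_t basepri)
{
    return ((primask & 1u) != 0u) || (basepri != 0u);
}

/* Print a short summary of register values copied from the debugger. */
void describe_fault(uint32_t icsr, uint32_t cfsr, uint32_t bfar)
{
    printf("active exception: %u, pending exception: %u\n",
           (unsigned)ICSR_VECTACTIVE(icsr), (unsigned)ICSR_VECTPENDING(icsr));
    if (cfsr & CFSR_IBUSERR)     printf("IBUSERR: instruction fetch fault\n");
    if (cfsr & CFSR_PRECISERR)   printf("PRECISERR: precise data bus fault\n");
    if (cfsr & CFSR_IMPRECISERR) printf("IMPRECISERR: imprecise data bus fault\n");
    if (cfsr & CFSR_STKERR)      printf("STKERR: fault during exception stacking\n");
    if (cfsr & CFSR_UNSTKERR)    printf("UNSTKERR: fault during exception unstacking\n");
    if (cfsr & CFSR_BFARVALID)   printf("BFAR valid: 0x%08x\n", (unsigned)bfar);
}
```

Calling `describe_fault()` with the ICSR, CFSR, and BFAR values read at the fault would show whether the fault came from an instruction fetch (consistent with a PC that has jumped to an unexpected address) or from a data or stacking access, which may help narrow down where the bad address originated.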

Thanks, any input is appreciated.

Regards,

John

  • Hi John,

    Thanks for reaching out. This seems like a complicated one to debug. Give me some time to think of some ideas to follow up with.

    Best,

    Rogelio

  • Hi Rogelio,

    Thanks. I appreciate anything you can think up. I agree it is a complex one to debug! I've developed on several proprietary RTOSes in my career, and these types of issues are painful to find.

    Regards,

    John

  • Hi Rogelio,

    Have you had any thoughts on this?

    I have found a way that reproduces this problem 100% of the time. We occasionally receive a spurious MQTT_EVENT_CONNACK from the stack. It is spurious in that we did not disconnect from MQTT and reconnect. This happens a few times a day. The stack trace is below.