IWRL6432: MCAN interrupt does not trigger when receiving a message, but only sometimes.

Part Number: IWRL6432

Hello experts, 

We have observed a very strange problem. When connecting several of our CAN devices to a network, about 50% of the time we can discover all of the devices except one. If we have 5 on the network, sometimes only 4 will be discoverable. If we add another device, it is still only one device that is sometimes undiscoverable. We have had up to 30 devices and discovered 29, and as few as 3 and discovered 2.

The nature of the problem is as follows: 

  • The problem device will still transmit messages, being triggered from an internal timer. 
    • This means the interrupt is working for TX
  • The problem device will receive messages to FIFO0. We observe the FIFO filling as messages are transmitted to that device, but the messages never make it out of the FIFO.
  • The FIFO0 RX interrupt is never triggered.
  • The problem device is not always the same device. It is often the same device but can change/fix itself for a time through a power-cycle. 
  • The interrupt, when created, returns SystemP_SUCCESS, indicating that the interrupt should be functional.

We have a CAN RX task that is structured like many of the tasks in the demo code: a while loop with an indefinite semaphore pend at the top, waiting for a post from the MCAN ISR. While it would be possible to watch the FIFO fill level and post the semaphore whenever the FIFO is not empty, that doesn't address the core problem.
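For reference, a minimal sketch of the fill-level fallback mentioned above, assuming the raw Rx FIFO 0 status register value (RXF0S in the Bosch M_CAN register map) has already been read by whatever means the SDK provides. The type `RxFifo0Status` and the helpers `decodeRxf0s` and `shouldDrainFifo` are hypothetical names used only for illustration; they are not part of the TI driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Field layout of the Bosch M_CAN RXF0S (Rx FIFO 0 Status) register. */
#define RXF0S_F0FL_MASK   0x0000007Fu  /* bits 6:0:  FIFO 0 fill level */
#define RXF0S_F0GI_MASK   0x00003F00u  /* bits 13:8: FIFO 0 get index  */
#define RXF0S_F0GI_SHIFT  8u
#define RXF0S_F0F_MASK    0x01000000u  /* bit 24:    FIFO 0 full       */
#define RXF0S_RF0L_MASK   0x02000000u  /* bit 25:    message lost      */

/* Hypothetical container for the decoded fields. */
typedef struct {
    uint32_t fillLevel;
    uint32_t getIndex;
    bool     full;
    bool     messageLost;
} RxFifo0Status;

/* Split a raw RXF0S value into its fields. */
static RxFifo0Status decodeRxf0s(uint32_t rxf0s)
{
    RxFifo0Status st;
    st.fillLevel   = rxf0s & RXF0S_F0FL_MASK;
    st.getIndex    = (rxf0s & RXF0S_F0GI_MASK) >> RXF0S_F0GI_SHIFT;
    st.full        = (rxf0s & RXF0S_F0F_MASK) != 0u;
    st.messageLost = (rxf0s & RXF0S_RF0L_MASK) != 0u;
    return st;
}

/*
 * Decide whether the RX task should drain the FIFO after its semaphore
 * pend returns. semaphoreTaken is false when a timed pend expired, in
 * which case a non-zero fill level means an interrupt was missed.
 */
static bool shouldDrainFifo(bool semaphoreTaken, const RxFifo0Status *st)
{
    assert(st != NULL);
    return semaphoreTaken || (st->fillLevel > 0u);
}
```

This only works if the RX task uses a timed pend instead of an indefinite one, and it masks the missing interrupt without explaining it.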

Do you have any insight as to why the registers might not be posting the interrupt when a FIFO receives a new message? As far as I know the registers are enabled correctly, as shown by the n-1 other devices on the network. 

Thank you in advance. 

  • Hello.

    I am looking into this issue and will have an update by the end of the day tomorrow.

    Sincerely,

    Santosh

  • Hi Santosh, 

    I have some more details on this as I've been looking into it. 

    I initially believed that the only dysfunctional interrupt was the RX FIFO0 new msg but on further inspection I have found that none of the MCAN interrupts work from the MCAN_IR register. The TX interrupt only appeared to work because we have a definite timeout assigned to the TX semaphore pend rather than an indefinite one like the RX semaphore. 

    Here's a clarification to my question, 

    How would you recommend debugging the interrupt register functions? The interrupt initialization functions (register, enable, assign) do not have return values and the HW register functions that they call also do not have return values. Is my only option to replicate the issue on an eval board and browse the registers in real time to see if they are enabled? 

    Additionally, what might cause interrupt registers to not be enabled properly? Is there a CAN condition which would cause the registers to not enable correctly? Is there a CAN condition which would disable the registers? 

    Thank you for your time and effort. 

    ds

  • Hello Daniel.

    Apologies for the delay in response; I was still looking into this issue. Are you calling MCAN_enableIntr and also enabling MCAN_INTR_SRC_RX_FIFO0_NEW_MSG or any other FIFO interrupts?

    With regards to debugging, if you have an XDS on the device you can try connecting to CCS Debug and look at the register view or the contents in memory to see the status of what is happening in those registers.  However, trying on the eval board will help determine if there is any other issue with the custom design or if it is something else.

    Sincerely,

    Santosh

  • Santosh, 

    Thank you for the response. As far as I can tell I am calling MCAN_enableIntr. I have a CLI_write() before and after the MCAN_enableIntr call and both are being executed. MCAN_enableIntr doesn't return anything, so it is hard to know whether it was successful. As for enabling the correct interrupt sources, right now we have MCAN_INTR_MASK_ALL set.

    We'll investigate a little more with the eval board and I may return with some more details for this question. 

    ds

  • Thank you for the update Daniel.

    Please feel free to update this thread once you have run the tests on the EVM.

    Sincerely,

    Santosh

  • Hi Santosh,

    We are still running tests on the EVM but want to read the status of the interrupt enables. Looking at the TRM, it looks like we're interested in the MCANSS_IES register. Do you have any recommendations on reading this register? There does not appear to be an easy way to read it through the API. Ideally I would have something similar to the MCAN_getIntrStatus() function.

    Additionally, after reading the register how would you recommend I interpret it? The details in the TRM have the whole register as reserved, so I hesitate to draw absolute conclusions with little information on what is actually contained in the register. 

  • Hello again, 

    In addition to my last post, I am also interested in the MCAN_ILS register. There is an API call to read this using MCAN_getIntrLineSelectStatus(). Would this function tell me if the interrupt is enabled, or only if the interrupt line has been assigned that particular interrupt? 

  • Hello Daniel.

    You can use the HW_RD_REG32(addr) function to read the register by providing the address of the register, and bit 0 of that register is all you need to be able to identify that an interrupt occurred.  Please refer to 12.4.5.1 External Timestamp Counter in the TRM for more information.

    Sincerely,

    Santosh

  • "Would this function tell me if the interrupt is enabled, or only if the interrupt line has been assigned that particular interrupt?"

    It just tells you what interrupt line is assigned to that interrupt.
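To make the distinction between "enabled" and "routed" concrete, here is a hedged sketch of how the four M_CAN core interrupt registers (IR, IE, ILS, ILE in the Bosch M_CAN register map) combine for a single source. It assumes the raw register values have already been read, for example with HW_RD_REG32; the enum and the function `diagnoseIntrPath` are hypothetical names, not SDK API.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0 of IR/IE/ILS is RF0N (Rx FIFO 0 new message) in the M_CAN core. */
#define MCAN_SRC_RF0N   0x00000001u
/* ILE bits: EINT0 (bit 0) gates line 0, EINT1 (bit 1) gates line 1. */
#define MCAN_ILE_EINT0  0x00000001u
#define MCAN_ILE_EINT1  0x00000002u

/* Hypothetical classification of where one interrupt source stands. */
typedef enum {
    INTR_PATH_OK_IDLE,           /* enabled and routed, nothing pending     */
    INTR_PATH_SOURCE_DISABLED,   /* IE bit for the source is clear          */
    INTR_PATH_LINE_DISABLED,     /* ILE bit for the selected line is clear  */
    INTR_PATH_PENDING_UNSERVICED /* flag set, enabled, routed: core asserts */
} IntrPathState;

/*
 * Classify one interrupt source from raw register snapshots.
 * srcMask must have exactly one bit set (for example MCAN_SRC_RF0N).
 */
static IntrPathState diagnoseIntrPath(uint32_t ir, uint32_t ie,
                                      uint32_t ils, uint32_t ile,
                                      uint32_t srcMask)
{
    assert(srcMask != 0u && (srcMask & (srcMask - 1u)) == 0u);

    if ((ie & srcMask) == 0u) {
        return INTR_PATH_SOURCE_DISABLED;
    }
    /* ILS bit = 0 routes the source to line 0, ILS bit = 1 to line 1. */
    uint32_t lineEnable = ((ils & srcMask) != 0u) ? MCAN_ILE_EINT1
                                                  : MCAN_ILE_EINT0;
    if ((ile & lineEnable) == 0u) {
        return INTR_PATH_LINE_DISABLED;
    }
    if ((ir & srcMask) != 0u) {
        return INTR_PATH_PENDING_UNSERVICED;
    }
    return INTR_PATH_OK_IDLE;
}
```

If a source reports `INTR_PATH_PENDING_UNSERVICED` while the ISR never runs, the M_CAN core itself is asserting its line, which would point at something downstream of the core (the subsystem wrapper or the CPU interrupt controller) rather than at IE or ILS.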

  • Santosh, 

    We have made some significant discoveries. Looking at the IE register showed that in the error state the interrupts are still enabled, so we should be getting the RX FIFO new msg interrupt, but it is not triggering. 

    We duplicated the error on the eval board and have narrowed down the exact scenario when we see the problem. 

    We have implemented the SBL on our device. It appears that if ANY messages are received on the CAN bus during the transition between the SBL image and the normal operating firmware, the device enters the error state in which CAN interrupts are not triggered.

    We have further narrowed it down to these lines of assembly, 

    __asm(" MSR MSP, %0" : : "r" ((uint32_t)*(uint32_t*)appBootVec));
    __asm("BX %0" : : "r" ((uint32_t)*(uint32_t*)(appBootVec + 4)));

    These lines are the very last code executed in the SBL. (We initialize MCAN both in the SBL and in the normal operating firmware.) If messages are received before these lines run, or after the operating image has booted, communication works just fine.

    Just as a reminder of the problem: on a physical level our CAN communication is working correctly, and we are able to transmit and receive messages, but the interrupts that normally indicate that a TX or RX has occurred are not triggering. We originally believed that this was happening randomly, but with the latest discoveries we can now reproduce the problem deterministically by transmitting a message to the device during the handoff between the SBL and the normal operating firmware.

    Do you have any insights on what might be happening during the transition that could cause the problem? Why would receiving a CAN message during the transition cause an issue? 

    Thank you, 

    ds
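One hypothesis that could be tested, offered here as an assumption and not as a confirmed cause: a flag latched in the MCAN interrupt register (IR) during the handoff is never cleared by either image, so the interrupt line never produces a fresh event for the application. In the M_CAN core, IR flags are cleared by writing 1 to them. The sketch below models that write-1-to-clear behaviour and a quiesce step that the SBL could run just before the jump. The struct `McanIntrRegsModel` and both functions are hypothetical, host-side stand-ins; on target the same sequence would be done with register writes to the real MCAN addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model of the three M_CAN interrupt registers involved. */
typedef struct {
    uint32_t ir;   /* interrupt flags, write-1-to-clear */
    uint32_t ie;   /* per-source interrupt enable       */
    uint32_t ile;  /* interrupt line enable             */
} McanIntrRegsModel;

/* Model a write to IR: every bit written as 1 clears that flag. */
static void modelWriteIr(McanIntrRegsModel *regs, uint32_t value)
{
    assert(regs != NULL);
    regs->ir &= ~value;
}

/*
 * Quiesce MCAN interrupts before handing off to the application image:
 * gate the lines, disable all sources, then clear whatever is latched,
 * so the application starts from a known-clean state.
 */
static void quiesceBeforeHandoff(McanIntrRegsModel *regs)
{
    assert(regs != NULL);
    regs->ile = 0u;                 /* stop the lines first        */
    regs->ie  = 0u;                 /* then disable every source   */
    modelWriteIr(regs, regs->ir);   /* write back flags to clear   */
}
```

Whether this changes the behaviour on the IWRL6432 is untested here. Comparing IR and the CPU interrupt controller's pending state immediately before the jump and immediately after the application re-initializes MCAN would confirm or rule out the hypothesis.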