RM46L852: Occasional un-acked CAN frames

Part Number: RM46L852
Other Parts Discussed in Thread: HALCOGEN, SN65HVD266

I am working on a system that uses two Hercules MCUs connected via a CAN bus using SN65HVD266 transceivers. The CAN modules for both Hercules devices are configured in HALCoGen as follows:
Bit rate: 125 kbit/s
Prop delay: 1000 ns
Sample point of reference: 70%
CAN input freq. (VCLKA1): 100 MHz
The intent is for the system to support a max cable length of 1500' but for this test the cable is 25' long.
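As a rough sanity check on these settings, the sketch below computes the nominal bit time, the time from the start of a bit to the sample point, and an estimated round-trip delay for a given cable length. The function names are illustrative only. The cable delay (5 ns/m) and the per-node transceiver loop delay (150 ns) used in the note that follows are assumed typical values, not figures taken from this thread or from a datasheet.

```c
/* Nominal bit time in nanoseconds for a bit rate given in bit/s. */
double bit_time_ns(double bit_rate_bps)
{
    return 1.0e9 / bit_rate_bps;
}

/* Time from the start of the bit to the sample point, in nanoseconds.
 * sample_point_pct is the sample point as a percentage of the bit time. */
double sample_point_ns(double bit_rate_bps, double sample_point_pct)
{
    return bit_time_ns(bit_rate_bps) * sample_point_pct / 100.0;
}

/* Estimated round-trip delay in nanoseconds.
 * cable_ft:      one-way cable length in feet
 * ns_per_m:      ASSUMED cable propagation delay per metre
 * node_delay_ns: ASSUMED transceiver loop delay per node */
double round_trip_ns(double cable_ft, double ns_per_m, double node_delay_ns)
{
    double cable_m = cable_ft * 0.3048;  /* feet to metres */
    return 2.0 * (cable_m * ns_per_m + node_delay_ns);
}
```

For the settings above, `bit_time_ns(125000.0)` gives 8000 ns and `sample_point_ns(125000.0, 70.0)` gives 5600 ns. With the assumed delays, `round_trip_ns(1500.0, 5.0, 150.0)` comes out a little under 4.9 us, versus under 0.4 us for the 25' test cable, so the short cable used in this test leaves a large timing margin.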
Looking at the CAN bus with a differential probe I occasionally see that frames do not get an ack bit. These are marked by my scope and I see a subsequent retransmission of the frame that does get acked.
In the scope traces that follow, Ch1 and Ch2 are the TTL Tx/Rx signals at the transceiver on the receiving end of this transmission, and Ch4 is from a differential probe on the CAN bus at the transmitting end.

This plot shows an unacked frame followed by a retransmit that gets acked.

This plot shows details of the missing ACK and the receiver signalling frame error.

This plot shows details of the retransmitted frame with an ACK generated by the receiver.

This plot includes the TTL Tx signal at the transmitting side to illustrate the prop delay from transmitter Tx signal to Receiver Rx signal.

What would prevent the receiving CAN module from acking this frame, given that the data at its Rx pin appears to match the transmitted differential data?

  • Hello Julian,
    I don't know what is causing the error, but it looks like the receiver thought there was an error in the frame. I suggest you enable the status change interrupt and, in the interrupt routine, do a quick check of the error and status register. If the last error code (bits 2:0 of DCAN ES) is not 0 or 7, store the value in a static RAM location. You can then set a breakpoint on that condition and see why the receiving device thought there was an error in the CAN frame.
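A minimal sketch of the check suggested above, assuming the value of DCAN ES has already been read once inside the status change interrupt and is passed in as `es_value`. The macro, variable, and function names are hypothetical and are not part of the HALCoGen API. The last error code is taken as the low three bits of ES, since it takes values from 0 to 7.

```c
#include <stdint.h>

#define DCAN_ES_LEC_MASK   0x7u  /* last error code field, low three bits of ES */
#define DCAN_LEC_NO_ERROR  0u    /* message transferred without error */
#define DCAN_LEC_NO_EVENT  7u    /* no bus event since ES was last read */

/* Hypothetical static RAM locations for later inspection in the debugger. */
static volatile uint32_t g_last_bad_es;
static volatile uint32_t g_bad_es_count;

/* Latch the ES value when it carries a real error code.
 * Returns 1 if the value was stored, 0 otherwise. A breakpoint on the
 * store gives the "break on error" condition described above. */
int can_latch_error(uint32_t es_value)
{
    uint32_t lec = es_value & DCAN_ES_LEC_MASK;

    if (lec != DCAN_LEC_NO_ERROR && lec != DCAN_LEC_NO_EVENT) {
        g_last_bad_es = es_value;
        g_bad_es_count = g_bad_es_count + 1u;
        return 1;
    }
    return 0;
}
```

Passing the already-read value, rather than reading the register again inside the helper, keeps the number of ES reads per interrupt at exactly one.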
  • Hello Bob,

    Doug here. I am a software developer on this project. Regarding the status change interrupt, it provides notification on RxOK, TxOK, PDA and WakeupPnd. When triggering on RxOK and TxOK, doesn't that mean the error code will always show as good, simply because of the type of event? The error code field is cleared to 0 when a message has been transferred without error. The other two notification types are never present in our system.

    I did, however, implement the interrupt test you discussed, but never read a last error code (DCAN ES bits 2:0) that was not 0 or 7.

    I also implemented the error notification interrupt based on reaching the error warning, error passive and bus off levels. But it is never triggered, as our error count does not get that high. We do not get enough contiguous errors: the counters count up but then start counting back down to zero as good packets are received.

    Other than these interrupts and the error and status register (ES) is there anything else we can monitor? The Hercules does not appear to be telling us much about what is going on in the CAN Core.

    Doug

  • The status change interrupt should occur on any status update. Detecting an error frame should cause the status to change with a last error code. You are correct, that error code will be overwritten when the next frame is received. But, unless you are missing an interrupt, it does not make sense that the CAN module would give an error response without indicating in the error status register why.
  • I think I may have found out why I do not see an error status.

    The HALCoGen generated code includes a file called "notification.c" with the function canStatusChangeNotification(). This is where I have been placing my own interrupt code. In reviewing the origin of this function call within the file "can.c", I noticed that the ES register is read prior to the calling of this function. According to the user guide, reading the register always resets the LEC to 7. So when I read the LEC I will never see an error code; even the ES notification parameter passed to this function has the lower 3 bits cleared.

    I will need to place my own interrupt code within the originating interrupt within the can.c file. I will investigate this further, and hopefully will start to see some meaningful error codes.

    Thanks Doug
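The effect Doug describes can be illustrated with a small host-side model. It is a sketch only: `sim_es` and `sim_read_es()` are hypothetical stand-ins for the hardware register, and they model just the one behaviour stated above, namely that every read of ES leaves the LEC at 7. The point is that the interrupt handler should read ES exactly once and hand that snapshot to any user code.

```c
#include <stdint.h>

#define ES_LEC_MASK 0x7u

/* Simulated ES register content (stand-in for the real hardware register). */
static uint32_t sim_es;

/* Simulated register read: returns the current value, then sets LEC to 7,
 * modelling the read side effect described in the post above. */
static uint32_t sim_read_es(void)
{
    uint32_t value = sim_es;
    sim_es = (sim_es & ~ES_LEC_MASK) | 0x7u;
    return value;
}

static uint32_t captured_lec = 7u;

/* Hypothetical user hook: works only on the snapshot it is given. */
static void on_status_change(uint32_t es_snapshot)
{
    captured_lec = es_snapshot & ES_LEC_MASK;
}

/* Interrupt body: a single read, with the snapshot passed on unchanged. */
void sim_status_isr(void)
{
    uint32_t es_snapshot = sim_read_es();
    on_status_change(es_snapshot);
}

/* Read once through the ISR path: the original LEC is preserved. */
uint32_t demo_read_once(uint32_t initial_es)
{
    sim_es = initial_es;
    sim_status_isr();
    return captured_lec;
}

/* Read twice: the second read only ever sees LEC = 7. */
uint32_t demo_read_twice(uint32_t initial_es)
{
    sim_es = initial_es;
    (void)sim_read_es();                 /* first read consumes the error code */
    return sim_read_es() & ES_LEC_MASK;  /* second read returns 7 */
}
```

With an initial LEC of 3, `demo_read_once(0x3u)` returns 3 while `demo_read_twice(0x3u)` returns 7, which matches the symptom of never seeing an error code from inside the notification function.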
  • Thanks Doug. Posting this will help other people in the future.
  • Our s/w has been updated so that we detect errors generated by each of the two Hercules CAN modules on the bus and increment counters for each error that occurs. It's difficult to be certain, but I'd say that many of the reported errors are the result of a CAN transmission that has gone very wrong. In the scope capture below, Ch1 and Ch2 are the CAN Tx and Rx signals between the Hercules and the CAN transceiver. Ch4 is the CAN bus probed with a differential probe, and Ch3 is a signal that the Hercules on the other end of the bus pulses when it detects a CAN error.

    Clearly the transmitting Hercules is driving a dominant bit on the bus for many cycles just after it begins its transmission. Any thoughts on what could cause the Hercules CAN module to behave in this manner? The transmission before was normal and the retransmit that follows is fine too.

       

  • Not sure what is happening. Can you check that the software is not somehow setting the INIT bit in DCAN CTL?
  • Doug has verified that the INIT bit in the DCAN CTL register is not being inadvertently set. I see that there is a test mode that would allow the CAN Tx pin to drive a dominant value. I can't imagine how we'd trigger this functionality inadvertently, but we'll have a look just the same.

    Thanks for your help with this issue.
  • Bob,

    We configured one of the unused CAN buses to drive the CAN sample pulse so that we could be certain that the bus timing was OK. There are no timing issues; however, we did discover that the CAN errors are the result of the CAN sample pulse stopping in mid cycle. The sample clock seems to stop a lot, even with no CAN activity. What type of event can cause the CAN module clock (or at least the sample clock) to stop?

  • OK, I did the same thing. I have CAN2 receiving a message through the transceiver. I attached CAN2 RX to CAN3 RX and configured CAN3 TX to be the sample clock. The CAN sample clock is a pulse the width of one VCLKA period. Using the default settings, VCLKA is 110 MHz, which makes the sample clock only 9 ns wide. I had trouble capturing a 500 kbit/s CAN message without missing the CAN sample clock because of the oscilloscope sample rate. I can see a continuous CAN sample clock when I zoom in. I changed VCLKA to be sourced by OSC instead of VCLK (I have a 16 MHz oscillator). HALCoGen recomputed the settings for the 500 kbit/s baud rate. Now I can see a continuous sample clock with the CAN message.
  • We've found the root cause of our CAN issues. Our diagnostics call the check PLL slip function which seems to cause some disruption (~150 us) to the CAN clock. Doug has removed the call to the check PLL slip diagnostic and our CAN bus runs with no errors.

    Thanks again for your help in running these errors to ground.
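To put the reported disruption in perspective, the sketch below converts a clock interruption into CAN bit times. The function name is illustrative, and the 150 us figure is the approximate value quoted above, not a measured specification.

```c
/* Number of nominal CAN bit times spanned by an interval.
 * duration_us:  length of the interval in microseconds
 * bit_rate_bps: CAN bit rate in bit/s */
double duration_in_bit_times(double duration_us, double bit_rate_bps)
{
    return duration_us * bit_rate_bps / 1.0e6;
}
```

At 125 kbit/s, `duration_in_bit_times(150.0, 125000.0)` gives 18.75, so a stall of that length covers roughly 19 bit times. CAN bit stuffing allows at most five consecutive bits of the same level, so a transmitter frozen for that long while driving dominant would be seen by the other node as an error, which would be consistent with the long dominant period in the earlier scope capture.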