This thread has been locked.

TMS320F28x: successful transmission of a message by the eCAN

Dear C2000 team,

Can you help me with the following issue regarding the eCAN of the TMS320F28x:
According to chapter 14.6.2.2 of the specification SPRUI10 – Dec 2018:

  • At step 3, after a successful transmission, the transmit-acknowledge flag is set
  • At step 4, the TRS flag is reset to 0

I rely on these bits to check that CAN messages are sent successfully.

Over the last few days, however, I have not been able to reconcile the specification with my tests.

For example, I cannot always get the transmit-acknowledge flag described in §14.6.2.2.3.

How can I work around this to get a reliable confirmation?

Looking forward to your feedback.

Best regards,

Nico 

  • Nico,

    Based on your post (and offline email), it appears that transmission works correctly for several thousand seconds, freezes for a few seconds, recovers automatically, and then repeats this cycle. Is this correct?

    What exactly happens prior to and during this freeze? Are there error frames on the bus? Could the node be entering the bus-off condition and coming out of it automatically? Is the ABO bit set to 1? Does all transmission cease during this freeze?

  • Yes, that is correct. The transmission of one message works as expected for several thousand seconds. Then, although nothing special happens (no new participant on the CAN bus, no supply voltage drop, …), this single message is no longer transmitted for several seconds. Because other messages from the same device and from other devices are still sent, I would say that the CAN bus itself is all right. An unexplained recovery follows, and the issue may occur again several thousand seconds later. I cannot really give you more details, because this comes from customer feedback. On my own test bench, where the device is alone, the issue does not occur. I therefore guess that its occurrence is linked to the CAN bus load, which is nevertheless always below 60%.

     

    Because I cannot currently reproduce the issue on my test bench, I am reviewing the source code against the TI specification SPRUI10-Dec2018. My main worry is the following feature: a single eCAN mailbox is used to send several CAN messages with different identifiers. The message that is not sent shares its mailbox with two other messages (identifiers). According to §14.6.2.1, the mailbox is configured for transmit. Then, following §14.6.2.2, the transmission request is set. In my opinion, the key point is to make sure that the message has really been sent before the mailbox is (re)configured for the next message, i.e. the next identifier. If this check is not performed correctly, then step §14.6.2.1.1, which resets the transmission request, can abort the pending transmission. If the CAN bus load is low, I guess that the message is sent immediately. If the bus load increases, the transmission can be postponed, and the probability of an abort increases with it. There is perhaps a weakness in the part of my source code that detects a successful transmission. This is why I would first like to focus on how to make sure that a transmission has succeeded.

     

    As discussed in another thread, “ECAN checking transmit-acknowledge flag CANTA of mailbox transmitted”, I have trouble handling the transmit-acknowledge flag. §14.6.2.2.5 states that the transmit acknowledge must be cleared before the next transmission. In the source code it is implemented like this:

         eCanAShadow_s.CANTA.all = ECanaRegs.CANTA.all ;
         eCanAShadow_s.CANTA.all ^= (uint32_t)(1UL<<MailboxIndex_e) ;   /* (§14.6.2.2.5a) */
         ECanaRegs.CANTA.all = eCanAShadow_s.CANTA.all ;
    //     do
    //     { /* wait until read CANTA is false (§14.6.2.2.5b) */
    //         eCanAShadow_s.CANTA.all = ECanaRegs.CANTA.all ;
    //     } while( ( eCanAShadow_s.CANTA.all & (1UL<<MailboxIndex_e) ) != (uint32_t)0UL ) ;

    The do-while lines for §14.6.2.2.5b are commented out because, when enabled, they stop the whole CAN communication! I suspect an endless loop that triggers a watchdog reset.

    So on the one hand I have source code that seems to work (i.e. it works for several thousand seconds), but on the other hand it does not match the TI specification.

    How, then, should §14.6.2.2 “Transmitting a Message” be implemented? Do you have any kind of template?

  • Today I have a second question of the same kind, this time about the mailbox configuration. How is it possible to ensure that the configuration of a mailbox for transmit has been successfully performed?

    Because a single eCAN mailbox is used to send several CAN messages with different identifiers, the software cycles endlessly between §14.6.2.1 “Configuring a Mailbox for Transmit” and §14.6.2.2 “Transmitting a Message”. My first question, “how to ensure that a successful transmission occurs”, concerned the transition from §14.6.2.2 (transmitting) to §14.6.2.1 (configuring). My second question, “how to ensure that the configuration of a mailbox for transmit is successful”, concerns the opposite transition, from §14.6.2.1 (configuring) to §14.6.2.2 (transmitting).

    After step §14.6.2.1.4, where the mailbox is enabled, can the transmit request be set immediately? Or is there some delay to fulfil?

    I wonder if the loss of Tx message could not be linked to a too fast call of transmitting after configuring…

    Thanks in advance for your feedback.

  • I think clearing the TRS bit deliberately, as outlined in 14.6.2.1.1, is unnecessary. I think what is important is that any transmission initiated before is completed before reconfiguring the mailbox for the next transmission. So, it is sufficient to ensure TRS bit has cleared (signaling transmission of the previous frame has been completed). Transmission of that frame should not be aborted without good reason. 

    To answer your question "how to ensure that a successful transmission occurs", you do that by monitoring the TA and TRS bits. Of course, you can trigger an interrupt based on successful completion of transmission. 

    "How is it possible to ensure that the configuration of a mailbox for transmit has been successfully performed?"

    After ensuring completion of previous transmission, you only need to disable the mailbox and write the new MSGID. These happen in the realm of CPU clock cycles. 
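
The order suggested in this reply can be sketched on a host-side model. This is only an illustration under stated assumptions: the struct `ecan_model_t` and the function `mailbox_set_id` are hypothetical stand-ins, not the TI header definitions, and the re-enable step is assumed from §14.6.2.1.4 mentioned earlier in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified host-side stand-in for the eCAN registers involved
   (hypothetical struct, not the TI header layout). */
typedef struct {
    uint32_t trs;        /* models CANTRS: 1 = transmission still pending */
    uint32_t me;         /* models CANME:  1 = mailbox enabled            */
    uint32_t msgid[32];  /* models the per-mailbox MSGID registers        */
} ecan_model_t;

/* Load a new identifier only if the previous frame has left the mailbox.
   Returns false (and changes nothing) while TRS is still set. */
bool mailbox_set_id(ecan_model_t *can, unsigned mbox, uint32_t new_msgid)
{
    if (mbox >= 32U) {
        return false;                 /* no such mailbox in this model */
    }
    const uint32_t mask = (uint32_t)1 << mbox;

    if ((can->trs & mask) != 0U) {
        return false;                 /* previous frame pending: do not touch */
    }
    can->me &= ~mask;                 /* disable the mailbox */
    can->msgid[mbox] = new_msgid;     /* write the new identifier */
    can->me |= mask;                  /* re-enable the mailbox */
    return true;
}
```

A caller that gets `false` back would retry later instead of forcing the pending frame out of the mailbox, which matches the advice not to abort a transmission without good reason.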

    "I wonder if the loss of Tx message could not be linked to a too fast call of transmitting after configuring…"

    You only see the issue very sporadically, correct? You also mention that other mailboxes in the module continue to transmit just fine. So, this is likely some boundary (or race) condition that the application s/w is not handling properly. 

    Have you tried using the time-out mechanism to alert you to a transmission that doesn’t complete in time?

  • I agree that if it is previously (i.e. at the end of the transmit function) ensured that the TRS flag is reset to 0, then it sounds unnecessary to set the TRR flag at the beginning of the mailbox configuration function.

     

    Yes, it was correct that the issue is very sporadic on my test bench. However, at the beginning of the week, I received a CAN bus measurement file from the customer and loaded it into a replay block of my CANalyzer. In this CAN bus setup, the loss of the transmit message always occurs, statistically speaking, after several hundred seconds. Looking at the cycle time of the Tx message, unexpectedly long cycle times always occur. However, the error pattern (i.e. the curve of the message's current cycle time) is never identical from one replay of the measurement file to the next. I therefore agree that it must be a race condition that triggers a weakness in the software.

    Now, while the customer measurement file is replayed endlessly by my CANalyzer, I focus on the TRS flag to speed up the analysis, because I am not really keen on statistics. A counter of unexpected events has been implemented at the beginning of the mailbox configuration function.

       ECanaShadow_s.CANTRS.all = ECanaRegs.CANTRS.all ;
       if ( ECanaShadow_s.CANTRS.bit.TRS30 != (uint16_t)0U )
       {
         UnexpectedCtr1_u8 = (UnexpectedCtr1_u8 < MAX_UINT8) ? (UnexpectedCtr1_u8 + 1U) : (uint8_t)0U ;
       }else{ /* bit TRS30 is reset because Tx was successful */ }

    I monitor this counter UnexpectedCtr1_u8, which should always stay at the same value. Unfortunately, this is not the case!

    Even if it cannot become the solution for my customer, I have introduced polling on the transmit-acknowledge flag at the end of the transmit function. The CPU thus stays in a kind of idle state as long as the TA flag is not set. It seems to work pretty well for several hours: after a whole night, the Tx frequency of the message is equal to the expected one +/-5%. However, the counter UnexpectedCtr1_u8 simultaneously increases slightly! Why? How is it possible to get the TA flag set at the end of the transmit function (§14.6.2.2.3) and then the TRS flag set at the beginning of the mailbox configuration function? At the end of the transmit function, must the TRS flag be checked too (§14.6.2.2.4)?

    Maybe I could ignore the unexpected-events counter UnexpectedCtr1_u8, because the message Tx frequency is the expected one. However, to let the CPU do something else while the transmit function waits for the transmit acknowledge, I reworked the polling on the TA flag. At the end of the transmit function, if the TA flag is not set, a new variable stores the current stage of the transmit function, and the function returns. About 1 ms later, the transmit function is resumed and the TA flag is checked again to establish whether the transmission has finished successfully. This implementation gives bad results: the Tx message loss occurs again, and the counter UnexpectedCtr1_u8 increases and rolls over. Unfortunately, I am therefore still not able to reliably transmit several identifiers through the same mailbox!

     

    No, I have not tried the time-out mechanism. You mean the time-out mechanism linked to the registers CANTOC and CANTOS, don’t you? Moreover, what is the idea behind this time-out mechanism? Is it meant as a new tool for my analysis or as a workaround?

    For the polling on the transmit-acknowledge flag at the end of the transmit function, I have added GpioDataRegs.GPATOGGLE.bit.GPIO21=1 to the do-while loop. Because the error pattern is not always the same, I look at the maximum toggling time of GPIO21: 4.74 ms while the customer measurement file is replayed endlessly by my CANalyzer. This just gives me a rough idea of the time that can be required to get the transmit acknowledge. In my view, it depends on the CAN bus load and on race conditions.

  • This evening, as described in the original question from Nico, I have to state that I don’t know how to handle the transmit-acknowledge flag correctly! If I understand §14.7.5 correctly, the transmit-acknowledge flag can only be reset to false by the software, which must write true to it. Thus, after a successful transmission, if the software does not write true to a TA flag, the TA flag stays true until the next power-on reset. Is that correct?

     

    Another transmit-acknowledge question, linked to §14.6.2.2.5a: if true is written to the TA flag while it is unexpectedly already reset to false, what would happen? What would happen in §14.6.2.2.5b? Would it be possible to read false, or would it become an endless loop?

    In fact, I am not able to introduce §14.6.2.2.5b into the source code of the transmit function at all! As soon as the following lines are enabled, there is a software reset, probably triggered by the watchdog!

    #if 0
    do
    {
         eCanAShadow_s.CANTA.all = ECanaRegs.CANTA.all ;
         GpioDataRegs.GPATOGGLE.bit.GPIO21 = 1 ;   /* only for debug */
    } while( ( eCanAShadow_s.CANTA.all & (1UL<<MailboxIndex_e) ) != (uint32_t)0UL ) ;
    #endif

    How should §14.6.2.2.5b be implemented?

  • "However, the counter UnexpectedCtr1_u8 simultaneously increases slightly! Why?"

    It is hard for me to hazard a guess as to why. It depends on many factors such as: who else is transmitting, how often, whether there was any lost arbitration, error frames, what else is the application s/w doing etc. If all that the device is doing is to transmit repeatedly from a single mailbox the same message, then yes the numbers would never change. 

    "How is it possible to get the TA flag set at the end of the transmit function (§14.6.2.2.3)"

    Sorry, I don’t understand the question. TA flag gets set automatically after a frame is successfully transmitted. 

    "and then the TRS flag set at the beginning of the mailbox configuration function?"

    No need to mess with the TRS bit during configuration. Only when you want to initiate transmission. 

    "At the end of the transmit function, must the TRS flag be checked too (§14.6.2.2.4)?"

    Checking TA would suffice.

    "Moreover, what is the idea behind this time-out mechanism?"

    That mechanism exists to alert the application if a message has not been transmitted (or received) within a predefined time. Also, I wanted to clarify that I used the word "race condition" in the context of your application s/w, not the module itself. 
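
The same idea, being alerted instead of waiting forever, can also be applied in software to the TA polling loops discussed in this thread. The sketch below is not the CANTOC/CANTOS hardware time-out and is not taken from the TRM; it is a plain bounded poll, and the function name `wait_for_flag` and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Poll one bit of a status register at most max_polls times.
   Returns true as soon as the bit reads 1, false if the budget runs out.
   On target, reg would point at a register such as CANTA (assumption);
   on a host it can point at an ordinary variable for testing. */
bool wait_for_flag(const volatile uint32_t *reg, unsigned bit, uint32_t max_polls)
{
    if (bit >= 32U) {
        return false;                 /* no such bit in a 32-bit register */
    }
    const uint32_t mask = (uint32_t)1 << bit;

    for (uint32_t i = 0U; i < max_polls; ++i) {
        if ((*reg & mask) != 0U) {
            return true;              /* flag observed */
        }
    }
    return false;                     /* time-out: let the caller decide what to do */
}
```

Because the loop always terminates, a missing flag shows up as a `false` return that can be counted or logged, rather than as a watchdog reset.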

    "Thus, after a successful transmission, if the software does not write true to a TA flag, the TA flag stays true until the next power-on reset. Is that correct?"

    Correct. 

    "if true is written to the TA flag while it is unexpectedly already reset to false, what would happen?"

    From the TRM: "If the CPU tries to reset the bit while the CAN tries to set it, the bit is set."
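
The rules stated in these answers (a TA flag is cleared by writing 1, and a simultaneous set by the CAN module wins) can be captured in a one-line host-side model. This is a simplified assumption for illustration, not a description of the silicon; `canta_step` is a hypothetical name.

```c
#include <stdint.h>

/* One update step of a modelled CANTA register (simplified assumption):
   - bits the CPU writes as 1 are cleared, bits written as 0 are untouched;
   - bits the CAN module sets in the same step win over a CPU clear. */
uint32_t canta_step(uint32_t canta, uint32_t cpu_write, uint32_t hw_set)
{
    return (canta & ~cpu_write) | hw_set;
}
```

In this model, writing 1 to a flag that is already 0 simply leaves it at 0, so a subsequent wait-until-false loop would exit at once. A loop that never exits would instead suggest that the flag was never actually written as 1, or that it was set again in the meantime.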

  • Here below I try to clarify my questions especially based on the following diagram:

     

     

    The first topic focuses on the TRS flag. The unexpected-events counter UnexpectedCtr1_u8 is implemented at the beginning of the mailbox configuration function IPCan_DrvConfigMailboxTxJ1939OnTheFly(). Of course, such a check of the TRS flag is usually not required. Its purpose is to show what I cannot understand! At the end of the transmit function IPCan_DrvSendFrameExt(), the TA flag is checked: if it is set, IPCan_DrvSendFrameExt() returns successfully and the variable TxStdJ1939Msg_e moves from the case TxStdJ1939Msg_SendSafetyDataMsg to the case TxStdJ1939Msg_Finished, which is an idle state. According to §14.6.2.2.4, the TRS flag is reset by the eCAN at the same time. Thus, 1 ms later, during the next call of the function CPCan_AppTxStdJ1939Msg(), it should be possible to (re)configure the mailbox to send another message with a different identifier. How is it then possible that, at the beginning of IPCan_DrvConfigMailboxTxJ1939OnTheFly(), the TRS flag is sometimes set, so that the counter UnexpectedCtr1_u8 slowly increases?

    Why do you explain this effect with external factors (i.e. who else is transmitting, how often, whether there was any lost arbitration, error frames, what else the application s/w is doing, etc.)? As far as I understand the specification, if the TA.30 flag is set, the TRS.30 flag is reset at the same time. It is only a matter of mailbox 30, isn’t it? The other peripherals can perform their tasks without disturbing mailbox 30, can’t they?

     

    The second topic focuses on the TA flag. According to §14.6.2.2.5 of the specification, the TA flag must be reset. Step a requires writing true to TA.30. Step b requires waiting until TA.30 reads false. I could understand that the message is sent (i.e. that the TA flag is set) even if step b is not performed. However, I cannot understand why implementing §14.6.2.2.5b produces a kind of endless loop, which is stopped by the watchdog. Why do I get into such trouble when I just want to follow the TI specification and clear the transmit acknowledge before the next transmission?

     

    Even if my questions may sound strange, they clearly reflect my state of mind: my observations and measurements do not match the eCAN specification, especially for the TA and TRS flags! Thanks in advance for your answers and clarifications.


  • This is being handled offline.

  • Dear e2e community,

    I would like to update this post with the outcome of the offline discussion:

    After a debug session the following issue was found:
    When a shadow variable was created for the TA register in order to clear the TA flag of a specific mailbox, the compiler did not initialise the variable to all zeros. This unintentionally cleared the TA flags of other mailboxes. After forcing the variable to be initialised with all zeros, it could be ensured that only one TA flag is reset.

    If you are new to shadow registers, I would like to refer you to the "Reference Guide for the eCAN module" (LINK). In chapter 1.3.2.1 you will find Example 1-1, which touches on this topic.
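
To make the effect concrete, here is a small host-side model. It assumes, as discussed earlier in the thread, that a TA flag is cleared by writing 1 to it. The function names are hypothetical, and `clear_ta_xor` merely mirrors the read/XOR/write-back lines quoted earlier in the thread; whether the real project code behaved exactly like this model is an assumption.

```c
#include <stdint.h>

/* Host-side model of a write-1-to-clear register (simplified assumption):
   every bit written as 1 is cleared, bits written as 0 are left alone. */
static uint32_t w1c_write(uint32_t reg, uint32_t value)
{
    return reg & ~value;
}

/* Shadow starts from the register content and the target bit is toggled.
   mbox is assumed to be in the range 0..31. */
uint32_t clear_ta_xor(uint32_t canta, unsigned mbox)
{
    uint32_t shadow = canta;            /* other set flags are copied as 1 */
    shadow ^= (uint32_t)1 << mbox;      /* a set target bit becomes 0      */
    return w1c_write(canta, shadow);
}

/* Shadow starts from all zeros, so only the target bit is written as 1. */
uint32_t clear_ta_zeroed(uint32_t canta, unsigned mbox)
{
    uint32_t shadow = 0U;
    shadow |= (uint32_t)1 << mbox;
    return w1c_write(canta, shadow);
}
```

With TA30 and TA5 both set, the first variant in this model clears TA5 and leaves TA30 set, while the zero-initialised variant clears only TA30, which is the behaviour the fix was aiming for.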

    BR,
    Nico