CAN MODULE Problem

Andrea Marcianesi38

Other Parts Discussed in Thread: TMS320F28374D, CONTROLSUITE, C2000WARE

Recently we changed the CAN configuration of the TMS320F28374D to use a 1 Mbps baud rate with a 200 MHz peripheral clock.

We use the Texas Instruments API to initialize the CAN peripheral on both CPU1 and CPU2. More precisely, we call the function

CANBitRateSet(gst_Can[e_CanPort].u32_CanAddrBase, 200000000L, 1000000L);

In this way the CAN bit rate is set automatically, using a clock prescaler of 19 (effectively 19+1) and the 0x3440 entry of the following table:

static const uint16_t g_ui16CANBitValues[] =
{
    0x1100, // TSEG2 2, TSEG1 2, SJW 1, Divide 5
    0x1200, // TSEG2 2, TSEG1 3, SJW 1, Divide 6
    0x2240, // TSEG2 3, TSEG1 3, SJW 2, Divide 7
    0x2340, // TSEG2 3, TSEG1 4, SJW 2, Divide 8
    0x3340, // TSEG2 4, TSEG1 4, SJW 2, Divide 9
    0x3440, // TSEG2 4, TSEG1 5, SJW 2, Divide 10
    0x3540, // TSEG2 4, TSEG1 6, SJW 2, Divide 11
    0x3640, // TSEG2 4, TSEG1 7, SJW 2, Divide 12
    0x3740  // TSEG2 4, TSEG1 8, SJW 2, Divide 13
};
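
As a rough cross-check of the numbers above, the sketch below decodes a bit-timing value and computes the resulting bit rate. It assumes the usual C_CAN/DCAN layout of the lower 16 bits of the bit-timing register (BRP in bits 5:0, SJW in bits 7:6, TSEG1 in bits 11:8, TSEG2 in bits 14:12, each stored as the applied value minus one) and ignores the BRP extension field. The names can_btr_fields_t, can_btr_decode and can_btr_bit_rate are hypothetical helpers, not part of driverlib; the layout should be checked against the device reference manual.

```c
#include <stdint.h>

/* Decoded view of the lower 16 bits of a C_CAN/DCAN-style bit-timing value.
 * Assumed layout: BRP[5:0], SJW[7:6], TSEG1[11:8], TSEG2[14:12]. */
typedef struct {
    uint32_t brp;    /* prescaler as applied (field value + 1) */
    uint32_t sjw;    /* sync jump width in time quanta (field value + 1) */
    uint32_t tseg1;  /* time quanta before the sample point (field value + 1) */
    uint32_t tseg2;  /* time quanta after the sample point (field value + 1) */
} can_btr_fields_t;

static can_btr_fields_t can_btr_decode(uint32_t btr)
{
    can_btr_fields_t f;
    f.brp   = (btr & 0x3Fu) + 1u;
    f.sjw   = ((btr >> 6) & 0x3u) + 1u;
    f.tseg1 = ((btr >> 8) & 0xFu) + 1u;
    f.tseg2 = ((btr >> 12) & 0x7u) + 1u;
    return f;
}

/* Bit rate = CAN clock / (prescaler * time quanta per bit). */
static uint32_t can_btr_bit_rate(uint32_t can_clk_hz, uint32_t btr)
{
    can_btr_fields_t f = can_btr_decode(btr);
    uint32_t tq_per_bit = 1u + f.tseg1 + f.tseg2;  /* sync segment + TSEG1 + TSEG2 */
    return can_clk_hz / (f.brp * tq_per_bit);
}
```

Under these assumptions, can_btr_bit_rate(200000000u, 0x3440u | 19u) evaluates 200 MHz / (20 x 10) and returns 1000000, i.e. 1 Mbps, which is consistent with the table entry and prescaler quoted in the post.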

Nevertheless, when the transmitting unit sends 4 messages back to back, 1 usec apart, the DSP sometimes loses one message.

The CAN queue is 14 messages deep, so we cannot explain why we are seeing this problem.

Can you help us please?

P.S. We have already tried setting the configuration register to 0x0700 instead of 0x3440, following the example in the manual, but the problem was still there: sometimes the DSP does not receive a message.

Thank you for your help,

Andrea Marcianesi.

over 3 years ago

0 Hareesh Janakiraman over 3 years ago

TI__Guru* 93985 points

Nevertheless, once the transmitting unit send 4 messages continuously, 1usec far from each other, sometimes the DSP loose one message.

Is the 28374D the receiver?

The Can queue is 14 messages depth, so we are not explaining why we are facing this problem.

Are you using the FIFO mode?

Please explain your setup. How many nodes are there? Who is transmitting to whom?

0 Andrea Botarelli over 3 years ago in reply to Hareesh Janakiraman

Prodigy 70 points

Hi Hareesh,

Here follows our feedback:

CAN network is composed of the following nodes:
- 28374D core 1
- 28374D core 2
- Cortex M MCU from other vendor
28374D is using FIFO mode for CAN RX.
In the described sequence Cortex M is working as transmitter while 28374D core 1 is the receiver.

Please feel free to ask for other details.

Best regards,

Andrea

0 Hareesh Janakiraman over 3 years ago in reply to Andrea Botarelli

TI__Guru* 93985 points

In SPRZ412L, there is an item "During DCAN FIFO Mode, Received Messages May be Placed Out of Order in the FIFO Buffer". I wonder if what you are seeing is a manifestation of that. Could you provide some statistics, i.e., how many frames elapse before you see a missing frame?

0 Andrea Botarelli over 3 years ago in reply to Hareesh Janakiraman

Prodigy 70 points

In our case the lost frame is actually not present in the FIFO, so it does not seem to be an out-of-order issue.

During our tests we estimated a loss rate of about 1-3% of sent messages.

0 Hareesh Janakiraman over 3 years ago in reply to Andrea Botarelli

TI__Guru* 93985 points

Andrea,

I am a bit confused as to why you dealt at length with different CANBTR values in your first post. The bit-timing values have no correlation with "missing" frames. Any problem with the timing would manifest as error frames right away. Missing a frame among a hundred frames cannot be a bit-timing issue.

It appears you have only two nodes and I assume your bus length is fairly short and that the bus is properly terminated (Please download my Application report http://www.ti.com/lit/sprace5 and look at the Debug tips provided).

I presume you have a CAN bus analyzer connected and that keeps track of the number of frames transmitted. In other words, you are certain that the "missing" frames were indeed transmitted on the bus.

Do you use acceptance mask filtering (and hence accept multiple message IDs)?
or is it the same MSGID that is transmitted over and over again?
Do you change the MSGID of the receive mailbox or is it the same throughout?
Does the problem vanish if you don't use FIFO mode?
What is your CANBTR value?
What clock source do you use?

0 Andrea Botarelli over 3 years ago in reply to Hareesh Janakiraman

Prodigy 70 points

Hareesh,

I will try to anticipate some explanations and my colleagues working on the issue will add more details asap:

You said we have only 2 nodes but I have already specified that they are actually 3 (i.e. 1 CAN node for each DSP core + 1 CAN node for the Cortex M).
We are pretty sure that the lost CAN frame was correctly propagated through the line since we monitored digital signal on CAN RX pin of the receiver node (28374D core 1). The following picture shows the lost frame with ID 0x1D210008 that is correctly decoded by our bus analyzer as well as through external sniffer attached to bus (P-CAN).

The frame arriving at CAN RX seems correct but it is still missing in the RX FIFO and that's the reason why we supposed wrong timing configuration for the peripheral (i.e. as you know, bad sizing of CAN propagation time-slots could lead to wrong bit sampling).

Anyway this is just our assumption.

0 Andrea Marcianesi38 over 3 years ago in reply to Hareesh Janakiraman

Intellectual 355 points

Good morning Hareesh,

Attached you will find a text file with the configuration of our CAN, for both CPU1 and CPU2.

This will answer most of your questions.

Some clarifications:

1) We did not try removing FIFO mode: this test requires some effort, and at the moment we don't have the time.

2) The clock source is an external 20 MHz oscillator which we have used in many other projects in the same configuration, so I would rule it out as the cause.

Thank you very much for your help,

Andrea Marcianesi.

7357.CANerror.txt

#define CANCLK						                             ( 200000000L)			// [Hz]
#define CAN_BBR_ORION                                  (  1000000L )
#define CAN_RX_FIFO_LEN 		                                    (14)
#define CAN_RX_START_ADDR		                                     (2)		// First MsgObj of the FIFO
#define CAN_RX_START_ADDR2		 (CAN_RX_START_ADDR + CAN_RX_FIFO_LEN)		// First MsgObj of the FIFO
#define TX_MAILBOX			                                         (1)
#define CAN_BUS_OFF_RECOVERY_TIME                          (200000u) /*! expressed in number of sys clock cycles */

typedef enum
{
	e_CAN_A,
	e_CAN_B
}e_CAN;

typedef enum
{
	dataMw_log_device_node_id_inverter_0_connectivity_0 = 0,
	dataMw_log_device_node_id_inverter_0_supervisor_0 = 1,
	dataMw_log_device_node_id_inverter_0_booster_0 = 2,
	dataMw_log_device_node_id_inverter_0_inverter_0 = 3,

	dataMw_log_device_node_id_NUM
} dataMw_log_device_node_id_t;

typedef struct
{
	uint32_t u32_CanAddrBase;
	tCANMsgObject *st_RxMsgObj;
	//uint16_t *pu16_MsgDataBuffer;
} st_CAN;



st_CAN gst_Can[2] = { {CANA_BASE, &st_RXCANMessage1},
		              {CANB_BASE, &st_RXCANMessage2}};

//During hardware init we call this function to init the CAN using the driverlib Texas function.

CAN_Init_Orion (e_CAN_B, CANCLK, CAN_BBR_ORION, dataMw_log_device_node_id_inverter_0_booster_0);

//The body of the function is the following
void CAN_Init_Orion(e_CAN e_CanPort, uint32_t  u32_PerheralClock, uint32_t  u32_BaudRate, dataMw_log_device_node_id_t e_destination)
{
    EALLOW;
    if (e_CanPort == e_CAN_A)
    {
        CpuSysRegs.PCLKCR10.bit.CAN_A = 1;
    }
    else
    {
        CpuSysRegs.PCLKCR10.bit.CAN_B = 1;
    }
    EDIS;

    dataMw_powercom_can_frame_header_t mob_filter = {
                 .ID = 0,
                 //.destination = (1 << dataMw_log_device_node_id_inverter_0_inverter_0),
                 .destination = (1 << e_destination),
                 .priority = 0
    };

    CANDisable(gst_Can[e_CanPort].u32_CanAddrBase);
    CANInit(gst_Can[e_CanPort].u32_CanAddrBase);
    // Setup CAN to be clocked off the M3/Master subsystem clock
    CANClkSourceSelect(gst_Can[e_CanPort].u32_CanAddrBase, 0);
    // Set up the bit rate for the CAN bus.  This function sets up the CAN
    // bus timing for a nominal configuration.  You can achieve more control
    // over the CAN bus timing by using the function CANBitTimingSet() instead
    // of this one, if needed.
    // In this example, the CAN bus is set to 500 kHz.  In the function below,
    // the call to SysCtlClockGet() is used to determine the clock rate that
    // is used for clocking the CAN peripheral.  This can be replaced with a
    // fixed value if you know the value of the system clock, saving the extra
    // function call.  For some parts, the CAN peripheral is clocked by a fixed
    // 8 MHz regardless of the system clock in which case the call to
    // SysCtlClockGet() should be replaced with 8000000.  Consult the data
    // sheet for more information about CAN peripheral clocking.
    CANBitRateSet(gst_Can[e_CanPort].u32_CanAddrBase, u32_PerheralClock, u32_BaudRate);

    CANEnable(gst_Can[e_CanPort].u32_CanAddrBase);

    /*! BUS OFF ENABLE */
    CANAutoBusOffEnable(gst_Can[e_CanPort].u32_CanAddrBase, CAN_BUS_OFF_RECOVERY_TIME);

    // RX message FIFO
    gst_Can[e_CanPort].st_RxMsgObj->ui32MsgID = mob_filter.ID;
    gst_Can[e_CanPort].st_RxMsgObj->ui32MsgIDMask = mob_filter.ID;//u32_MsgIdMask;
    gst_Can[e_CanPort].st_RxMsgObj->ui32Flags = MSG_OBJ_FIFO | MSG_OBJ_EXTENDED_ID | MSG_OBJ_USE_EXT_FILTER;
    gst_Can[e_CanPort].st_RxMsgObj->ui32MsgLen = 8;
    //gst_Can[e_CanPort].st_RxMsgObj->pucMsgData = (unsigned char*)gst_Can[e_CanPort].pu16_MsgDataBuffer;

    for (uint32_t i=0u; i < (CAN_RX_FIFO_LEN-1); i++ )
    {
        CANMessageSet(gst_Can[e_CanPort].u32_CanAddrBase, CAN_RX_START_ADDR + i, gst_Can[e_CanPort].st_RxMsgObj, MSG_OBJ_TYPE_RX);
    }

    // Configure the last element of the FIFO
    gst_Can[e_CanPort].st_RxMsgObj->ui32Flags = MSG_OBJ_EXTENDED_ID | MSG_OBJ_USE_EXT_FILTER;
    CANMessageSet(gst_Can[e_CanPort].u32_CanAddrBase, CAN_RX_START_ADDR + (CAN_RX_FIFO_LEN - 1), gst_Can[e_CanPort].st_RxMsgObj, MSG_OBJ_TYPE_RX);

}

//As you can see we use the Texas driverlib functions except for the function CANAutoBusOffEnable, whose body is the following
void CANAutoBusOffEnable(uint32_t ui32Base, uint32_t uiNumSysClockRecovery)
{
    // Check the arguments.
    ASSERT(CANBaseValid(ui32Base));

    // Set the ABO (auto-bus-on) bit in the control register and program the recovery time.
    HWREGH(ui32Base + CAN_O_CTL) = HWREGH(ui32Base + CAN_O_CTL) | CAN_CTL_ABO;
    HWREGH(ui32Base + CAN_O_ABOTR) = uiNumSysClockRecovery;
}

//Next we poll the can bus maximum every 600usec, which is actually the worst 
//case for the main cycle period.

0 Hareesh Janakiraman over 3 years ago in reply to Andrea Marcianesi38

TI__Guru* 93985 points

I regret I am unable to glean answers to my questions from looking at the code snippet you sent. Please provide answers for 1,2,3 & 5. You could answer 5 by simply looking at the CANBTR register value in CCS.

0 Andrea Marcianesi38 over 3 years ago in reply to Hareesh Janakiraman

Intellectual 355 points

Dear Hareesh,

unfortunately, we set the values of all CAN registers through the Texas Instruments driverlib, so we do not write the register values directly. Moreover, in CCS 8.1 we are not able to see the values of these registers in the debug window, so the only way we have to see them is to define a global variable or to read the code.

Anyway, these are the answers to your questions:

Do you use acceptance mask filtering (and hence accept multiple message IDs)?

No. We set the msg.ID equal to 0.

or is it the same MSGID that is transmitted over and over again?

On the CAN bus the MSGID is different.

Do you change the MSGID of the receive mailbox or is it the same throughout?

We don’t change the MSGID of the receive mailbox

Does the problem vanish if you don't use FIFO mode?

We didn't try, because this requires changing the code and at the moment we don't have the resources to do that.

What is your CANBTR value?

We used the TI driverlib function CANBitRateSet to set the CANBTR register, with the following call:

CANBitRateSet(gst_Can[e_CanPort].u32_CanAddrBase, 200000000L, 1000000L);

Reading the TI function, the value that should be set is the following

0x3440, // TSEG2 4, TSEG1 5, SJW 2, Divide 10

The CAN bit rate is set automatically in this way, using a clock prescaler of 19 (effectively 19+1) and taking the value from the following array:

static const uint16_t g_ui16CANBitValues[] =
{
    0x1100, // TSEG2 2, TSEG1 2, SJW 1, Divide 5
    0x1200, // TSEG2 2, TSEG1 3, SJW 1, Divide 6
    0x2240, // TSEG2 3, TSEG1 3, SJW 2, Divide 7
    0x2340, // TSEG2 3, TSEG1 4, SJW 2, Divide 8
    0x3340, // TSEG2 4, TSEG1 4, SJW 2, Divide 9
    0x3440, // TSEG2 4, TSEG1 5, SJW 2, Divide 10
    0x3540, // TSEG2 4, TSEG1 6, SJW 2, Divide 11
    0x3640, // TSEG2 4, TSEG1 7, SJW 2, Divide 12
    0x3740  // TSEG2 4, TSEG1 8, SJW 2, Divide 13
};

What clock source do you use?

It is an external 20 MHz oscillator, already used in several of our other projects. Internally we use a 200 MHz source for the CAN module.

Regards,

Andrea Marcianesi.

0 Hareesh Janakiraman over 3 years ago in reply to Andrea Marcianesi38

TI__Guru* 93985 points

Sorry I don’t understand. You have configured a 14-level FIFO using 14 message objects. You say you don't use filtering and that you don’t change the MSGID of the receive mailbox. I presume the MSGID of the receive mailbox is 0x1D210008 and never changes. But you also say " On the CAN the MSGID is different". What do you mean by this?

The CANBTR register is a 32-bit register. You can easily look at the value of the register using the View-->Registers option.

0 Andrea Marcianesi38 over 3 years ago in reply to Hareesh Janakiraman

Intellectual 355 points

"Sorry I don’t understand. You have configured a 14-level FIFO using 14 message objects. You say you don't use filtering and that you don’t change the MSGID of the receive mailbox. I presume the MSGID of the receive mailbox is 0x1D210008 and never changes. But you also say " On the CAN the MSGID is different". What do you mean by this? "

I mean that the message ID transmitted by the Cortex is not fixed: it sends several messages with different IDs over the bus. The DSP has no ID mask, so it receives all of those messages. It has no acceptance mask and is set to receive all messages.

" You can easily look at the value of the register using the View-->Registers option."

Actually in CCS 8.1 that option doesn't work.

By the way, the CANBTR is set to 0x3440.

Regards,

Andrea Marcianesi.

0 Hareesh Janakiraman over 3 years ago in reply to Andrea Marcianesi38

TI__Guru* 93985 points

OK. I got confused when you said you don't use acceptance mask filtering. This is the case when the MSGID of the Receive MBX is fixed and it accepts only that ID. In your case, you do use the acceptance mask. However, you have configured all bits to be "don't care", so that any MSGID is received.
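
The filtering rule described here can be written down as a one-line check. This is only a sketch of the usual C_CAN/DCAN convention, assumed here, in which a mask bit of 1 means the corresponding identifier bit must match and a mask bit of 0 means "don't care". The helper name can_id_accepted is hypothetical, and the direction and extended-ID mask bits are left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a received identifier passes the acceptance filter of a
 * message object. Assumed convention: mask bit 1 = must match, 0 = don't care. */
static bool can_id_accepted(uint32_t rx_id, uint32_t obj_id, uint32_t mask)
{
    /* XOR marks the differing bits; only the ones selected by the mask matter. */
    return ((rx_id ^ obj_id) & mask) == 0u;
}
```

With the object ID and the mask both set to 0, as in the initialization code posted earlier, can_id_accepted(0x1D210008u, 0u, 0u) is true for any identifier, whereas a full 29-bit mask of 0x1FFFFFFF would accept only an exact match.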

Could you describe how message reception is handled? I presume you generate an interrupt when the FIFO is filled?

0 Hareesh Janakiraman over 3 years ago in reply to Hareesh Janakiraman

TI__Guru* 93985 points

More questions:

Have you noticed any bits (related to errors) being set in the CANES register?
Is there a possibility that the "lost" message could have been overwritten by a new message? i.e. the message was indeed received but was overwritten before it could be read? Do you monitor the MsgLst bit?
Do you see the issue if you reduce the bit-rate?

0 Andrea Marcianesi38 over 3 years ago in reply to Hareesh Janakiraman

Intellectual 355 points

Hi Hareesh,

in the following, my answers,

Could you describe how message reception is handled?

We poll the FIFO in the main cycle, every 600 usec in the worst case (in most cases we are able to poll the FIFO faster).

During this process we empty all the mailboxes.

I presume you generate an interrupt when the FIFO is filled?

No, we poll the FIFO in the main cycle.

Have you noticed any bits (related to errors) being set in the CANES register?

No, we didn't. The message is simply not in the FIFO. The fact that the register view in CCS 8.1 doesn't work doesn't help us.

Is there a possibility that the "lost" message could have been overwritten by a new message? i.e. the message was indeed received but was overwritten before it could be read?

Theoretically it is not possible, because filling the 14-mailbox FIFO at 1 Mbps within a poll cycle of at most 600 usec should not be possible. Moreover, the Cortex which sends the messages stops immediately if the DSP doesn't answer, so we can freeze the FIFO in the DSP, and when we lose a message the FIFO is not full.

Do you monitor the MsgLst bit?

We poll the CAN_NDAT_21 register bits. As long as a bit is set in this register, we read the message in the corresponding mailbox.

Do you see the issue if you reduce the bit-rate?

We didn't try. We would have to lower the bit rate in both the DSP and the Cortex applications, so we haven't tried it yet.

Andrea

0 Hareesh Janakiraman over 3 years ago in reply to Andrea Marcianesi38

TI__Guru* 93985 points

Andrea,

Nevertheless, once the transmitting unit send 4 messages continuously, 1 usec far from each other, sometimes the DSP loose one message.

Can you please explain "1 usec far from each other"? Is the gap between each frame 1 uS? You are already transmitting at 1 Mbps. There is at least an 11-bit recessive period between each frame. Where does this 1 uS come in? Does this mean your bus load is 100%?

We poll the fifo in the main cycle, every 600usec in the worst case (in most of the case we are able to poll the fifo faster).

Exactly how do you poll? Assuming an average length of 120 bits/frame, it is about 1680 bits for 14 frames, so we are looking at 1.68 ms for filling up the buffer (these are very approximate calculations; I am ignoring the effect of stuff bits, IFS etc). You mention you poll the FIFO every 600 uS "worst case". What does "worst case" mean? Do you have a CPU Timer interrupt generating a polling request every 600 uS?
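
The back-of-the-envelope estimate above can be reproduced with a few lines of C. The sketch uses the same simplifications as the text (a flat 120 bits per frame, no stuff bits or interframe space); the function name can_fifo_fill_time_us is a hypothetical helper introduced only for this illustration.

```c
#include <stdint.h>

/* Time, in microseconds, needed to put `frames` frames of `bits_per_frame`
 * bits each on a bus running at `bit_rate_bps`. Stuff bits and interframe
 * space are deliberately ignored, as in the rough estimate in the text. */
static uint32_t can_fifo_fill_time_us(uint32_t frames,
                                      uint32_t bits_per_frame,
                                      uint32_t bit_rate_bps)
{
    uint64_t total_bits = (uint64_t)frames * bits_per_frame;
    return (uint32_t)((total_bits * 1000000u) / bit_rate_bps);
}
```

For 14 frames of 120 bits at 1 Mbps this gives 1680 us, comfortably longer than the 600 us worst-case polling period quoted in the thread, which is why a plain FIFO overflow looks unlikely on paper.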

Theoretically is not possible because to fill the 14 mailboxes fifo @1Mbps, with a poll cycle of maximum 600usec should not be possible.

Based on my calculations above, it would take at least 1.68 ms for filling up the FIFO. Since you poll the FIFO every 600 uS, are you saying there is no chance for a FIFO overflow? If so, am I correct to say that the FIFO never gets full because you poll every 600us and empty it?

Moreover the Cortex which send the messages, stops immediately, if the DSP doesn't answer so we froze the fifo in the DSP and the fifo, when we lose a message is not full.

Please clarify "doesn’t answer". Are you referring to the ACK generated by the DCAN? i.e. if the MCU does not ACK a frame from Cortex, the Cortex stops sending messages immediately?

We poll for the CAN_NDAT_21 register bits. Until there is a bit set in this register, we read the messages in the corresponding mailbox.

Sorry I am unable to understand what you are trying to say here. Please clarify.

0 Andrea Botarelli over 3 years ago in reply to Hareesh Janakiraman

Prodigy 70 points

Hi,

I am iterating on a few points:

Nevertheless, once the transmitting unit send 4 messages continuously, 1 usec far from each other, sometimes the DSP loose one message. >> This is our mistake: inter-frame time in the faulty sequence is actually about 14us (NB. This is actually the inter-frame time before the lost frame). Please check oscilloscope snapshot in my previous post.
Please clarify "doesn’t answer". Are you referring to the ACK generated by the DCAN? i.e. if the MCU does not ACK a frame from Cortex, the Cortex stops sending messages immediately? >> No it is a bit more complex scenario: Cortex CAN client actually sends a sequence of 4 frames forming a single protocol packet, if 1 or more packet frames are lost on CAN RX, DSP discards the packet and does not send a reply to the Cortex.

0 Hareesh Janakiraman over 3 years ago in reply to Andrea Botarelli

TI__Guru* 93985 points

Are you using the IF3 register set?

0 Andrea Marcianesi38 over 3 years ago in reply to Hareesh Janakiraman

Intellectual 355 points

No Hareesh, we don't use IF3.

What does "worst case" mean? Do you have a CPU Timer interrupt generating a polling request every 600 uS?

No, we pass through the same point in the main loop every 600 us, so the function which polls the FIFO is executed every 600 usec.

If so, am I correct to say that the FIFO never gets full because you poll every 600us and empty it?

yes

Exactly how do you poll?

We poll the CAN_NDAT_21 register bits. As long as a bit is set in this register, we read the message in the corresponding mailbox.

Andrea

0 Hareesh Janakiraman over 3 years ago in reply to Andrea Marcianesi38

TI__Guru* 93985 points

Andrea,

It is better to take the discussion offline. I already sent you a friendship request via e2e. Please accept that and let us work toward a call/Webex.

0 Andrea Botarelli over 3 years ago in reply to Hareesh Janakiraman

Prodigy 70 points

Hi Hareesh,

Following up the call on webex, I would like to clarify some points:

Since you supposed there is not ACK from DSP node when frame is lost, can you confirm that ACK is sent by receiver node regardless of acceptance mask set on the MOBs? In other words: if a valid CAN frame is received but its message ID does not match the acceptance mask, ACK is sent on the bus?
Can you please specify register where MsgLost flag can be monitored?

Thanks in advance.

Andrea

0 Hareesh Janakiraman over 3 years ago in reply to Andrea Botarelli

TI__Guru* 93985 points

Andrea,

Since you supposed there is not ACK from DSP node when frame is lost,

Yes, it is conceivable that there is not an ACK from the targeted CAN node (for whatever reason). However, bear in mind that the ACK can always come from the other CAN node and/or the CAN bus analyzer. If no node provides an ACK, the transmitting node will keep re-transmitting that frame forever.

can you confirm that ACK is sent by receiver node regardless of acceptance mask set on the MOBs?

Yes. I have explained this clearly in section 3.1 of my application report (http://www.ti.com/lit/sprace5)

In other words: if a valid CAN frame is received but its message ID does not match the acceptance mask, ACK is sent on the bus?

Yes.

Can you please specify register where MsgLost flag can be monitored?

This can be monitored using the MsgLst bit of the mailbox via the CAN_IF1MCTL register. However, I really doubt that the frame is being overwritten. Your FIFO is 14 deep and it seldom gets filled beyond 4 before it is read and emptied. Maybe you could try this: toggle a GPIO pin in the routine that reads the FIFO and, with a scope, measure the longest time interval between the reads.
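
As a software complement to the GPIO and scope measurement suggested here, the longest gap between two FIFO reads can also be tracked in firmware. The following is a minimal sketch, assuming a free-running 32-bit tick counter is available as a timestamp source; poll_monitor_t, poll_monitor_init and poll_monitor_mark are hypothetical names, and how the timestamp is read on the target is left open.

```c
#include <stdint.h>

/* Tracks the longest interval between successive calls of the polling routine. */
typedef struct {
    uint32_t last_ticks;    /* timestamp of the previous poll */
    uint32_t max_interval;  /* longest gap seen so far, in ticks */
    int      primed;        /* 0 until the first poll has been recorded */
} poll_monitor_t;

static void poll_monitor_init(poll_monitor_t *m)
{
    m->last_ticks = 0u;
    m->max_interval = 0u;
    m->primed = 0;
}

/* Call once at the start of every FIFO poll with the current tick count. */
static void poll_monitor_mark(poll_monitor_t *m, uint32_t now_ticks)
{
    if (m->primed) {
        /* Unsigned subtraction stays correct across a counter wrap-around. */
        uint32_t dt = now_ticks - m->last_ticks;
        if (dt > m->max_interval) {
            m->max_interval = dt;
        }
    }
    m->last_ticks = now_ticks;
    m->primed = 1;
}
```

Calling poll_monitor_mark() at the top of the polling routine and inspecting max_interval after a failure would show whether the 600 us worst case actually holds; the scope measurement remains the more direct check.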

0 Andrea Marcianesi38 over 3 years ago in reply to Hareesh Janakiraman

Intellectual 355 points

Good evening Hareesh,

here are some results from tests we ran in Italy.

In this test we do not use the CAN bus for the Booster (CPU2), so that only the Cortex and the Inverter (CPU1) communicate on the CAN bus. This way, only the Inverter can ACK a message from the Cortex.

On CPU1, we raise a pin before entering the FIFO polling routine and lower it when we exit the routine, after emptying the FIFO.

The results are attached: the blue line represents the four messages sent by the Cortex on the CAN bus. Three messages have a payload of 8 bytes; the last message has a payload of 6 bytes.

The yellow line represents the polling of the CAN FIFO. As you can see, we poll the FIFO fast enough, but unfortunately the problem is still there: we lose a message in the FIFO, so the DSP does not answer the sequence of four messages, the Cortex times out and stops sending messages, and we can freeze the FIFO state on the DSP to analyze it.

More information:

when we have the problem, the register CAN_BTR at 0x4800C is set to 0x3453;

the register CAN_IF1MCTL at 0x4810C is set to 0x0186;

the register CAN_ES at location 0x48004 reads 0x0018 on the first read after the error, and 0x0007 on the next read;

the register CAN_ERRC at location 0x48008 reads 0x0000 after the error.
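
To make these raw values easier to read, here is a small decoding sketch. It assumes the standard C_CAN/DCAN bit positions (message control: NewDat bit 15, MsgLst bit 14, TxRqst bit 8, EoB bit 7, DLC bits 3:0; error and status: LEC bits 2:0, TxOK bit 3, RxOK bit 4, EPass bit 5, EWarn bit 6, BOff bit 7). The macro and function names are hypothetical, not driverlib definitions, and the positions should be verified against the device reference manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed message-control (IFxMCTL) bit positions. */
#define MCTL_NEWDAT   (1u << 15)
#define MCTL_MSGLST   (1u << 14)
#define MCTL_TXRQST   (1u << 8)
#define MCTL_EOB      (1u << 7)
#define MCTL_DLC_MASK 0xFu

/* Assumed error-and-status (ES) bit positions. */
#define ES_LEC_MASK   0x7u
#define ES_TXOK       (1u << 3)
#define ES_RXOK       (1u << 4)
#define ES_EPASS      (1u << 5)
#define ES_EWARN      (1u << 6)
#define ES_BOFF       (1u << 7)

/* True when the message object reports that a frame was overwritten. */
static bool mctl_message_lost(uint32_t mctl)
{
    return (mctl & MCTL_MSGLST) != 0u;
}

/* Data length code stored in the message object. */
static uint32_t mctl_dlc(uint32_t mctl)
{
    return mctl & MCTL_DLC_MASK;
}

/* True when the node is in error-warning, error-passive or bus-off state. */
static bool es_in_error_state(uint32_t es)
{
    return (es & (ES_EPASS | ES_EWARN | ES_BOFF)) != 0u;
}

/* Last error code; 7 conventionally means "no bus event since the last read". */
static uint32_t es_last_error_code(uint32_t es)
{
    return es & ES_LEC_MASK;
}
```

Under these assumptions, 0x0186 decodes as TxRqst and EoB set with a DLC of 6 and with NewDat and MsgLst clear, 0x0018 is TxOK plus RxOK with no error flags, and 0x0007 is just the "no event since last read" code. Which message object was loaded into IF1 when CAN_IF1MCTL was read is not stated in the post, so the first value says nothing definite about the receive FIFO objects.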

Hopefully tomorrow we will try lowering the baud rate.

0 Andrea Marcianesi38 over 3 years ago in reply to Andrea Marcianesi38

Intellectual 355 points

Hi Hareesh,

I also tried to lower the CAN rate to 1Msps, but the problem is still there: we lose a message from the Cortex.

But the problem is certainly not the ACK bit. Let me explain what happens.

If we poll the FIFO while we are receiving a sequence of messages, as in the picture I attached yesterday, we have the problem and lose one message.

If we receive all 4 messages and only then poll the FIFO, we never have the problem. I built a version of the DSP CPU1 Inverter firmware which polls the FIFO every 3 msec.

Since the Cortex sends these 4 messages every 500 msec at 1 Mbps, we are basically never polling the FIFO while receiving. In this way we never lose a message.

The situation is depicted in the picture below

Clearly, this is not a solution; it is just a test which demonstrates that the error occurs when we poll the FIFO while we are receiving messages, that is, in the receiving routine. It is not a physical-layer or initialization problem.

I attached our receiving routine (see code.c). My guess is that the problem occurs when we call the function CANMessageGet(), which is part of the Texas Instruments driverlib. In the code I left some comments to help you understand the situation.

So the question is: what happens if we receive a new message while we are executing the function CANMessageGet()?

I think there could be a race condition between the receive flag bits and the incoming messages, which could lead to a lost message.

This is a situation you can easily reproduce yourself: continuously send a sequence of messages and call the CANMessageGet() driverlib function while you are receiving them. You will see that you can lose some FIFO messages.

I think this suggests that polling the FIFO and emptying it need to be atomic.

Regards,

Andrea Marcianesi.

Below is the code I wanted to attach; since I cannot see the attachment, I have pasted it here.

7144.code.c

//This is the function we call to poll the receiving fifo in the main
void Orion_Can_TxRx(void)
{
//uint16_t *buffer_tmp;
    // This is the loop where we empty the FIFO: we call CAN_Rx_Orion() for as long as u16_CAN_RxAvailable() returns true.
    // The function u16_CAN_RxAvailable() contains a "return ((CANStatusGet(gst_Can[e_CanPort].u32_CanAddrBase, CAN_STS_NEWDAT) & RX_MAILBOX_FLAGS) != 0);".
    // So we stay in this loop for as long as there are received messages to process.
    while (u16_CAN_RxAvailable(e_CAN_A))
    {    	

        CAN_Rx_Orion(e_CAN_A, &msg_id, msg_buffer, &msg_len);

        dataMw_powercom_can_frame_t can_frame = {
                .id.ID = msg_id,
                .payload.p = (void *)msg_buffer,
                .payload.size = msg_len
        };

        can_frame.payload.size = (msg_len >> 1);
        dataMw_powercom_rx_can_frame((st_dataMw_powercom_t*)dataMw_powercom_instance_get(), &can_frame);
    }

    if (u16_CAN_TxAvailable(e_CAN_A) == true)
    {

        // ******************************************
        // ************ Empty the queue *************
        // ******************************************

        dataMw_powercom_can_frame_t *can_frame = dataMw_powercom_tx_can_frame((st_dataMw_powercom_t*)dataMw_powercom_instance_get());
        if(can_frame != NULL)
        {
            //buffer_tmp = (uint16_t*)can_frame->payload.p;
            msg_id = can_frame->id.ID;
            msg_len = can_frame->payload.size;


            CAN_Tx_Orion(e_CAN_A, msg_id, (uint16_t*)can_frame->payload.p, (msg_len << 1));
        }


    }
}

//In my opinion the problem is inside the function CAN_Rx_Orion, and more precisely in the function CANMessageGet().
//Somehow, if a new message arrives while we are executing the function CANMessageGet(), we lose this message.

void CAN_Rx_Orion(e_CAN e_CanPort, uint32_t  *pu32_DestMsgId, uint16_t *pu16_DestDataBuffer, uint32_t *pu32_MsgLen)
{
//#ifdef CPU1
uint32_t  u32_RxFlags;
uint16_t i;
uint16_t u16_Mailbox;
uint16_t pu16_TmpRxBuffer[8];

    u32_RxFlags = CANStatusGet(gst_Can[e_CanPort].u32_CanAddrBase, CAN_STS_NEWDAT) >> 1;
    u32_RxFlagBuffer[index]=u32_RxFlags;
    if (index++ == 20)
        index = 0;
    for (i = 0; i < CAN_RX_FIFO_LEN; i++)
    {
        u16_Mailbox = CAN_RX_START_ADDR + i;
        if ((u32_RxFlags & (1 << i)) != 0)
            break;      // First valid element in the FIFO
    }
    gst_Can[e_CanPort].st_RxMsgObj->pucMsgData = (unsigned char*)pu16_TmpRxBuffer;
    CANMessageGet(gst_Can[e_CanPort].u32_CanAddrBase, u16_Mailbox, gst_Can[e_CanPort].st_RxMsgObj, true);

    /*
     * Rx buffer is word aligned; we need to convert can msg in a word
     * aligned array
     * */
    __byte((int*)&pu16_DestDataBuffer[0], 0 ) = pu16_TmpRxBuffer[0];
    __byte((int*)&pu16_DestDataBuffer[0], 1 ) = pu16_TmpRxBuffer[1];
    __byte((int*)&pu16_DestDataBuffer[1], 0 ) = pu16_TmpRxBuffer[2];
    __byte((int*)&pu16_DestDataBuffer[1], 1 ) = pu16_TmpRxBuffer[3];
    __byte((int*)&pu16_DestDataBuffer[2], 0 ) = pu16_TmpRxBuffer[4];
    __byte((int*)&pu16_DestDataBuffer[2], 1 ) = pu16_TmpRxBuffer[5];
    __byte((int*)&pu16_DestDataBuffer[3], 0 ) = pu16_TmpRxBuffer[6];
    __byte((int*)&pu16_DestDataBuffer[3], 1 ) = pu16_TmpRxBuffer[7];


    *pu32_DestMsgId = gst_Can[e_CanPort].st_RxMsgObj->ui32MsgID;
    *pu32_MsgLen = gst_Can[e_CanPort].st_RxMsgObj->ui32MsgLen;
//#endif
}
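
The flag scan in CAN_Rx_Orion() leaves u16_Mailbox pointing at the last FIFO mailbox when no NEWDAT flag is set, so that mailbox is read even if nothing is pending. The caller checks u16_CAN_RxAvailable() first, so this is only a defensive concern. A minimal sketch of a guarded scan follows; the helper name first_pending_mailbox and the value of CAN_RX_START_ADDR are assumptions for illustration (the thread only states that the queue is 14 messages deep), and nothing here comes from the TI driverlib.

```c
#include <stdint.h>

/* Hypothetical values: the thread gives a 14-message queue but does not
 * state the number of the first receive mailbox. */
#define CAN_RX_START_ADDR 1U
#define CAN_RX_FIFO_LEN   14U

/* Return the mailbox number of the lowest set NEWDAT flag, or 0 when no
 * flag is set. As in the posted loop, bit i of newdat_flags is assumed to
 * correspond to mailbox CAN_RX_START_ADDR + i. */
static uint16_t first_pending_mailbox(uint32_t newdat_flags)
{
    uint16_t i;

    for (i = 0U; i < CAN_RX_FIFO_LEN; i++) {
        /* 1UL keeps the shift 32 bits wide even where int is 16 bits. */
        if ((newdat_flags & (1UL << i)) != 0UL) {
            return (uint16_t)(CAN_RX_START_ADDR + i);
        }
    }
    return 0U; /* message objects are numbered from 1, so 0 means "none" */
}
```

A caller could then skip the CANMessageGet() call whenever the helper returns 0 instead of reading a mailbox that holds no new data.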

0 Hareesh Janakiraman over 3 years ago in reply to Andrea Marcianesi38

TI__Guru* 93985 points

In CPU1, we raise a pin before entering the FIFO polling routine, and then we lower the pin when we exit the routine, after emptying the FIFO.

I presume the yellow signal in the first waveform is the GPIO pin that is toggled before entering the FIFO and exiting the FIFO. Not only is the width of the pulse different in the capture (I see 3 different pulse widths), I see the toggles occurring at different rates, the smallest interval being <200 uS.

I tried also to lower the can rate to 1Msps, but the problem is still there, we lose a message from Cortex.

I presume you meant to say 500 kbps, not 1 Msps.

If we poll the FIFO while we are receiving a sequence of messages, as in the situation of the picture I attached yesterday, we have the problem of losing one message.

This is an interesting/important observation.

Since the Cortex sends these 4 messages every 500 ms at 1 Mbps, we basically do not poll the FIFO while we are receiving. In this way we never lost a message.

Since communication from Cortex happens asynchronously, how do you ensure that your code absolutely does not poll the FIFO when the transmission is ongoing?

Also, you were meaning to replace polling with interrupts. Assuming you generate an interrupt when the last mailbox of the FIFO is filled and service it before the commencement of the next transmission from cortex, wouldn’t that avoid this situation?

0 Andrea Marcianesi38 over 3 years ago in reply to Hareesh Janakiraman

Intellectual 355 points

Good morning Hareesh,

My answers follow.

"I presume the yellow signal in the first waveform is the GPIO pin that is toggled before entering the FIFO and exiting the FIFO. Not only is the width of the pulse different in the capture (I see 3 different pulse widths), I see the toggles occurring at different rates, the smallest interval being <200 uS. "

Yes, as I wrote in my previous e-mail: "The yellow line represents the polling of the CAN FIFO. As you can see, we poll the FIFO fast enough, but unfortunately the problem is still there: we lose a message in the FIFO, so the DSP doesn’t answer the sequence of four messages, the Cortex times out and stops sending messages, and we can freeze the FIFO state on the DSP so that we can analyze it."

I presume you meant to say 500 kbps, not 1 Msps.

Yes.

Since communication from Cortex happens asynchronously, how do you ensure that your code absolutely does not poll the FIFO when the transmission is ongoing?

Of course I cannot be sure, but there is no possibility that I poll the FIFO twice during a stream of 4 messages that is 500 µs long. In this case we never had a fault all day. In the first picture I attached, you can see that we poll the FIFO at least twice while the messages are streaming, and the problem appears after a few seconds. In my opinion this means that the problem is in the way we manage the FIFO, not in the CAN baud rate configuration, a physical-layer error, or anything else.
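
The 500 µs figure for a four-message stream can be sanity-checked from the frame format. The sketch below is a rough estimate that assumes classical CAN data frames with 11-bit identifiers and 8 data bytes sent back to back; the thread does not state the identifier format or the payload length, so both are assumptions. Under them, one frame occupies 47 + 8n bit times including the 3-bit interframe space (n = data bytes), plus at most floor((34 + 8n - 1) / 4) stuff bits. The function names are hypothetical.

```c
#include <stdint.h>

/* Bit times for one classical CAN data frame with an 11-bit identifier,
 * including the 3-bit interframe space, without any stuff bits. */
static uint32_t can_frame_bits_nominal(uint32_t data_bytes)
{
    return 47UL + 8UL * data_bytes;
}

/* Same frame with the worst-case number of stuff bits. Only the span from
 * start-of-frame through the CRC (34 + 8n bits) is subject to stuffing. */
static uint32_t can_frame_bits_worst(uint32_t data_bytes)
{
    uint32_t stuffable = 34UL + 8UL * data_bytes;

    return can_frame_bits_nominal(data_bytes) + (stuffable - 1UL) / 4UL;
}

/* Duration in microseconds of a number of back-to-back frames. */
static uint32_t burst_us(uint32_t frames, uint32_t bits_per_frame,
                         uint32_t bit_rate_bps)
{
    return (uint32_t)(((uint64_t)frames * bits_per_frame * 1000000ULL)
                      / bit_rate_bps);
}
```

With these assumptions, burst_us(4, can_frame_bits_nominal(8), 1000000) gives 444 µs and burst_us(4, can_frame_bits_worst(8), 1000000) gives 540 µs, which brackets the roughly 500 µs quoted for the stream.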

We had decided to try polling the FIFO inside a timer interrupt. We do have a 10 kHz timer interrupt running on CPU1, but that idea was for the case in which we were not sure of the rate of the polling routine in the main loop. As you can see in the first picture, there is no need to do this: when the error occurs, I am polling the FIFO fast enough. The FIFO length seen by the software never exceeds 3 received messages, so I am sure that I never had an overwritten message. Moreover, the physical layer shows no error: I sent you the error messages and they are clean. So for us the test of polling the FIFO in an interrupt is meaningless.

We could try to generate an interrupt on FIFO full, but I see several problems with that:

1) The DSP would answer only when the FIFO is full. We have protocol messages that are 1, 2, or 4 CAN messages long, so it is difficult to decide what the right length would be. Suppose we fix the length at 4 CAN messages: the DSP will not answer until the FIFO holds 4, so if the Cortex transmits a protocol message that is 2 CAN messages long, the DSP will not answer?

2) Introducing a communication interrupt is impossible for us: I already told you that the interrupt is dedicated to the control, which needs to be accurately scheduled. We never use an interrupt for communication purposes.

3) I ask you the same question you asked me: since communication from the Cortex happens asynchronously, how do you ensure that your code absolutely services the interrupt before the commencement of the next transmission from the Cortex? There is no handshake between Cortex and DSP.

4) Suppose you set everything up fine and the software works: what next? Can you then understand where the problem is? I have already provided you with a simple configuration that lets us understand where the problem is. If you think there is no possibility of solving this problem with the current solution (we already have 4 products using it), we will have to change the protocol.

I would like to avoid having to dig into the function CANMessageGet in the Texas Instruments driverlib, but that will be my next step.

I looked at the latest Texas Instruments driverlib release to compare the CANMessageGet code that we use with the most recent version. What surprised me was that, when we started using driverlib, the code was under the folder

C:\ti\controlSUITE\device_support\F2837xD\v210\F2837xD_common\driverlib,

now it is under

C:\ti\c2000\C2000Ware_3_02_00_00\device_support\f2837xd\common\deprecated\driverlib

The implementation of the function CANMessageGet() didn’t change, but could you please explain what “deprecated” means?

Thank you for your support,

Regards,

Andrea Marcianesi.

P.S. Just another question: did you find out how to read the CAN registers in the register view or in the memory map of the CCS 8.3.1 debugger? I would like to know, thank you.

0 Christopher Chiarella over 3 years ago in reply to Andrea Marcianesi38

TI__Mastermind 30255 points

Andrea

Regarding your question on the deprecated driverlib, this means we are no longer performing updates to those drivers and they aren't recommended for any new development. We have a new driver library which we are supporting going forward located at: C:\ti\c2000\C2000Ware_3_02_00_00\driverlib\f2837xd

Best regards

Chris

0 Hareesh Janakiraman over 3 years ago in reply to Andrea Marcianesi38

TI__Guru* 93985 points

Andrea,

We are looking at the Driverlib function in closer detail. We can discuss the below in the next call, since that would be faster.

Why are the pulse widths different in the first image?

1st pulse - This is extremely narrow. I take it that this is because the FIFO was empty, so there was nothing to read and the code exited the FIFO reading routine very quickly.

2nd and 3rd pulses - Why are they so wide, since presumably the FIFO is still empty here? And why is the FIFO read initiated so close to the previous read? i.e. < 200 uS? Who/what initiates the FIFO reads?

4th pulse - Is this the correct pulse width when the FIFO is actually read?

Of course I cannot be sure, but there is no possibility that I poll the FIFO twice during a stream of 4 messages that is 500 µs long. In this case we never had a fault all day. In the first picture I attached, you can see that we poll the FIFO at least twice while the messages are streaming, and the problem appears after a few seconds.

Does this mean there is a problem only if you poll *twice*? i.e. there is no problem if you poll once when the FIFO is being filled?

This in my opinion would mean that the problem is in the way we manage the fifo, not in CAN baud rate configuration, or physical layer error or whatever.

Completely agree.

So for us the test of polling the FIFO in an interrupt is meaningless.

Is the FIFO always read and emptied 4 messages at a time? i.e. when you poll the FIFO, do you decide to read and empty it only after all 4 frames are received? Or do you read whatever frames(s) have been received at that point?

The idea of using interrupts is to ensure the FIFO is not being read when messages are being received.

1) The DSP would answer only when the FIFO is full. We have protocol messages that are 1, 2, or 4 CAN messages long, so it is difficult to decide what the right length would be. Suppose we fix the length at 4 CAN messages: the DSP will not answer until the FIFO holds 4, so if the Cortex transmits a protocol message that is 2 CAN messages long, the DSP will not answer?

OK, so Cortex can transmit data in 1, 2 or 4 frames? I was under the assumption that it is always 4 frames at a time. If you configure the FIFO length to be 4 and only 2 messages have arrived, the CPU indeed will not get alerted.

3) I ask you the same question you asked me: since communication from the Cortex happens asynchronously, how do you ensure that your code absolutely services the interrupt before the commencement of the next transmission from the Cortex? There is no handshake between Cortex and DSP.

The application should ensure the interrupt is serviced before the arrival of the next set of frames. Since you already clarified that the frame length is not constant, you either have to generate an interrupt for every frame or simply poll as you are doing now.

did you find out how to read the CAN registers in the register view or in the memory map of the CCS 8.3.1 debugger?

I did not. I only have CCS versions 9.3.0 and 10.0.0 on my PC.

0 Hareesh Janakiraman over 3 years ago in reply to Hareesh Janakiraman

TI__Guru* 93985 points

Just to close out this thread: (debug was taken offline)

The issue was with the CANMessageGet() function in the can.c file in the C:\ti\controlSUITE\device_support\F2837xD\v210\F2837xD_common\driverlib directory.

The last argument in that function was set to "true". This had the effect of clearing the interrupt (which itself was not needed, since the application used polling) and also the NEWDAT bit. Since the NEWDAT bit was cleared, a new frame was copied into that mailbox before the mailbox was read. The RxMsgLst bit was not set either. Making the last argument of the function "false" resolved the issue.
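
A minimal sketch of that one-argument change, assuming the deprecated driverlib prototype takes the base address, the message object number, a pointer to the message object, and a final boolean that requests clearing of the pending interrupt. The typedef and the CANMessageGet body below are reduced stand-ins so the fragment compiles off-target; on the device both come from the driverlib can.h and can.c. The parameter names, the wrapper name can_read_mailbox_polled, and the recording variable are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced stand-in for the driverlib message object; only the fields used
 * earlier in this thread are listed. */
typedef struct {
    uint32_t ui32MsgID;
    uint32_t ui32MsgLen;
    unsigned char *pucMsgData;
} tCANMsgObject;

/* Records the flag passed to the stub so the call can be checked. */
static bool g_last_clr_arg;

/* Off-target stub standing in for the real driverlib function. */
static void CANMessageGet(uint32_t ui32Base, uint32_t ui32ObjID,
                          tCANMsgObject *pMsgObject, bool bClrPendingInt)
{
    (void)ui32Base;
    (void)ui32ObjID;
    (void)pMsgObject;
    g_last_clr_arg = bClrPendingInt;
}

/* Hypothetical wrapper: read one mailbox in a polled (non-interrupt) design. */
static void can_read_mailbox_polled(uint32_t base, uint16_t mailbox,
                                    tCANMsgObject *msg, unsigned char *buf)
{
    msg->pucMsgData = buf;
    /* Last argument is false, following the resolution described above. */
    CANMessageGet(base, mailbox, msg, false);
}
```

In the code posted earlier in the thread, the equivalent change is the final argument of the CANMessageGet() call inside CAN_Rx_Orion().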

0 Andrea Marcianesi38 over 3 years ago in reply to Hareesh Janakiraman

Intellectual 355 points

This resolved my ticket.

Thank you Hareesh for your support.

Regards,

Andrea Marcianesi
