TMS320F28375D: Missing CAN frames in Tx

Part Number: TMS320F28375D

I previously had a CAN Tx implementation using two mailboxes (17 and 18). There are about 18 to 36 nodes on the CAN bus, and each node sends 3 messages per second.

Unfortunately, I’m seeing missing CAN frames on the bus. At times, one of the CAN nodes goes completely silent for more than 30 seconds. This is not a bus‑off situation—I checked the error counters, and they remain at 0.

In my Tx implementation, three messages (with different IDs) are buffered in a software queue and then moved to the IFCMD registers after checking whether any of the mailboxes are free.

My debugging shows that the mailboxes are free, but after a frame is written through the IFCMD registers, some messages still do not appear on the CAN bus.

Could you help me understand how to debug this further?

I also made a new change where I intentionally invalidate the existing mailbox before sending a new message through the IFCMD registers. This significantly improved performance, and now I see only very few missing CAN frames.

Please refer to the code below:

void can_ll_send_message(uint32_t ui32ObjID, tCANMsgObject *pMsgObject)
{
    uint32_t ui32CmdMaskReg;
    uint32_t ui32MaskReg;
    uint32_t ui32ArbReg;
    uint32_t ui32MsgCtrl;
    bool bTransferData;

    bTransferData = false;

    // Wait for busy bit to clear
    while(HWREGH(CAN_BASE + CAN_O_IF1CMD) & CAN_IF1CMD_BUSY)
    {
    }

    
    // New START 
    // Clear the MsgVal bit in the target mailbox first.
    HWREGH(CAN_BASE + CAN_O_IF1ARB) = 0;
    HWREGH(CAN_BASE + CAN_O_IF1ARB + 2) = 0;
    
    // Send only the arbitration clear command to the mailbox.
    HWREG_BP(CAN_BASE + CAN_O_IF1CMD) = CAN_IF1CMD_DIR | CAN_IF1CMD_ARB | (ui32ObjID & CAN_IF1CMD_MSG_NUM_M);

    // Wait for this clear command to finish.
    while(HWREGH(CAN_BASE + CAN_O_IF1CMD) & CAN_IF1CMD_BUSY)
    {
    }
    // New END

    // This is always a write to the Message object as this call is setting a
    // message object.  This call will also always set all size bits so it sets
    // both data bits.  The call will use the CONTROL register to set control
    // bits so this bit needs to be set as well.
    ui32CmdMaskReg = (CAN_IF1CMD_DIR | CAN_IF1CMD_DATA_A | CAN_IF1CMD_DATA_B |
                      CAN_IF1CMD_CONTROL);

    // Initialize the values to a known state before filling them in based on
    // the type of message object that is being configured.
    ui32ArbReg = 0;
    ui32MsgCtrl = 0;
    ui32MaskReg = 0;

    // Set the TXRQST bit so the frame is queued for transmission, and mark
    // the object as a transmit object (DIR = 1).
    ui32CmdMaskReg |= CAN_IF1CMD_TXRQST;
    ui32MsgCtrl |= CAN_IF1MCTL_TXRQST | CAN_IF1MCTL_NEWDAT;
    ui32ArbReg = CAN_IF1ARB_DIR;
    bTransferData = true;


    // Set the Arb bit so that this gets transferred to the Message object.
    ui32CmdMaskReg |= CAN_IF1CMD_ARB;

    // Configure the Arbitration registers.
    // Set the 29 bit version of the Identifier for this message object.
    // Mark the message as valid and set the extended ID bit.
    ui32ArbReg |= (pMsgObject->ui32MsgID & CAN_IF1ARB_ID_M) |
                  CAN_IF1ARB_MSGVAL | CAN_IF1ARB_XTD;

    // Set the data length since this is set for all transfers.
    ui32MsgCtrl |= (pMsgObject->ui32MsgLen & CAN_IF1MCTL_DLC_M);

    // Set EOB (end of block) unless the caller flagged this object as a
    // non-final member of a FIFO. A single, non-FIFO mailbox always gets EOB.
    if((pMsgObject->ui32Flags & CAN_MSG_OBJ_FIFO) == 0)
    {
        ui32MsgCtrl |= CAN_IF1MCTL_EOB;
    }


    // Write the data out to the CAN Data registers if needed.
    if(bTransferData)
    {
        CANDataRegWrite(pMsgObject->pucMsgData,
                        (uint32_t *)(CAN_BASE + CAN_O_IF1DATA),
                        pMsgObject->ui32MsgLen);
    }

    // Write out the registers to program the message object.

    HWREGH(CAN_BASE + CAN_O_IF1MSK) = ui32MaskReg & CAN_REG_WORD_MASK;
    HWREGH(CAN_BASE + CAN_O_IF1MSK + 2) = ui32MaskReg >> 16;

    HWREGH(CAN_BASE + CAN_O_IF1ARB) = ui32ArbReg & CAN_REG_WORD_MASK;
    HWREGH(CAN_BASE + CAN_O_IF1ARB + 2) = ui32ArbReg >> 16;

    HWREGH(CAN_BASE + CAN_O_IF1MCTL) = ui32MsgCtrl & CAN_REG_WORD_MASK;

    // Transfer the IF1 contents to the message object specified by ui32ObjID.
    HWREG_BP(CAN_BASE + CAN_O_IF1CMD) = ui32CmdMaskReg | (ui32ObjID & CAN_IF1CMD_MSG_NUM_M);


    // Wait for busy bit to clear
    while(HWREGH(CAN_BASE + CAN_O_IF1CMD) & CAN_IF1CMD_BUSY)
    {
    }

    return;
}


It would be great if you could help explain what might be happening.
Note: I don't want to use Tx interrupts because of my application constraints.

Thanks,
Reeno Joseph

  • Hi Joseph,

    When you write to the IF registers to update a message object that currently has its TXRQST bit set in the Message RAM, a race condition occurs between the CPU Interface and the Message Handler.

    The Mechanism of the "Silent" Failure
    1. The Overwrite: The CPU writes a new message to the IF1 registers and triggers the transfer to Message RAM.
    2. The Collision: Simultaneously, the Message Handler is trying to "read" that same mailbox because the previous TXRQST was already active.
    3. The Desync: The Message Handler sees the MsgVal bit is still high but the data/control bits are in flux. It can get stuck in a state where it believes the mailbox is "in use" by the CPU, so it skips it for the current arbitration cycle. However, because the TXRQST bit was successfully written into RAM by the CPU, the hardware thinks a request is pending and won't naturally clear it.
    4. The Result: The mailbox is "Locked." TXRQST is set in RAM, but the Message Handler has "lost the scent" and will never pick it up to put it on the wire. This explains why your node goes silent for 30 seconds—it’s waiting for a "Done" signal that will never come.

    However, I noticed that you now clear IF1ARB (MsgVal = 0) before sending data to the mailbox. This resets the mailbox and should resolve the issue.

    When the node goes silent, read the CAN_MSGVLD and CAN_TXRQ registers. If 17 and 18 are "Valid" and "Requesting" but never sending, you have a hardware arbitration lockup.
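
The check described above can be sketched as a small decode helper that runs on 32-bit snapshots of the two registers. This is only an illustration: the helper names are hypothetical, and it assumes that bit (n - 1) of each snapshot corresponds to mailbox n, which should be confirmed against the device reference manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit (n - 1) of a 32-bit snapshot belongs to mailbox n,
 * so one register covers mailboxes 1..32. */
static uint32_t mbox_bit(unsigned mbox)
{
    assert(mbox >= 1u && mbox <= 32u);
    return (uint32_t)1u << (mbox - 1u);
}

/* Hypothetical helper: true when the mailbox is marked valid AND still
 * has a transmission request pending in the given snapshots. */
static bool mbox_tx_pending(uint32_t msgvld, uint32_t txrq, unsigned mbox)
{
    const uint32_t bit = mbox_bit(mbox);
    return ((msgvld & bit) != 0u) && ((txrq & bit) != 0u);
}

/* Hypothetical helper: mailboxes that are valid and were requesting in
 * BOTH snapshots. If the snapshots are taken far enough apart and no new
 * frame was loaded in between, a set bit suggests a request that is not
 * being serviced. */
static uint32_t stuck_mask(uint32_t msgvld, uint32_t txrq_then,
                           uint32_t txrq_now)
{
    return msgvld & txrq_then & txrq_now;
}
```

For example, with both mailboxes valid (0x00030000) and only mailbox 17 requesting in two consecutive snapshots, `stuck_mask` returns 0x00010000, which points at mailbox 17.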

  • Thanks, QJ Wang, for the explanation.

    In the shared code, the last while loop was a recent update I added. Our buggy implementation was pushing new CAN frames to the IF registers and then immediately (within another function) checking CAN_TXRQ_21. Later, I realized that we need to check CAN_IF1CMD.BUSY before reading CAN_TXRQ_21.
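
A minimal sketch of that ordering, written so it can run off-target: the register reads go through hypothetical callbacks, and the fake device is a simplified model in which the TXRQ bit only becomes visible once the interface transfer has finished. The BUSY position (bit 15 of IF1CMD) is an assumption here; on the target the driverlib `CAN_IF1CMD_BUSY` definition should be used instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_IF1CMD_BUSY 0x8000u  /* assumption: BUSY is bit 15 */

/* Hypothetical register accessors. On hardware these would wrap the
 * reads of IF1CMD and CAN_TXRQ_21. */
typedef struct {
    uint32_t (*read_if1cmd)(void *ctx);
    uint32_t (*read_txrq21)(void *ctx);
    void *ctx;
} can_regs_t;

/* Returns true only if BUSY cleared within max_polls busy reads AND the
 * mailbox has no transmission request pending afterwards. */
static bool mbox_free_after_transfer(const can_regs_t *r, unsigned mbox,
                                     unsigned max_polls)
{
    assert(mbox >= 1u && mbox <= 32u);
    while ((r->read_if1cmd(r->ctx) & SKETCH_IF1CMD_BUSY) != 0u) {
        if (max_polls-- == 0u) {
            return false;  /* transfer still running: not known to be free */
        }
    }
    return (r->read_txrq21(r->ctx) & ((uint32_t)1u << (mbox - 1u))) == 0u;
}

/* Simplified off-target model: BUSY stays set for a number of reads, and
 * the TXRQ bits appear only after BUSY has cleared. */
typedef struct {
    unsigned busy_reads_left;
    uint32_t txrq_after;
} fake_can_t;

static uint32_t fake_read_if1cmd(void *ctx)
{
    fake_can_t *f = (fake_can_t *)ctx;
    if (f->busy_reads_left > 0u) {
        f->busy_reads_left--;
        return SKETCH_IF1CMD_BUSY;
    }
    return 0u;
}

static uint32_t fake_read_txrq21(void *ctx)
{
    const fake_can_t *f = (const fake_can_t *)ctx;
    return (f->busy_reads_left > 0u) ? 0u : f->txrq_after;
}
```

In this model, sampling TXRQ while BUSY is still set reports the mailbox as free even though a request is about to land in it, which is the mistake described above. The bounded poll count is a design choice of the sketch, not something taken from the original code, which waits without a limit.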

    In our implementation, before the actual transmission was completed, we assumed the mailbox was free and pushed a new CAN frame into it. As you explained, this can cause a lock inside the CAN core.
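
One way to avoid that overwrite is to leave a frame in the software queue until one of the two Tx mailboxes really has no request pending. The following is a rough sketch only: the queue type, its depth, and the function names are hypothetical, and the TXRQ snapshot is assumed to use bit (n - 1) for mailbox n and to have been read after IF1CMD.BUSY cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TXQ_CAPACITY 8u  /* hypothetical queue depth */

typedef struct {
    uint32_t id;
    unsigned len;
    unsigned char data[8];
} tx_frame_t;

typedef struct {
    tx_frame_t slots[TXQ_CAPACITY];
    unsigned head;   /* index of the oldest frame */
    unsigned count;  /* number of queued frames */
} tx_queue_t;

/* Appends a frame; returns false when the queue is full. */
static bool txq_push(tx_queue_t *q, const tx_frame_t *f)
{
    assert(q && f);
    if (q->count == TXQ_CAPACITY) {
        return false;
    }
    q->slots[(q->head + q->count) % TXQ_CAPACITY] = *f;
    q->count++;
    return true;
}

/* Returns the first Tx mailbox (17 or 18) whose TXRQ bit is clear,
 * or 0 when both still have a request pending. */
static unsigned pick_free_mbox(uint32_t txrq21)
{
    static const unsigned tx_mboxes[] = { 17u, 18u };
    unsigned i;
    for (i = 0u; i < sizeof tx_mboxes / sizeof tx_mboxes[0]; i++) {
        if ((txrq21 & ((uint32_t)1u << (tx_mboxes[i] - 1u))) == 0u) {
            return tx_mboxes[i];
        }
    }
    return 0u;
}

/* Pops the oldest frame into *out only when a mailbox is free.
 * Returns the mailbox to load, or 0 if nothing was dequeued. */
static unsigned txq_dispatch(tx_queue_t *q, uint32_t txrq21, tx_frame_t *out)
{
    unsigned mbox;
    assert(q && out);
    if (q->count == 0u) {
        return 0u;
    }
    mbox = pick_free_mbox(txrq21);
    if (mbox == 0u) {
        return 0u;  /* both mailboxes pending: keep the frame queued */
    }
    *out = q->slots[q->head];
    q->head = (q->head + 1u) % TXQ_CAPACITY;
    q->count--;
    return mbox;
}
```

The returned mailbox number and frame would then be passed to a send routine such as `can_ll_send_message`. A return value of 0 means the frame stays queued and the caller tries again on the next pass.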

    I'm planning to keep the mailbox invalidation logic, as it will force the CAN message handler to perform a fresh sweep.


    Thanks,
    Reeno Joseph