Linux/AM3352: DCAN issue with BUS-OFF

Part Number: AM3352

Tool/software: Linux

I'm facing an issue with DCAN under Linux in bad network conditions (long cables, poor connectors or termination resistors). Usually, with the automatic restart option of the Linux network subsystem (restart-ms), the DCAN restarts automatically after a BUS-OFF. But after some minutes or hours of toggling between the ACTIVE, PASSIVE and BUS-OFF states, CAN stops and no longer communicates. When I inspect the network subsystem, Linux reports the DCAN as being in the PASSIVE state:

ip -details -statistics link show can0
2: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 50
    link/can promiscuity 0
    can <TRIPLE-SAMPLING> state ERROR-PASSIVE (berr-counter tx 248 rx 5) restart-ms 500
          bitrate 250000 sample-point 0.875
          tq 250 prop-seg 4 phase-seg1 9 phase-seg2 2 sjw 1
          c_can: tseg1 2..16 tseg2 1..8 sjw 1..4 brp 1..1024 brp-inc 1
          clock 24000000
          re-started bus-errors arbit-lost error-warn error-pass bus-off
          123        0          0          131        135        123
    RX: bytes packets errors dropped overrun mcast
    342850    71830   1      0       1       0
    TX: bytes packets errors dropped carrier collsns
    30738     17049   0      1807    0       0


But the DCAN controller itself is in the BUS-OFF state (Error and Status register, 0x481D0004: 0x000000E4). The CTRL register shows that the INIT flag is set (CTRL register, 0x481D0000: 0x0000000F). I guess this is normal in BUS-OFF conditions.

To me it looks as if Linux somehow misses a state change interrupt for BUS-OFF, and that is why Linux doesn't restart CAN. But this is just an assumption.

Linux: ti-lsk-linux-4.1.y

Some hints would be appreciated!

Thanks

Anton

  • Hi,

    "ti-lsk-linux-4.1.y" - what Linux version is this? Where did you get it from?
  • Hi Biser

    The repository is from git.ti.com/ti-linux-kernel/ti-linux-kernel.git. The branch we use is ti-lsk-linux-4.1.y

    Regards
    Anton
  • We are a step further.
    If the CAN controller is in a situation where it stays in BUS-OFF (ES: 0x481D0004: 0x000000E4) and Linux wasn't informed about it, then the EIE, SIE, IE0 and Init flags are all set in the CTL register (0x481D0000 => 0x0000000F).
    But as far as I understand, this must not happen, since the interrupt handler should disable EIE, SIE and IE0 immediately. To me it looks like the BUS-OFF occurs while Linux is handling the passive state in c_can_poll, while the interrupts are disabled. At the end of c_can_poll the interrupts are enabled again, but the BUS-OFF has been missed. (INT: 0x481D0010: 0x00000000, INTPND_X: 0x481D00AC: 0x00000000)

    Because we don't see any fix for this in the upstream source code, we created a workaround that overcomes this situation. After the interrupts are enabled, we check whether the CAN controller's INIT flag is set. If it is, we trigger the NAPI job like the ISR does:

    static void c_can_irq_control(struct c_can_priv *priv, bool enable)
    {
        u32 ctrl = priv->read_reg(priv, C_CAN_CTRL_REG) & ~CONTROL_IRQMSK;

        if (enable)
            ctrl |= CONTROL_IRQMSK;

        priv->write_reg(priv, C_CAN_CTRL_REG, ctrl);

    +   /* detect if interrupts are stalled */
    +   ctrl = priv->read_reg(priv, C_CAN_CTRL_REG);
    +   const u32 STALLED_MASK = CONTROL_INIT | CONTROL_IRQMSK;
    +   if ((ctrl & STALLED_MASK) == STALLED_MASK)
    +   {
    +       printk(KERN_ERR "CAN interrupts stalled. Scheduling NAPI.\n");
    +       /* handle STATUS_BOFF */
    +       /* disable all interrupts and schedule the NAPI */
    +
    +       c_can_irq_control(priv, false);
    +       napi_schedule(&priv->napi);
    +   }

    }

    I'm wondering if somebody has had a similar experience with the D_CAN implementation, or if somebody can make further suggestions?
    Best Regards
    Anton