TCAN4550-Q1: SPIERR when no CAN ACK. TXFQS resets after SPIERR.

Part Number: TCAN4550-Q1

Hi,

The TCAN in my system works fine as long as there is an ACK on the CAN bus. If there is no ACK, I get SPIERR. Sometimes it happens after hundreds of milliseconds, sometimes after about a minute, but it never happens when there is an ACK on the bus. Strangely, it seems to depend on how long the system has been running. For example, after a couple of hours the error first occurs after about a minute, and then with every retry it occurs sooner (sometimes within hundreds of milliseconds).

  • I use SPI CLK set at 16 MHz, but I have tried at 8, 4, 2 MHz and nothing changed.
  • 20 MHz crystal with 10 pF. I have also tried 18 pF and 40 MHz and nothing changed.
  • When I read register 0x0008, I get 0x00110201.
  • No other node sends any messages to the TCAN (so the TCAN RX is empty).
  • I clear M_RAM at the beginning.
  • My M_RAM does not overlap and is configured properly. (Rx0NumElements 64, TxBufferNumElements 32, all other elements size 0, all ElementSize 8 byte data, TX as FIFO)
  • The error only happens if DAR is set to 0. When I set it to 1 SPIERR does not occur. I need DAR to be 0.
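The M_RAM layout described in the list above can be checked with a little arithmetic. The following is a minimal sketch, assuming Rx FIFO 0 is placed at the start of the message RAM and the Tx buffers directly after it, that each element carries an 8-byte header plus its data field, and that the message RAM spans 2 KB starting at 0x8000. The type and function names are made up for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

#define MRAM_BASE 0x8000u  /* assumed start of the message RAM in SPI address space */
#define MRAM_SIZE 2048u    /* assumed message RAM size in bytes */

/* Bytes per Rx/Tx element: 8-byte header plus the data field. */
static uint32_t element_bytes(uint32_t data_bytes)
{
    return 8u + data_bytes;
}

typedef struct {
    uint32_t rx0_start;  /* first byte of Rx FIFO 0 */
    uint32_t tx_start;   /* first byte of the Tx buffers */
    uint32_t end;        /* one past the last used byte */
} mram_layout_t;

/* Hypothetical helper: Rx FIFO 0 first, Tx buffers directly after it. */
static mram_layout_t mram_layout(uint32_t rx0_elems, uint32_t tx_elems,
                                 uint32_t data_bytes)
{
    mram_layout_t l;
    l.rx0_start = MRAM_BASE;
    l.tx_start  = l.rx0_start + rx0_elems * element_bytes(data_bytes);
    l.end       = l.tx_start + tx_elems * element_bytes(data_bytes);
    return l;
}

/* True when the whole layout stays inside the message RAM. */
static bool mram_fits(const mram_layout_t *l)
{
    return l->end <= MRAM_BASE + MRAM_SIZE;
}
```

With 64 Rx elements, 32 Tx elements and 8 data bytes per element, `mram_layout(64, 32, 8)` places the Tx buffers at 0x8400 and ends at 0x8600, which matches the Tx start address used in the pseudocode in this post.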

// TxRoutine Pseudo Code

// Step 1: Read the Transmit FIFO/Queue Status register
TXFQS = readRegister(0x10C4) // TXFQS: Transmit FIFO/Queue Status

// Step 2: Check that the Tx FIFO is not full and the free level is greater than zero
if (TXFQS.TFQF == 0 && TXFQS.TFFL > 0) {

    // Step 3: Calculate the base address to write the message
    // 0x8400 is the start address of the Tx FIFO (after RX0)
    // Each Tx buffer is 16 bytes
    address = 0x8400 + (16 * TXFQS.TFQPI)

    // Step 4: Write the message to the calculated Tx buffer address
    write(address, 4 * sizeof(u32), msgBuff)

    // Step 5: Build the TxBar value
    tmpTxBar = 1 << TXFQS.TFQPI // Set bit corresponding to Tx FIFO Put Index

    // Step 6: Trigger the transmission by TxBar
    write(0x10D0, sizeof(u32), tmpTxBar)
}

As you can see, I do not add messages if the FIFO is full, which it will be when there is no ACK.
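For reference, the field extraction behind the Tx routine can be written out in C. This is a minimal sketch, assuming the standard M_CAN TXFQS layout (TFFL in bits 5:0, TFGI in bits 12:8, TFQPI in bits 20:16, TFQF in bit 21) and the 0x8400 base and 16-byte element size from the pseudocode. The struct and function names are illustrative only, and the SPI access itself is left out.

```c
#include <stdint.h>
#include <stdbool.h>

#define TX_FIFO_BASE     0x8400u  /* Tx FIFO start address from the post */
#define TX_ELEMENT_BYTES 16u      /* 8-byte header + 8 data bytes */

typedef struct {
    uint8_t tffl;   /* bits 5:0   Tx FIFO free level */
    uint8_t tfgi;   /* bits 12:8  Tx FIFO get index */
    uint8_t tfqpi;  /* bits 20:16 Tx FIFO/queue put index */
    bool    tfqf;   /* bit 21     Tx FIFO/queue full */
} txfqs_t;

/* Split a raw TXFQS register value into its fields. */
static txfqs_t txfqs_decode(uint32_t raw)
{
    txfqs_t s;
    s.tffl  = (uint8_t)(raw & 0x3Fu);
    s.tfgi  = (uint8_t)((raw >> 8) & 0x1Fu);
    s.tfqpi = (uint8_t)((raw >> 16) & 0x1Fu);
    s.tfqf  = ((raw >> 21) & 1u) != 0;
    return s;
}

/* Step 2 of the pseudocode: room for one more element? */
static bool tx_can_enqueue(const txfqs_t *s)
{
    return !s->tfqf && s->tffl > 0;
}

/* Step 3: address of the element at the put index. */
static uint32_t tx_element_address(const txfqs_t *s)
{
    return TX_FIFO_BASE + TX_ELEMENT_BYTES * s->tfqpi;
}

/* Step 5: TXBAR bit for the element at the put index. */
static uint32_t txbar_mask(const txfqs_t *s)
{
    return (uint32_t)1u << s->tfqpi;
}
```

Keeping the decode separate from the SPI transfer makes it easy to log the raw TXFQS value alongside the decoded fields when the register later reads back as all zeros.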

So I add messages, TFFL decreases and TFQPI increases until TFFL is 0, TFQPI wraps back to 0, and TFQF is 1. From then on I only read TXFQS = readRegister(0x10C4).
(Although I sometimes saw TFGI increment when there was no ACK, so how was that possible?)
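The expected counter behavior can be reproduced with a small software model. This is only a sketch of a generic 32-element ring FIFO with made-up names, not a description of the M_CAN hardware internals. It assumes the put index equals the get index plus the fill level, modulo the FIFO size.

```c
#include <stdint.h>
#include <stdbool.h>

#define TX_FIFO_ELEMS 32u  /* TxBufferNumElements from the post */

/* Simplified software model of the Tx FIFO counters. */
typedef struct {
    uint8_t get;   /* models TFGI */
    uint8_t fill;  /* number of pending elements */
} tx_fifo_model_t;

static uint8_t model_tffl(const tx_fifo_model_t *m)
{
    return (uint8_t)(TX_FIFO_ELEMS - m->fill);
}

static uint8_t model_tfqpi(const tx_fifo_model_t *m)
{
    return (uint8_t)((m->get + m->fill) % TX_FIFO_ELEMS);
}

static bool model_tfqf(const tx_fifo_model_t *m)
{
    return m->fill == TX_FIFO_ELEMS;
}

/* Queue one element; refused when the FIFO is full. */
static bool model_put(tx_fifo_model_t *m)
{
    if (model_tfqf(m)) {
        return false;
    }
    m->fill++;
    return true;
}

/* One element leaves the FIFO; this is the only way the get index moves. */
static void model_element_done(tx_fifo_model_t *m)
{
    if (m->fill > 0) {
        m->get = (uint8_t)((m->get + 1u) % TX_FIFO_ELEMS);
        m->fill--;
    }
}
```

Starting from an empty model, 32 calls to `model_put` give a free level of 0, a put index of 0 and the full flag set, which is the state described above. In this model the get index only advances when an element leaves the FIFO.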

And then after some time I get an interrupt: dev_ir 0x88 (SPIERR, GLOBALERR), mcan_ir 0x9810000 (TSW, EP, EW, PEA).
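The two interrupt values can be decoded with a small lookup table. This sketch only knows the flags named in this post, with bit positions taken from the quoted values (SPIERR bit 3 and GLOBALERR bit 7 in dev_ir; TSW bit 16, EP bit 23, EW bit 24 and PEA bit 27 in mcan_ir). The table and function names are illustrative, and a complete decoder would need the full register descriptions from the datasheet.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    uint32_t    mask;
    const char *name;
} flag_name_t;

/* Only the flags mentioned in the post; not a complete register map. */
static const flag_name_t DEV_IR_FLAGS[] = {
    { 1u << 3, "SPIERR" },
    { 1u << 7, "GLOBALERR" },
};

static const flag_name_t MCAN_IR_FLAGS[] = {
    { 1u << 16, "TSW" },
    { 1u << 23, "EP" },
    { 1u << 24, "EW" },
    { 1u << 27, "PEA" },
};

/* Writes the names of all set, known flags into out (space separated)
 * and returns the set bits that no table entry explains. */
static uint32_t flags_describe(uint32_t value, const flag_name_t *table,
                               size_t n, char *out, size_t out_len)
{
    uint32_t unknown = value;
    size_t used = 0;

    if (out_len > 0) {
        out[0] = '\0';
    }
    for (size_t i = 0; i < n; i++) {
        if ((value & table[i].mask) == 0) {
            continue;
        }
        unknown &= ~table[i].mask;
        if (used < out_len) {
            int w = snprintf(out + used, out_len - used, "%s%s",
                             used ? " " : "", table[i].name);
            if (w > 0) {
                used += (size_t)w;  /* past out_len on truncation: stop appending */
            }
        }
    }
    return unknown;
}
```

A non-zero return value points at bits the table does not cover, which is a useful prompt to look them up before clearing them.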

Then I handle SPIERR by first reading statusReg (0x000C), which is 0xA00000C (Read_fifo_empty, Internal_error_log_write, Internal_error_interrupt, Internal_access_active).

Then I read 0x0010 and it is 0x0. Then I write 0xFFFFFFFF to statusReg (0x000C). Next I read 0x000C and it is 0x8.
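The readback of 0x8 is consistent with a write-1-to-clear register in which one bit reflects live state. The sketch below models that behavior under two assumptions that should be checked against the datasheet: the bit positions are inferred from the value 0xA00000C and the four names quoted above, and bit 3 (Internal_access_active) is treated as a live status bit that ignores writes. The macro and function names are made up.

```c
#include <stdint.h>

/* Bit positions inferred from 0xA00000C and the names in the post. */
#define ST_INTERNAL_ERROR_INTERRUPT  (1u << 2)
#define ST_INTERNAL_ACCESS_ACTIVE    (1u << 3)
#define ST_READ_FIFO_EMPTY           (1u << 25)
#define ST_INTERNAL_ERROR_LOG_WRITE  (1u << 27)

/* Assumption: this bit shows live state and is not cleared by writes. */
#define ST_LIVE_BITS ST_INTERNAL_ACCESS_ACTIVE

/* Model of a write-1-to-clear register with some live status bits:
 * latched bits are cleared where a 1 is written, live bits are kept. */
static uint32_t status_after_w1c(uint32_t current, uint32_t written)
{
    uint32_t latched = current & ~ST_LIVE_BITS;
    uint32_t live    = current & ST_LIVE_BITS;
    return (latched & ~written) | live;
}
```

Under these assumptions, `status_after_w1c(0x0A00000C, 0xFFFFFFFF)` yields 0x8, the value read back after the clear.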

Next I read EcrReg (0x1040) to clear CEL (CAN Error Logging).

I also read PSR (0x1044) to see BO, EP, EW; it is equal to 0x7E5, so the BO (Bus Off) flag is set. Then I read CCCR (0x1018) and it is 0x1. Then I write 0x0 to CCCR (0x1018).
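The PSR value can be split into its fields to see more than the three flags. This is a minimal sketch, assuming the standard M_CAN PSR layout (LEC in bits 2:0, ACT in bits 4:3, EP bit 5, EW bit 6, BO bit 7, DLEC in bits 10:8) and the usual M_CAN names for the last error code values. The struct and function names are illustrative.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint8_t lec;   /* bits 2:0  last error code */
    uint8_t act;   /* bits 4:3  activity */
    bool    ep;    /* bit 5     error passive */
    bool    ew;    /* bit 6     error warning */
    bool    bo;    /* bit 7     bus off */
    uint8_t dlec;  /* bits 10:8 data phase last error code */
} psr_t;

/* Split a raw PSR register value into its fields. */
static psr_t psr_decode(uint32_t raw)
{
    psr_t p;
    p.lec  = (uint8_t)(raw & 0x7u);
    p.act  = (uint8_t)((raw >> 3) & 0x3u);
    p.ep   = ((raw >> 5) & 1u) != 0;
    p.ew   = ((raw >> 6) & 1u) != 0;
    p.bo   = ((raw >> 7) & 1u) != 0;
    p.dlec = (uint8_t)((raw >> 8) & 0x7u);
    return p;
}

/* Names of the LEC/DLEC codes as used in the M_CAN documentation. */
static const char *lec_name(uint8_t lec)
{
    static const char *const names[8] = {
        "NoError", "StuffError", "FormError", "AckError",
        "Bit1Error", "Bit0Error", "CRCError", "NoChange",
    };
    return names[lec & 0x7u];
}
```

For 0x7E5 this gives EP, EW and BO all set, LEC = 5 (Bit0Error) and DLEC = 7 (NoChange). Note that the last error code in this reading is not 3 (AckError). The CCCR value 0x1 means the INIT bit (bit 0) is set, so writing 0x0 clears INIT.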

Now I clear the interrupts by writing the values of dev_ir and mcan_ir back to them. After this, for debugging, I read them again and get 0x4000 (TXFF).

After this, TXFQS is all 0, which is an error, because if TFFL == 0 then TFQF must be 1.

I do not know why SPIERR happens when there is no ACK or how to handle it. Why does it affect TXFQS? Looking forward to your response!

Regards,
Mateusz

  • Hello Mateusz,

    The relation between the CAN ACK pulse and SPI Errors is the underlying high speed clock (crystal) that is required by the digital core to process SPI communication and also by the MCAN controller to process CAN communication.  If this clock is disrupted, the digital core and MCAN controller are essentially paused until the clock is restored.  If this disruption occurs during a SPI or CAN message, this can result in errors.

    The TCAN4550-Q1 supports both a crystal and a single-ended clock through the OSC1 and OSC2 pins.  There is a voltage comparator on the OSC2 pin that checks for a voltage less than approximately 100 mV to indicate a "grounded" pin and to switch the clock mode into the single-ended mode and make the OSC1 pin an input.  If the voltage on the OSC2 pin is greater than 100 mV, the crystal amplifier will be enabled and source current out of the OSC1 pin to the crystal.

    In crystal mode there is an Automatic Gain Control (AGC) circuit that will try to regulate the amplifier current to maintain a peak-to-peak voltage amplitude in the oscillation waveform of approximately 1Vpp.  However, there is a minimum output current level that will always be sourced and if the amplitude exceeds 1Vpp with the minimum output current level, then there is a risk the lowest voltage level of the waveform on the OSC2 pin can drop low enough to cause the device to switch to single-ended clock mode and disable the current amplifier. 

    If this occurs, the digital core and MCAN controller will not have a functional clock, and the crystal oscillation amplitude will start to decay due to the energy loss from the resistance in the circuit.  Eventually the OSC2 voltage level will be above the comparator detection threshold and the device will switch to crystal mode again.  This process can then repeat in a cycle until loading conditions change.  Given enough time, a disruption may align with a SPI or CAN message resulting in an error.

    You mentioned that this seems to be dependent on the working time of the system.  This is another clue because the total load capacitance seen by the crystal includes the parasitic capacitance of the PCB traces and the OSC1 and OSC2 pin capacitance.  This parasitic capacitance is not as stable across temperature as an actual ceramic capacitor with a good temperature coefficient.  Therefore, as the system operates the temperature increases, and the total load capacitance tends to reduce.  This leads to an increase in the voltage amplitude making it more susceptible to a single-ended clock mode switch.

    You mentioned you tried with some different caps and crystal frequencies, but perhaps you need some more adjustment.  If this is the cause of your issues, then the solution is to reduce the amplitude of the oscillation waveform on the OSC2 pin.  This is done by adjusting the load caps and series resistor values in the crystal circuit.

    The recommended solution is to include a series "dampening" resistor between the OSC1 pin and the crystal to reduce the current flowing through the crystal, which reduces the mechanical vibration and results in a smaller voltage amplitude.  When a resistor is needed, 30-50 ohms is usually enough, and a value below 100 ohms resolves the issue in most cases.

    If a series resistor is not available in the circuit, then increasing the load capacitors to larger values will help absorb this current and keep the circuit within the range the AGC can regulate.  Increasing the caps by 2-4 pF each can raise the minimum OSC2 voltage level by about as much as a 30-50 ohm series resistor usually does.
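    To see how a change in the load caps shifts the total load on the crystal, the usual series-plus-stray approximation can be used: C_L = (C1 * C2) / (C1 + C2) + C_stray. The sketch below applies it in C. The 3 pF stray value in the usage note is a placeholder assumption for PCB and pin capacitance, not a figure from the datasheet, and the function name is made up.

```c
/* Total load capacitance seen by the crystal, in pF.
 * c1_pf and c2_pf are the two load capacitors, stray_pf the combined
 * parasitic capacitance of traces and pins (an assumed value here). */
static double crystal_load_pf(double c1_pf, double c2_pf, double stray_pf)
{
    return (c1_pf * c2_pf) / (c1_pf + c2_pf) + stray_pf;
}
```

    With an assumed 3 pF of stray capacitance, two 10 pF caps give `crystal_load_pf(10.0, 10.0, 3.0)` = 8 pF, two 12 pF caps give 9 pF, and two 18 pF caps give 12 pF. Because the two caps act in series, adding 2 pF to each raises the load by only about 1 pF.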

    You can find more information in the TCAN455x Clock Optimization and Design Guidelines Application Report.

    Try increasing the series resistance between OSC1 and the crystal, or adding additional capacitance to the crystal and see if it resolves your communication errors.

    Regards,

    Jonathan