TCAN4550-Q1: TDCR, SSP issue on RX side. Missing ACK and increased RX error rate due to a change in TDCR

Part Number: TCAN4550-Q1
Other Parts Discussed in Thread: TCAN4550

Hello, I have some questions about how to properly set the delay compensation values in the TCAN4550.

We want to fine-tune the sample point and synchronization for CAN FD, and we have trouble understanding how the various parameters affect the timing.

The user guide states the following:

In systems with CANFD and bitrate switching enabled, there will frequently be an additional propagation delay offset needed to properly sample bits. This delay is called the transmitter delay compensation and has its own register called TDCR(0x1048). If this value is not set properly, high speed data payloads will likely interpret the data incorrectly, or go into an error state.


However, the proper setting for this register is not clear.

Transmitter Delay Compensation Offset
0x00-0x7F - Offset value defining the distance between the measured delay from m_can_tx to m_can_rx and the secondary sample point. Valid values are 0 to 127 mtq.

Transmitter Delay Compensation Filter Window Length
0x00-0x7F - Defines the minimum value of the SSP position, dominant edges on m_can_rx that would result in an earlier SSP position are ignored for transmitter delay measurement. The feature is enabled when TDCF is configured to a value greater than TDCO. Valid values are 0 to 127 mtq.
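
As a rough illustration of the two fields quoted above, the following Python sketch packs TDCO and TDCF into a single register word. The bit positions (TDCF in bits 6:0, TDCO in bits 14:8) are an assumption taken from the Bosch M_CAN register layout and are not stated in this thread; the helper names are hypothetical.

```python
# Assumed field layout of TDCR (from the Bosch M_CAN manual, not from this thread):
#   TDCF -> bits 6:0, TDCO -> bits 14:8. Both fields are 7 bits wide (0..127 mtq).
TDCF_SHIFT = 0
TDCO_SHIFT = 8
FIELD_MASK = 0x7F


def pack_tdcr(tdco, tdcf):
    """Build the TDCR register word from TDCO and TDCF (both in mtq)."""
    for name, value in (("TDCO", tdco), ("TDCF", tdcf)):
        if not 0 <= value <= FIELD_MASK:
            raise ValueError(f"{name} must be 0..127 mtq, got {value}")
    return (tdco << TDCO_SHIFT) | (tdcf << TDCF_SHIFT)


def unpack_tdcr(reg):
    """Return (tdco, tdcf) extracted from a TDCR register word."""
    return (reg >> TDCO_SHIFT) & FIELD_MASK, (reg >> TDCF_SHIFT) & FIELD_MASK


def filter_enabled(tdco, tdcf):
    """Per the quoted description, the filter is only active when TDCF > TDCO."""
    return tdcf > tdco


reg = pack_tdcr(tdco=3, tdcf=3)
print(hex(reg), unpack_tdcr(reg), filter_enabled(3, 3))
```

Note that with TDCF equal to TDCO the quoted description implies the filter window is not active.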

 

Based on the above statements, our hypothesis was that the TDCR acts only on frames transmitted by the TCAN4550 itself: with the compensation set correctly, it should verify all transitions to recessive and correctly identify an error state or a stuck-dominant condition.

We observed exactly that: moving the sample point position improved behavior on the TX side and removed a timing issue we were experiencing.

However, we also observed a change in other behaviors: the other “modules” connected to the bus would fail to acknowledge in some cases, or the TCAN4550 would force an error on the bus in seemingly random places.

We did not expect that a change in the TDCR would impact the RX side of the communication and/or the behavior of other modules.

In the following image we can see the bus states together with the RX and TX signals of the (external) transmitting module.

The error state is forced by the TCAN4550 and followed by the other modules, which recognize the error.

It looks like a sample point issue on the RX side, but it only happens when we adjust the TDCR.

  

So to wrap up:

After reading other posts, I now understand the inner workings of the SSP and TDCR (also with the help of the Bosch M_CAN user guide that was recommended),

and this only further increases my doubts about the measured value. The behavior above is not present at all when the SSP is 1 tq shorter, but since the errors are on the receive side the SSP shouldn't matter. The only place where I would expect issues is the ACK response, and in the same situation (only with the longer SSP) we do find some missing ACKs, from both sides of the communication, but I cannot work out how to debug the issue.

The modification was implemented because of some TX errors caused by a mismatch between SP and SSP. Those were solved by this fix, but we cannot explain why we now have additional issues, mostly on the receive side.

I understand that this post may be a little confusing and lacking in detail, but I can provide more information as needed in the thread.

Thank you

  • Hi Pietro,

    As you are aware, the TCAN4550-Q1 uses the MCAN IP developed by Bosch without any form of modification.  The Transmitter Delay Compensation is part of the MCAN IP and TI doesn't have any additional information on the TDC feature that is more thorough than what you have read in the M-CAN user guide.

    The TDC mechanism is only used when the transmitter is active, and it should not have an impact on received messages transmitted by other modules.  This mechanism is simply a way to verify the transmitted bit's value when the bit period is shorter than the loop delay, which would otherwise cause a mismatch between the current values of the TX and RX bits.

    I can understand how an incorrect value of the TDC settings could cause the TCAN4550 to throw an error on its own messages if the delay was on the edge of being either too short or too long.  But the TDC shouldn't be able to impact the bit period or formatting of the messages in such a way that other modules would see this as an error.

    Have you verified that the SP is the same for all modules on the bus for both the Nominal and Data bit timing parameters?  The bit rate switch occurs at the Sample Point of the Bit Rate Switch (BRS) bit in the message.  I have seen many errors arise when the modules on a bus have the same bit rates, but different SPs.  This is because the bit rate switch occurs before the modules with the later SPs are ready, and this leads to a sampling error in the first FD bit, which causes an error to be thrown.
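
To make the sample point mismatch concrete, here is a small Python sketch. The bit rates and the 60% / 80% nominal sample points are example values chosen only for illustration, and the function name is hypothetical. It computes how far apart two nodes are when each one switches to the data bit rate at its own sample point inside the BRS bit.

```python
NS_PER_S = 1_000_000_000


def brs_switch_instant_ns(nominal_bitrate, sample_point_percent):
    """Time from the start of the BRS bit to this node's bit rate switch, in ns."""
    bit_time_ns = NS_PER_S // nominal_bitrate
    return bit_time_ns * sample_point_percent // 100


NOMINAL_BITRATE = 1_000_000   # example arbitration-phase rate
DATA_BITRATE = 8_000_000      # example data-phase rate

node_a_ns = brs_switch_instant_ns(NOMINAL_BITRATE, 60)  # hypothetical node with 60% SP
node_b_ns = brs_switch_instant_ns(NOMINAL_BITRATE, 80)  # hypothetical node with 80% SP
skew_ns = node_b_ns - node_a_ns
data_bit_ns = NS_PER_S // DATA_BITRATE

print(f"node A switches at {node_a_ns} ns, node B at {node_b_ns} ns")
print(f"skew = {skew_ns} ns, one data bit = {data_bit_ns} ns")
```

With these example numbers the two nodes disagree by 200 ns about where the data phase starts, which is more than one 125 ns data bit.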

    What are your Nominal and Data bit timing configurations and clock frequency?  Also, what are your TDC settings?

    Regards,

    Jonathan

  • Hello, thank you for the rapid response.
    The whole system uses different CAN topologies, depending on the main microcontroller of each module.
    There is a small difference in the clock source: most of the modules run from an 80 MHz clock, while the TCAN4550 uses 40 MHz.
    CAN FD runs at 1 Mbit/s / 8 Mbit/s for the arbitration and data phases, and we paid attention to matching the SP on all of them.
    Following your suggestion, we are checking the SP/SSP of all the modules to see whether any bug remains.

    In any case, here are some of our CAN FD settings:

    We are using 1 Mbit/s / 8 Mbit/s with an SP of 60% on all modules.

    To match the SSP with the SP, do I also need to count the sync tq?

    In our case we set:

    TimeSeg1 = 1
    TimeSeg2 = 1

    and we expect the SP at 3/5 tq, i.e. 60%. We have now set TDCO to 3 to match the SP. Before that we had TDCO set to 2, which led to erroneous readback.
    In addition, we have TDCF = 3 to filter out earlier edges.
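
The arithmetic behind these settings can be sketched as follows. This assumes that TimeSeg1 and TimeSeg2 are register values that the controller interprets as value + 1 (the convention in the Bosch M_CAN manual) and that the data prescaler register is 0; neither assumption is stated explicitly in this thread, and the function name is hypothetical.

```python
def data_bit_timing(dtseg1_reg, dtseg2_reg, dbrp_reg=0, clock_hz=40_000_000):
    """Derive data-phase timing from register values (assumed 'value + 1' encoding)."""
    tseg1_tq = dtseg1_reg + 1          # prop + phase segment 1, in tq
    tseg2_tq = dtseg2_reg + 1          # phase segment 2, in tq
    prescaler = dbrp_reg + 1           # tq length in clock periods (mtq)
    tq_per_bit = 1 + tseg1_tq + tseg2_tq   # the sync segment is always 1 tq
    sample_point_tq = 1 + tseg1_tq         # sync tq counts toward the sample point
    return {
        "tq_per_bit": tq_per_bit,
        "sample_point_tq": sample_point_tq,
        "sample_point_percent": 100 * sample_point_tq / tq_per_bit,
        "bitrate": clock_hz // (prescaler * tq_per_bit),
        # TDCO is expressed in mtq (clock periods), so scale by the prescaler.
        "tdco_mtq": sample_point_tq * prescaler,
    }


timing = data_bit_timing(dtseg1_reg=1, dtseg2_reg=1)
print(timing)
```

Under these assumptions the sync tq is included, giving 3 of 5 tq (60%) at 8 Mbit/s and a matching TDCO of 3 mtq.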

    One of the doubts I have, and am now checking, is the behavior of TDCV during these errors. I want to be sure that the errors above are not caused by the TDCF filter being disabled because TDCF is equal to TDCO.

    In the earlier state we had TDCO = 2 and TDCF = 3, and we measured TDCV = 8, so we should have a little margin to increase TDCF to 4 if needed.

    For now, thank you for the info; we found it very helpful for continuing our measurements.

    I will come back to this thread with the results of the tests mentioned above as soon as possible.

    Regards
    Pietro

  • Hi Pietro,

    I think your understanding is generally correct.

    The TDCV value is the combination of the internally measured loop delay based on the falling edge of the FDF bit in the message and the additional mtq needed to essentially set the sample point within the bit period (TDCO).

    The internally measured loop delay is based on the falling edge of the FDF bit, which theoretically aligns the leading edges of the bit period between the TX and RX bits. The TDCO value then determines how many additional mtq to wait before sampling the bit.  Therefore, if the bit period has a total of 5 mtq and you want a 60% sample point, TDCO = 3.

    Because the loop delay is measured based on the detection of a falling edge of the FDF bit, it is possible that noise or an erroneous edge could be detected, resulting in a shortened "delay value" being used in the TDCV.  Therefore a filter (TDCF) can be set up to ensure that a minimum value for the SSP is used.

    We therefore want the TDCF to be as close to the desired SSP as possible without exceeding the actual ideal time (Internal Delay + TDCO).  Because we do not know what the internal loop delay measurement is, we typically set the filter to be equal to the offset (TDCF = TDCO), which ensures the filter will never be greater than the total.  If there is noise or a glitch that results in a shorter internal delay measurement, the SSP will end up being earlier in the bit period, which could result in a bit error.  However, if the TDCF is set too large, it will result in an SSP that is later in the bit period, or even after the bit period is complete, which will also result in an error.

    The TDCV value reported should simply be the sum of the internal delay measurement and the TDCO.  Because the CAN transceiver and the MCAN controller are inside the same device, the loop delay should be shorter than if MCAN was in an MCU and an external CAN transceiver was connected with TX and RX pins and there was extra propagation delay from the PCB traces.  Therefore, I would not expect that you would see a lot of variance in the TDCV values on the TCAN4550.
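
The relationship between the measured delay, TDCO, TDCF and the resulting SSP can be sketched with a deliberately simplified Python model. It is not a description of the actual M_CAN hardware: the list of candidate edges is hypothetical, the rule that the filter is only active when TDCF > TDCO comes from the register description quoted earlier in the thread, and the 127 mtq cap is an assumption based on the 7-bit field width.

```python
MAX_MTQ = 127  # assumed upper limit, from the 7-bit width of the fields


def ssp_position(edge_delays_mtq, tdco, tdcf):
    """Simplified model: return the SSP position (TDCV) in mtq, or None.

    edge_delays_mtq: delays (in mtq) at which a dominant edge is seen on RX
    after the TX edge; a glitch shows up as an extra, too-early entry.
    """
    filter_active = tdcf > tdco
    for delay in sorted(edge_delays_mtq):
        candidate = delay + tdco
        if filter_active and candidate < tdcf:
            continue  # edge would place the SSP before the minimum: ignore it
        return min(candidate, MAX_MTQ)
    return None  # no usable edge found


print(ssp_position([5], tdco=3, tdcf=3))     # clean edge: 5 + 3
print(ssp_position([0, 5], tdco=2, tdcf=3))  # glitch at 0 rejected by the filter
print(ssp_position([0, 5], tdco=3, tdcf=3))  # TDCF == TDCO: glitch is accepted
```

In this model, setting TDCF equal to TDCO leaves a too-early edge unfiltered, while TDCF = TDCO + 1 only rejects an edge measured with zero delay.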

    Regards,

    Jonathan

  • Sorry that I am taking a while to respond; I have several tasks and sometimes they overlap.
    I had overlooked the fact that the delay is measured specifically on the FDF bit, which opens up more points of analysis:
    since the measurement is done in the arbitration phase, there is no risk of multiple edges due to high speed or long delays, which is very good behavior.

    I now have an additional doubt that I need to address: in our system, the transitions to dominant and to recessive have different delays, due (I think) to how the drivers work; recessive bits have inherently lower drive strength.

    Since the measurement is done on the FDF bit, I think we may be compensating the delay for the "best case scenario", and this, combined with a conservative 60% sample point, may lead to misinterpretation of recessive bits in the data phase.

    These are two measurements taken from a different transceiver that gives me access to both TX and RX, and we can see an almost 50 ns difference between the measured delays.

    Considering tq = 25 ns:

    a data-phase bit in CAN FD is 5 tq
    this difference is 2 tq (40% of a bit)
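
The conversion above can be reproduced with a few lines of Python. The constants are the values given in this post; the function name is hypothetical.

```python
TQ_NS = 25        # time quantum at 40 MHz with no prescaling, as stated above
TQ_PER_BIT = 5    # data-phase bit length in tq

BIT_TIME_NS = TQ_NS * TQ_PER_BIT


def delay_asymmetry(delay_diff_ns):
    """Express an edge-delay difference in tq and as a percentage of one data bit."""
    diff_tq = delay_diff_ns / TQ_NS
    return diff_tq, 100 * diff_tq / TQ_PER_BIT


diff_tq, diff_percent = delay_asymmetry(50)
print(f"bit time = {BIT_TIME_NS} ns, difference = {diff_tq} tq ({diff_percent}% of a bit)")
```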

    Regarding the results of the tests mentioned above:
    We measured TDCV at 8 tq, which is consistent with the data above: an edge compensation of 4-5 tq plus a TDCO of 3 tq gives 8 tq.

    To be honest, it seems a reasonable value; however, I don't know the transceiver inside the TCAN4550, and it may have more internal delay.

    At this point I think one option could be to migrate to an 80% SP; I think that the later SP can help compensate for this effect and ensure a more stable sample.
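
As a rough way to compare the two sample points, the sketch below uses a simplified model: the delay is measured on a dominant edge, the recessive edge arrives a fixed number of tq later than that measurement suggests, and the SSP sits at the sample point offset after the measured delay. The 2 tq lateness and the 5 tq bit are the figures from this post; the model and the function name are assumptions made only for illustration.

```python
def margin_after_late_edge_tq(sample_point_percent, tq_per_bit=5, late_edge_tq=2):
    """Time (in tq) between a late recessive edge and the SSP, in the simplified model.

    A value near zero means the bit is sampled right at the edge.
    """
    ssp_offset_tq = tq_per_bit * sample_point_percent / 100
    return ssp_offset_tq - late_edge_tq


for sp in (60, 80):
    print(f"SP {sp}%: margin = {margin_after_late_edge_tq(sp)} tq")
```

In this model the 60% SP leaves about 1 tq (25 ns) between the late edge and the sample, while 80% leaves about 2 tq.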

    I need to check again why the 60% SP was chosen and ask my colleagues whether the migration is possible (at least for a test), since it impacts the whole system.

    Do you think that a test such as an eye diagram can help identify this issue? I recently did eye testing with the help of a guide from Keysight and no issue came up, but I may need to review the trigger method and mask limits for my application.

  • Hi Pietro,

    I believe that the typical recommendation for an SP and SSP is in the 80% range, which helps address timing-related issues like this.  It may be that your 60% SP is not sufficient, and if you can run a test with 80%, I would recommend doing so.

    Recessive edges are not really "driven" and only rely on the bus resistance to pull the signals back to the common mode voltage. In this regard, different transceivers, different numbers of transceivers, and different wiring harness configurations will result in different amounts of bus resistance and capacitance, which can affect the time it takes the recessive edge to fall below the comparator threshold inside the transceivers that sets the RX pin's digital value.  This can create a bit width distortion, which could lead to SP issues because the controller is sampling the bit based on the digital bit transitions and not the CANH/L edge transitions.

    Regards,

    Jonathan