PROCESSOR-SDK-J721S2: Relate CSIRX CRC error to particular frame

Part Number: PROCESSOR-SDK-J721S2

When a CRC error occurs within a frame, the error callback is called immediately, but the frame is still captured in memory, so the driver still issues a frameCallback.

So errors and received frames are reported asynchronously to each other?

Then, is there any reliable way to know which returned frame buffer corresponds to the erroneous frame?

There is a documented FVID2 return code for CRC errors (FVID2_FRAME_STATUS_CRC_ERROR); however, I can't find this code being returned anywhere in the csirx driver sources.

Perhaps I can receive both frame notifications and error notifications, and assume that the next frame returned after the error is the problematic one. However:
(1) Is the callback order guaranteed? Can't a notification about an error close to the end of a frame arrive after the notification that the frame was received? Or can't a notification about an error close to the beginning of a frame arrive before the notification for the previous frame?
(2) When multiple frames (for different VCs) are received by the same CSIRX instance, how can I know which particular one was received with an error?

  • Hi Nikita,

    Then, is there any reliable way to know which returned frame buffer corresponds to the erroneous frame?

    No. In the current driver, the error callback and the end-of-frame callback are both asynchronous, so it's very difficult.

    (1) Is the callback order guaranteed? Can't a notification about an error close to the end of a frame arrive after the notification that the frame was received? Or can't a notification about an error close to the beginning of a frame arrive before the notification for the previous frame?

    No. Depending on when the error is generated, it may not be possible to guarantee the order.

    (2) When multiple frames (for different VCs) are received by the same CSIRX instance, how can I know which particular one was received with an error?

    Probably, by comparing the error callback timestamp with the frame's own timestamp, you could mark the frame with the ECC error.

    Regards,

    Brijesh

  • Probably, by comparing the error callback timestamp with the frame's own timestamp, you could mark the frame with the ECC error.

    I think that in a typical multi-camera setup receiving over a 4-channel GMSL link, frames arrive interleaved, and the timestamps from all channels are nearly the same.

    So it looks like the only workable approach is to note the timestamp T of the error callback and assume that any frame from any channel with a timestamp between T and T+frame_time is broken.

    Not good at all.

    Is the hardware really not capable of notifying the DMA engine that an error happened, so that the DMA engine could mark the descriptor as erroneous? That looks like the only reliable approach...

  • I don't think there is any way in CSIRX to identify the channel that generated the error event.

  • I'm still looking at two options for better detection of bad frames:

    (1) The CSIRX_error_debug register: can't the information from there be used somehow? Although it looks unavoidably racy if a second error happens before the first one has been processed by software.

    (2) "No error bypass mode": can't this be used to force reception of a frame to stop after an error, thus turning frames with CRC errors into "short frames" at the DMA level? This won't catch a CRC error in the last line, but the last line is an embedded line that is itself protected by a CRC, so perhaps ignoring a CRC error in the last line is not a big issue.
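
    If frames with CRC errors do end up as short frames, they could be caught by comparing the byte count of each completed buffer with the expected frame size. A minimal sketch, assuming the capture path reports how many bytes were actually written for each frame; the `frame_geom_t` type and both function names are hypothetical and not part of the csirx driver:

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical description of the frame expected on one channel. */
    typedef struct {
        uint32_t line_bytes; /* bytes per line as written to memory */
        uint32_t num_lines;  /* lines per frame, including embedded lines */
    } frame_geom_t;

    /* Number of bytes a complete frame occupies in memory. */
    static uint64_t expected_frame_bytes(const frame_geom_t *g)
    {
        assert(g != NULL);
        return (uint64_t)g->line_bytes * g->num_lines;
    }

    /* True if fewer bytes arrived than a complete frame needs. */
    static bool frame_is_short(const frame_geom_t *g, uint64_t received_bytes)
    {
        return received_bytes < expected_frame_bytes(g);
    }
    ```

    As noted above, a check like this cannot see an error in the last line, because a frame truncated only there still has the full size.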

  • (1) The CSIRX_error_debug register: can't the information from there be used somehow? Although it looks unavoidably racy if a second error happens before the first one has been processed by software.

    Yes, that's true. This register reports the channel ID and data type for which the error was generated, but it is also asynchronous, and how useful it is depends on whether errors occur back to back.

    (2) "No error bypass mode": can't this be used to force reception of a frame to stop after an error, thus turning frames with CRC errors into "short frames" at the DMA level? This won't catch a CRC error in the last line, but the last line is an embedded line that is itself protected by a CRC, so perhaps ignoring a CRC error in the last line is not a big issue.

    Yes, it can be used, but is this what you want to do?

    Regards,

    Brijesh

  • is this what you want to do

    I want to mark bad frames as bad and good frames as good. I'm really surprised that there is no proper support for this, because it looks like random bit flips in the channel are not that rare (I've seen some).

    Up to now I was proposing an architecture that ignores CRC errors at the GMSL (tunneling mode) level and handles them at the CSIRX level, precisely because it is not possible to map an error to a frame in transit, but it should be possible at the receiver. The lack of support for such a mapping in CSIRX is surprising and confusing. If it is just a driver limitation, then I'm ready to improve the driver, but it looks like the CSIRX hardware is not capable of passing error information in a usable way? If that's the case, it looks like a major design flaw :(

  • I want to mark bad frames as bad and good frames as good. I'm really surprised that there is no proper support for this, because it looks like random bit flips in the channel are not that rare (I've seen some).

    Well, there is no support in the driver, but the HW supports it by recording the VC and DT in fields of the register. You would need to read them to figure out which channel is erroneous and mark its frame as good or bad.
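
    A rough sketch of how an error handler might use those fields to flag a channel. The field positions below are placeholders and must be taken from the TRM; `errdbg_decode`, `on_crc_error` and `take_suspect_flag` are hypothetical names, not existing driver functions. The handler is assumed to have already read the register value:

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* PLACEHOLDER field layout: replace with the real positions from the TRM. */
    #define ERRDBG_VC_SHIFT 0u
    #define ERRDBG_VC_MASK  0x3u   /* CSI-2 virtual channel, 2 bits */
    #define ERRDBG_DT_SHIFT 8u
    #define ERRDBG_DT_MASK  0x3Fu  /* CSI-2 data type, 6 bits */
    #define NUM_VC          4u

    typedef struct {
        uint8_t vc;
        uint8_t dt;
    } err_src_t;

    /* Extract VC and DT from a raw register value. */
    static err_src_t errdbg_decode(uint32_t reg)
    {
        err_src_t s;
        s.vc = (uint8_t)((reg >> ERRDBG_VC_SHIFT) & ERRDBG_VC_MASK);
        s.dt = (uint8_t)((reg >> ERRDBG_DT_SHIFT) & ERRDBG_DT_MASK);
        return s;
    }

    static bool vc_suspect[NUM_VC];

    /* Error handler side: remember which VC reported an error. */
    static void on_crc_error(uint32_t errdbg_value)
    {
        err_src_t s = errdbg_decode(errdbg_value);
        assert(s.vc < NUM_VC);
        vc_suspect[s.vc] = true;
    }

    /* Frame-done side: return the flag for this VC and clear it. */
    static bool take_suspect_flag(uint8_t vc)
    {
        assert(vc < NUM_VC);
        bool bad = vc_suspect[vc];
        vc_suspect[vc] = false;
        return bad;
    }
    ```

    This only narrows the error down to a channel. It does not resolve the ordering between the error callback and the frame callback, and synchronization between the two contexts is left out.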

    Up to now I was proposing an architecture that ignores CRC errors at the GMSL (tunneling mode) level and handles them at the CSIRX level, precisely because it is not possible to map an error to a frame in transit, but it should be possible at the receiver. The lack of support for such a mapping in CSIRX is surprising and confusing. If it is just a driver limitation, then I'm ready to improve the driver, but it looks like the CSIRX hardware is not capable of passing error information in a usable way? If that's the case, it looks like a major design flaw :(

    I don't think this is a design flaw; it's supported in the HW.

    Regards,

    Brijesh

  • the HW supports it by recording the VC and DT in fields of the register

    Do you mean the CSIRX_error_debug register?

    Do I understand correctly that this register is written at the time when the error gets detected, overwriting any previous value there?

    There is an unavoidable delay between the moment the error happens and the moment this register is read by the software error handler. If one more error happens within this delay, the information about the first one gets overwritten, unless the hardware supports some queueing of this information. Is anything like that supported?

    Still, maybe you can ask HW engineers familiar with this IP whether there is some way to pass error information into the DMA stream that carried the erroneous data, so that it ends up as an error mark in the DMA transfer status? Does PSIL support this sort of notification? This is how common hardware deals with errors; for example, any network adapter delivers receive errors via status fields attached to received frames.

    In a safety context, losing error information is obviously worse than producing false positives. Because of that, using a register that remembers only one error to mark bad frames does not look like a safe implementation. Instead, I will have to mark any frame within a frame-time interval of the error as unreliable.
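
    A minimal sketch of that conservative rule, assuming the error callback and the frame callback can both obtain timestamps from the same clock. All names here (`err_log_t`, `err_log_record`, `frame_is_suspect`) and the log depth are hypothetical:

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define ERR_LOG_DEPTH 8u /* hypothetical: how many recent errors to keep */

    typedef struct {
        uint64_t ts_us[ERR_LOG_DEPTH]; /* timestamps of recent error callbacks */
        size_t   count;                /* total number of errors recorded */
    } err_log_t;

    /* Error callback side: store the timestamp, overwriting the oldest entry. */
    static void err_log_record(err_log_t *log, uint64_t err_ts_us)
    {
        assert(log != NULL);
        log->ts_us[log->count % ERR_LOG_DEPTH] = err_ts_us;
        log->count++;
    }

    /* Frame callback side: a frame is suspect if its timestamp lies in
     * [T, T + frame_time] for any recorded error timestamp T. */
    static bool frame_is_suspect(const err_log_t *log, uint64_t frame_ts_us,
                                 uint64_t frame_time_us)
    {
        assert(log != NULL);
        size_t n = (log->count < ERR_LOG_DEPTH) ? log->count : ERR_LOG_DEPTH;
        for (size_t i = 0; i < n; i++) {
            uint64_t t = log->ts_us[i];
            if (frame_ts_us >= t && (frame_ts_us - t) <= frame_time_us) {
                return true;
            }
        }
        return false;
    }
    ```

    The check applies to frames from every channel, so a single error marks up to one frame per channel as unreliable. Locking between the two callbacks is omitted, and entries older than the log depth are silently dropped, so the depth has to cover the worst-case burst of errors.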

  • Do you mean the CSIRX_error_debug register?

    Do I understand correctly that this register is written at the time when the error gets detected, overwriting any previous value there?

    Yes, this is the register I meant earlier.

    There is an unavoidable delay between the moment the error happens and the moment this register is read by the software error handler. If one more error happens within this delay, the information about the first one gets overwritten, unless the hardware supports some queueing of this information. Is anything like that supported?

    No, the HW does not support any such queueing.

    Still, maybe you can ask HW engineers familiar with this IP whether there is some way to pass error information into the DMA stream that carried the erroneous data, so that it ends up as an error mark in the DMA transfer status? Does PSIL support this sort of notification? This is how common hardware deals with errors; for example, any network adapter delivers receive errors via status fields attached to received frames.

    I don't think this information is passed to PSIL, but I will confirm.

    Regards,

    Brijesh