
PROCESSOR-SDK-AM62A: Process CSI-2 RX Data on the fly for a low latency application

Part Number: PROCESSOR-SDK-AM62A


Hi everyone!

Using different high-level libraries such as GStreamer or OpenVX, I have always measured at least 3 frames of delay between a light stimulus on my MIPI CSI-2 sensor and the resulting "stimulated frame" in Linux user space. I am using the ISP to debayer the images coming from my sensor.
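To narrow down where these frames of delay accumulate, one approach is to compare the driver's frame-completion timestamp with the time at which user space actually dequeues the buffer. Below is a minimal sketch of that idea in plain C; it assumes capture is already configured and streaming, and that the driver stamps buffers with CLOCK_MONOTONIC (indicated by V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC in buf.flags):

```c
/* Sketch: measure the delay between the driver finishing a frame and user
 * space getting it. Assumes capture is already configured and streaming. */
#include <linux/videodev2.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <time.h>

static void print_dequeue_delay(int fd)
{
    struct v4l2_buffer buf;
    struct timespec now;

    memset(&buf, 0, sizeof(buf));
    buf.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    buf.memory = V4L2_MEMORY_MMAP;
    if (ioctl(fd, VIDIOC_DQBUF, &buf) < 0) {   /* blocks until a frame is complete */
        perror("VIDIOC_DQBUF");
        return;
    }
    clock_gettime(CLOCK_MONOTONIC, &now);

    double t_done = buf.timestamp.tv_sec + buf.timestamp.tv_usec / 1e6;
    double t_user = now.tv_sec + now.tv_nsec / 1e9;
    printf("frame %u: driver-to-userspace delay = %.3f ms\n",
           buf.sequence, (t_user - t_done) * 1e3);

    ioctl(fd, VIDIOC_QBUF, &buf);              /* re-queue to keep streaming */
}
```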

A delay of 1 frame could be attributed to my sensor, but the rest should be on the AM62A side. Looking at the J721E documentation, I found this interesting table:

Instance        Configuration                     Time taken to receive one frame   ISR latency
CSI2Rx Inst 0   1CH 1080P30 IMX390 Sensor Raw12   33.3 ms (MCU2_0)                  9 us (MCU2_0)

Does that mean that another frame of delay comes from how j721e-csi2rx handles the incoming MIPI stream? The last frame of delay would then come from the V4L2 driver?

I need to develop a piece of software that processes some of the MIPI data "on the fly", with the lowest possible latency. I don't need the ISP to process the data. Where should I start? Am I missing something?

Any information would be greatly appreciated,

Thanks!

  • Hi Aymeric,

    I have assigned your query to our expert. He is currently out of the office for two weeks, so please expect a response when he returns.

    Apologies for the delay.

    Best Regards,

    Suren

  • Hello Aymeric,

    Please check this camera mirror system app note: https://www.ti.com/lit/wp/spradc4/spradc4.pdf. It includes a latency analysis of the CSI2-RX and the ISP. I should be able to provide further help when I'm back in the office.

    Regards,

    Jianzhong

  • Hi Jianzhong,

    Thank you for your quick reply! I know mine isn't as quick, but I wanted to verify my claims before posting. I looked at that very interesting white paper and came across this table:

    Stage           Data      Time (ms)
    Sensor @60 fps  Frame 0    0
    CSI2-RX         Frame 0   16.67
    VISS            Frame 0   24.67
    LDC             Frame 0   32.67
    MSC             Frame 0   40.67
    Deep Learning   Frame 0   48.67
    Display         Frame 0   65.33



    If I understand correctly, each element is triggered by a new frame. This means that in my application, the best-case scenario, if I use the VISS module, would be this one:

    Stage           Data      Time (ms)
    Sensor @30 fps  Frame 0    0
    CSI2-RX         Frame 0   33.33
    VISS            Frame 0   66.66
    App (OpenVX)    Frame 0   99.99


    Or this one without the VISS:

    Stage           Data      Time (ms)
    Sensor @30 fps  Frame 0    0
    CSI2-RX         Frame 0   33.33
    App (OpenVX)    Frame 0   66.66

         

    If I don't use OpenVX or GStreamer, just plain Video4Linux in C with V4L2_MEMORY_MMAP or V4L2_MEMORY_DMABUF, the time between starting the stream (VIDIOC_STREAMON) and the moment I can successfully dequeue my first buffer does indeed appear to be one frame.
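    For reference, here is a minimal sketch of that plain-V4L2 path (the /dev/video0 node, the buffer count, and the pixel format being pre-configured, e.g. via media-ctl, are all assumptions):

    ```c
    /* Minimal V4L2 MMAP capture: request buffers, map and queue them,
     * stream on, then block until the first full frame can be dequeued. */
    #include <fcntl.h>
    #include <linux/videodev2.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define NBUF 4

    int main(void)
    {
        int fd = open("/dev/video0", O_RDWR);   /* assumed device node */
        if (fd < 0) { perror("open"); return 1; }

        struct v4l2_requestbuffers req = {0};
        req.count  = NBUF;
        req.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        req.memory = V4L2_MEMORY_MMAP;
        if (ioctl(fd, VIDIOC_REQBUFS, &req) < 0) { perror("REQBUFS"); return 1; }

        void *map[NBUF];
        for (unsigned i = 0; i < req.count && i < NBUF; i++) {
            struct v4l2_buffer buf = {0};
            buf.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
            buf.memory = V4L2_MEMORY_MMAP;
            buf.index  = i;
            if (ioctl(fd, VIDIOC_QUERYBUF, &buf) < 0) { perror("QUERYBUF"); return 1; }
            map[i] = mmap(NULL, buf.length, PROT_READ, MAP_SHARED, fd, buf.m.offset);
            if (map[i] == MAP_FAILED) { perror("mmap"); return 1; }
            if (ioctl(fd, VIDIOC_QBUF, &buf) < 0) { perror("QBUF"); return 1; }
        }

        enum v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        if (ioctl(fd, VIDIOC_STREAMON, &type) < 0) { perror("STREAMON"); return 1; }

        /* This blocks for roughly one frame period (33.3 ms at 30 fps):
         * the driver only completes the buffer once the frame is fully written. */
        struct v4l2_buffer buf = {0};
        buf.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        buf.memory = V4L2_MEMORY_MMAP;
        if (ioctl(fd, VIDIOC_DQBUF, &buf) < 0) { perror("DQBUF"); return 1; }
        printf("frame 0: %u bytes, first byte 0x%02x\n",
               buf.bytesused, *(unsigned char *)map[buf.index]);

        ioctl(fd, VIDIOC_STREAMOFF, &type);
        close(fd);
        return 0;
    }
    ```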

    For my use case, I don't need a full frame to start processing, so I'd be interested in accessing the mapped buffer before it is completely filled and ready to be dequeued. Using MMAP, I tried reading part of the buffer currently being filled, by queueing only two buffers and reading the one that hasn't been dequeued yet. Indeed, I get part of the image that I can later dequeue. I still have to run more tests and modify my MIPI pattern generator card to make sure there isn't another level of frame buffering, but I think I'm getting closer.

    If there is no other buffering layer, then I'm happy with what I've got: I will start my operations on the first part of the image, which means my first result will be available after only 8.33 ms if I start once 1/4 of the frame has been received (33.33 ms / 4; see the sketch after the table below).

    Stage                      Data            Time (ms)
    Sensor @30 fps             Frame 0           0
    V4L2 MMAP (partial read)   Lines [0:270]     8.33
    V4L2 MMAP VIDIOC_DQBUF     Frame 0          33.33
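    For reference, a sketch of that "peek before VIDIOC_DQBUF" experiment. Everything here is an assumption: the line size, the fill pattern the buffer is memset() to before being queued, and above all that in-flight DMA writes are visible to the CPU at all, since V4L2 defines no semantics for a buffer that hasn't been dequeued:

    ```c
    #include <stdint.h>
    #include <time.h>

    #define LINE_BYTES   (1920 * 2)   /* assumption: 1080p, 16 bits per pixel */
    #define LINES_WANTED 270          /* first quarter of a 1080-line frame   */

    /* Busy-wait until the last byte of the wanted region changes from the
     * fill pattern the buffer was memset() to before VIDIOC_QBUF (0x00 is
     * an assumption; pick a value the sensor data cannot produce). */
    static const uint8_t *wait_for_lines(const volatile uint8_t *bufmem)
    {
        const volatile uint8_t *last = bufmem + LINE_BYTES * LINES_WANTED - 1;
        while (*last == 0x00) {
            struct timespec ts = { 0, 100000 };   /* poll every 100 us */
            nanosleep(&ts, NULL);
        }
        /* Lines [0, LINES_WANTED) are (probably) in place; process them. */
        return (const uint8_t *)bufmem;
    }
    ```

    Note that nothing here is guaranteed by V4L2: DMA write ordering and CPU cache coherency are platform details, so on AM62A the mapping may need to be uncached (or the cache invalidated) for the probed bytes to be fresh. That is exactly what the pattern generator test should confirm.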

    My conclusion is that for this application it's going to be hard to use OpenVX or GStreamer, as they are designed to work with whole frames and therefore have no choice but to wait for at least one full frame.

    Just to be sure I'm not reinventing the wheel: does anyone know of another API, or a way to configure Video4Linux, that would let me access the CSI video input as a stream of pixels or lines instead of whole frames?

    Regards,

    Aymeric

  • If I understand correctly, each element is triggered by a new frame. This means that in my application, the best-case scenario, if I use the VISS module, would be this one:

    No, that's not the case. The VISS latency is not the frame duration but its processing time: as soon as the VISS has finished processing, the data is available to the next component in the pipeline. The VISS processing time depends on the frame size; in the app-note table above, for example, the VISS adds 24.67 − 16.67 = 8 ms for a 1080p frame.

    Just to be sure I'm not reinventing the wheel: does anyone know of another API, or a way to configure Video4Linux, that would let me access the CSI video input as a stream of pixels or lines instead of whole frames?

    I don't think there is such a mechanism.