AM625: V4L2 STREAMON causes board hang

Part Number: AM625
Other Parts Discussed in Thread: AM620

Tool/software:

Hi:

We use 2 C programs to test capturing frames from 2 streams: vc0 and vc1.

The 1st app hangs the board when we kill it and run it again. The 2nd app always runs normally when we kill it and run it again.

The 2 files both do these steps:

           open file

           set v4l2_format

           request v4l2_requestbuffers

           mmap buffers

           querybuf

           qbuf

           stream on

  Both also install a SIGTERM signal handler, which does these steps:

          msleep  50

          stream off

         close fd

         munmap
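
For reference, here is a minimal sketch of one common way to structure the SIGTERM path, in case the teardown currently runs inside the handler itself. The handler only sets a flag, and the capture loop does the stream off / munmap / close steps in normal context after it sees the flag. This is an assumption about the apps, not a description of them, and it is not a confirmed fix for the hang. The names `install_sigterm_handler` and `stop_requested` are hypothetical. Note that `ioctl()` and `munmap()` are not on the POSIX list of async-signal-safe functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler, polled by the capture loop. */
static volatile sig_atomic_t stop_flag = 0;

/* The handler only records the request: no ioctl, munmap or sleep here. */
static void on_sigterm(int signo)
{
    (void)signo;
    stop_flag = 1;
}

/* Install the handler. Returns 0 on success, -1 on error (hypothetical helper). */
int install_sigterm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    /* SA_RESTART is left unset so that a blocking system call can return
     * with errno EINTR and the loop gets a chance to re-check the flag. */
    return sigaction(SIGTERM, &sa, NULL);
}

/* Returns 1 once SIGTERM has been received, 0 otherwise (hypothetical helper). */
int stop_requested(void)
{
    return stop_flag != 0;
}
```

The capture loop would then run `while (!stop_requested()) { ... }` and perform the teardown steps after the loop exits.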

The difference is in the capture loop. The 1st C file:

       while 1 {

             video0 dqbuf

             video0 qbuf 

             video1 dqbuf

             video1 qbuf

     }

The 2nd C file:

    pthread th0

    pthread th1

    th0 do while 1{

          video0 dqbuf

          video0 qbuf

     }

   th1 do while 1{

          video1 dqbuf

          video1 qbuf

   }
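
As a side note on the single-threaded loop: a blocking dqbuf on video0 keeps the thread from servicing video1 until video0 delivers a frame. A minimal sketch of how one thread could wait on both nodes with `poll()` and dequeue only from the node that is ready is shown below. It is only an illustration, not a confirmed fix for the hang. `ready_mask` and `MAX_NODES` are hypothetical names, and the sketch works on plain file descriptors so it makes no assumption about the driver.

```c
#include <poll.h>
#include <stddef.h>

#define MAX_NODES 2 /* video0 and video1 serviced by one thread */

/*
 * Wait up to timeout_ms for any descriptor in fds[0..n-1] to become readable.
 * Returns a bitmask (bit i set when fds[i] reported POLLIN), 0 on timeout,
 * or -1 on error (poll() failure, including EINTR, or n > MAX_NODES).
 * POLLERR/POLLHUP are not reported separately in this sketch.
 */
int ready_mask(const int *fds, size_t n, int timeout_ms)
{
    struct pollfd pfd[MAX_NODES];
    int mask = 0;
    size_t i;

    if (n > MAX_NODES)
        return -1;

    for (i = 0; i < n; i++) {
        pfd[i].fd = fds[i];
        pfd[i].events = POLLIN;
        pfd[i].revents = 0;
    }

    if (poll(pfd, (nfds_t)n, timeout_ms) < 0)
        return -1;

    for (i = 0; i < n; i++) {
        if (pfd[i].revents & POLLIN)
            mask |= 1 << i;
    }
    return mask;
}
```

In the capture loop, the two video fds would be passed in, and dqbuf/qbuf would be issued only for the nodes whose bit is set. A finite timeout also gives the loop a regular chance to notice a stop request.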

When the kill command is run, both apps run the signal handler.

When run again, the 1st app hangs the board, but the 2nd app runs normally.

About the CSI2RX_STREAM_STATUS_REG register:

all bits are 0 after killing the 1st app

bit 31 and bit 8 are 1 after killing the 2nd app
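
To make the comparison concrete, here is a small sketch that extracts the two bits mentioned above from a raw 32-bit register value. The struct and function names are hypothetical, and the fields are deliberately left as "bit31" and "bit8" because their meaning is not stated in this thread (the TRM is the place to check).

```c
#include <stdint.h>

/* The two bits of CSI2RX_STREAM_STATUS_REG discussed above, by position only. */
struct stream_status_bits {
    unsigned bit31;
    unsigned bit8;
};

/* Extract bit 31 and bit 8 from a raw register value (hypothetical helper). */
struct stream_status_bits decode_stream_status(uint32_t reg)
{
    struct stream_status_bits s;

    s.bit31 = (reg >> 31) & 1u;
    s.bit8 = (reg >> 8) & 1u;
    return s;
}
```

A value of 0x00000000 matches the observation after killing the 1st app, and a value with both bits set, for example 0x80000100, matches the observation after killing the 2nd app.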

Could you help us find out why the 1st app causes the board to hang?

  • Dear Tom.

    One more question about this issue: when you close the device, do you disable transmission from the FPGA?

    You should follow this order: disable FPGA transmission -> stream off and close the device; open the device and stream on -> enable FPGA transmission.

    thanks a lot!

    yong

  • On this device, there is no FPGA transmission.

  • Could you help us resolve this before 8/10?

  • Dear Tom.

    Would you please clarify the data flow? SPAD -> AM620 CSIRX?

    "On this device, there is no FPGA transmission"

    Then do you stop SPAD transmission before closing the device (CSIRX), and open the device before starting SPAD transmission?

    thanks a lot!

    yong

  • Hi Tom,

    Can you try to use "v4l2-ctl --stream-mmap" or yavta to capture from the 2 video devices simultaneously and see if you run into the same hang problem?

    Thanks,

    Jianzhong

  • Hi Jianzhong:

     The command is "v4l2-ctl --verbose -d /dev/video0 --set-fmt-video=width=2688,height=193,pixelformat='BG12' --stream-mmap=6 --stream-skip=1 2 --stream-count=10 --stream-poll & v4l2-ctl --verbose -d /dev/video1 --set-fmt-video=width=2688,height=193,pixelformat='BG12' --stream-mmap=6 --stream-skip=1 --stream-count=10 --stream-poll & " 

    If I capture frames with this command and then kill these 2 processes, the board does not hang.

    But at first I left out the trailing "&" for video1, like this: "v4l2-ctl --verbose -d /dev/video0 --set-fmt-video=width=2688,height=193,pixelformat='BG12' --stream-mmap=6 --stream-skip=1 2 --stream-count=10 --stream-poll & v4l2-ctl --verbose -d /dev/video1 --set-fmt-video=width=2688,height=193,pixelformat='BG12' --stream-mmap=6 --stream-skip=1 --stream-count=10 --stream-poll"

    If I kill these 2 processes, the board hangs.

  • spad->cdns_csi2rx->ticsi2rx->dma_context

    streamon: start csirx, then start the spad stream

    streamoff: stop the spad stream, then stop csirx

  • Dear Tom.

    May I ask whether you check the SPAD status when stopping the SPAD stream, to make sure that the SPAD has actually stopped transmitting data?

    thanks a lot!

    yong

  • We will check the spad status; we stop it only when the spad is idle.

  • Hi Tom,

    What does the board hang look like? Does the whole system stop responding to commands, or does only the application hang?

  • I can't enter any command on the board, and pinging it fails. The debug UART outputs no logs and does not accept the commands I send.

    The board must be power-cycled.

  • It seems like a system crash. Then how did you read CSI2RX_STREAM_STATUS_REG after the crash?

  • I read it after killing the process the first time. If I then run the process again, it crashes.

  • What if you run "v4l2-ctl --verbose -d /dev/video0 --set-fmt-video=width=2688,height=193,pixelformat='BG12' --stream-mmap=6 --stream-skip=1 2 --stream-count=10 --stream-poll & v4l2-ctl --verbose -d /dev/video1 --set-fmt-video=width=2688,height=193,pixelformat='BG12' --stream-mmap=6 --stream-skip=1 --stream-count=10 --stream-poll &" after the first kill instead of running your own app?

  • With video0 & and video1 &: kill them and run the app again, and it runs normally.

    With video0 & and video1 (no trailing &): kill them and run the app again, and it also crashes.

  • I mean run your app which may cause the issue, then kill it. Afterwards, run v4l2 commands  "v4l2-ctl --verbose -d /dev/video0 --set-fmt-video=width=2688,height=193,pixelformat='BG12' --stream-mmap=6 --stream-skip=1 2 --stream-count=10 --stream-poll & v4l2-ctl --verbose -d /dev/video1 --set-fmt-video=width=2688,height=193,pixelformat='BG12' --stream-mmap=6 --stream-skip=1 --stream-count=10 --stream-poll &" . Please check if it still crashes. 

  • I tested:

    "v4l2-ctl --verbose -d /dev/video0 --set-fmt-video=width=2688,height=193,pixelformat='BG12' --stream-mmap=6 --stream-skip=1 2 --stream-count=10 --stream-poll & v4l2-ctl --verbose -d /dev/video1 --set-fmt-video=width=2688,height=193,pixelformat='BG12' --stream-mmap=6 --stream-skip=1 --stream-count=10 --stream-poll &": kill and run again, it does not crash

    "v4l2-ctl --verbose -d /dev/video0 --set-fmt-video=width=2688,height=193,pixelformat='BG12' --stream-mmap=6 --stream-skip=1 2 --stream-count=10 --stream-poll & v4l2-ctl --verbose -d /dev/video1 --set-fmt-video=width=2688,height=193,pixelformat='BG12' --stream-mmap=6 --stream-skip=1 --stream-count=10 --stream-poll": kill and run again, it always crashes

  • Hi Tom,

    When you use "&", you're putting the command into the background. So your first command put both captures into the background, and your second command put only the first capture into the background.

    I tried running two captures in two separate terminals, one through terminal console, and the other through telnet. I can stop and rerun without any problems.

    Back to your original question:

    "When the kill command is run, both apps run the signal handler.

    When run again, the 1st app hangs the board, but the 2nd app runs normally."

    Your 1st app captures from both video device nodes in the same thread. Each device node is associated with a DMA context. When you kill the app, the DMA operations for the two nodes may not both be terminated cleanly. That could cause issues when you rerun the app.

    Which SDK version are you using? The CSI2 Rx driver has improved drain handling in SDK 11.0 that may help resolve your issue with the 1st app. If you're not using 11.0, I recommend trying it, or 11.1, the latest release.

    Regards,

    Jianzhong

  • In the 1st app, I request 2 different buffer structs for video0 and video1.

    And video0 uses DMA context 0, video1 uses DMA context 1. They are not associated.

    The SDK version is 11.1 or 11.0 or the latest release

    "And video0 uses DMA context 0, video1 uses DMA context 1. They are not associated."

    The two DMA contexts are not associated, but they are transferring data from the same CSI stream (using different virtual channels). If you kill the app, I'm not sure if both DMA transfers are terminated properly.

    Please try the latest CSI2 Rx driver: git.ti.com/.../j721e-csi2rx.c

  • Dear Tom

    Would you please share an update on your testing with the latest CSIRX driver?

    thanks a lot!

    yong

  • Sorry for the late reply.

    I merged all the differences into my j721e-csi2rx.c, and the build failed. I then fixed some errors and tried capturing frames with the new driver, but it could not get any frames.

    Then I merged only the differences in these functions:

    ti_csi2rx_drain_callback

    ti_csi2rx_drain_dma

    ti_csi2rx_dma_callback

    ti_csi2rx_stop_dma

    Then I tested with my app, and the board no longer hangs when I kill my app and run it again.

    And the raw frame data is correct.

    It seems to work.

    But in dmesg I can see these logs:

    Failed to stop streaming on pad0

    Failed to stop streaming on pad1

    Failed to stop streaming on pad2

    Failed to stop streaming on pad3

    DMA transfer timed out for drain buffer

    DMA transfer timed out for drain buffer

  • Dear Tom.

    As discussed this afternoon, it is good to see that the hang has disappeared.

    Could you please provide the following materials for review?

    1. The code changes you applied to the driver.

    2. Two logs: one from before and one from after the code change was applied.

    thanks a lot!

    yong

  • "Failed to stop streaming on": this log is dev_warn

    "DMA transfer timed out for drain buffer": this log is dev_dbg

    We can get the frames and the frame data is correct. The board does not crash anymore.

    So we can ignore these logs, right? They are not dev_err.

  • Dear Tom.

    Okay.

    1. Would you please provide the full log with these prints? We will ask an expert to check it and answer your question.

    "Failed to stop streaming on"

    "DMA transfer timed out for drain buffer"

    2. We also suggest providing the code changes you applied to the driver, for our expert to review.

    thanks a lot!

    yong 

  • Dear Tom.

    Would you please let us know if you still need support from TI? Otherwise we will close this ticket this week.

    thanks a lot!

    yong

  • Hi Yong:

    We found that the board can still crash even with this patch applied.

    While my app is capturing frame data, if I power off my spad, then kill my app and run it again, the board crashes.

  • Dear Tom

    Why do you need to run this test? This would be an exceptional case; the system should be reset at that point, right?

    thanks a lot!

    yong

  • Hi Yong:

    We discovered it accidentally during testing, and then traced it to this phenomenon.

    I'm not sure whether the system will reset if we wait a few minutes.

  • Dear Tom.

    Okay, is this a normal case? Why is the spad powered off? Would you check with your team internally how you handle this at the system level?

    Also, did you re-init the SPAD during the test?

    thanks a lot!

    yong

  • Hi Yong:

    Yes. One app controls the spad. If it is killed and run again, it powers the spad off and on.

    Even if we handle this at the system level, the spad might one day disconnect from the SoC, which may also cause this problem.

  • Dear Tom.

    Sorry, may I ask you to clarify the test case? Would you please share the control flow?

    BTW, did you power on and re-init the SPAD after the AM620 CSIRX was set ready to receive data? The CSIRX must be ready before data is sent to it.

    "Even if we handle this at the system level, the spad might one day disconnect from the SoC, which may also cause this problem."

    thanks a lot!

    yong

  • Hi Yong:

    1. power on spad

    2. run app to capture frame data via v4l2

    3. power off spad

    4. power on spad

    5. kill app and run again

    6. board crashes

    The normal flow should be: power on spad, run app to capture frame data via v4l2, kill app, power off and power on spad, then run app.

  • Dear Tom.

    When the SPAD is powered on, does that mean it has been initialized and has started sending data?

    Dear Jianzhong.

    Would you please help review the control flow?

    thanks a lot!

    yong

  • Hi Yong:

    When the SPAD is powered on, it is initialized but does not start sending data.

  • Dear Tom

    Thank you so much.

    Let me restate the problem. Please correct me if I am wrong.

    workable case:
    1. power on spad
    2. run app to initialize AM620 CSIRX
    3. run app to stream on SPAD; it captures frame data via v4l2
    4. kill app
    5. power off spad
    6. power on spad
    7. run app to initialize AM620 CSIRX
    8. run app to stream on SPAD; it captures frame data via v4l2 again

    error case:
    1. power on spad
    2. run app to initialize AM620 CSIRX
    3. run app to stream on SPAD; it captures frame data via v4l2
    4. power off spad
    5. power on spad
    6. kill app
    7. run app to initialize AM620 CSIRX
    8. run app to stream on SPAD; system hang?

    If there is no misunderstanding, please clarify the questions below.

    a. What does the app do when it is killed?

    b. How do you simulate powering off the spad? Is it some logic in the app?

    thanks a lot!

    yong

  • Hi Yong:

    The workable case and error case are correct.

    When the app is killed, it does streamoff, closes the file, unmaps the buffers and frees the buffers.

    I just run our command to power off the spad. It is not logic in the app; it is an incorrect operation, but we need to fix this because it causes the board to hang.
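
One possible application-side guard for this case, sketched under assumptions: if the capture loop waits for frames with a timeout, it can count consecutive timeouts and treat a long run of them as "the sensor has stopped delivering frames". Whether reacting to that condition avoids the hang has not been established in this thread. The names `stall_detector`, `stall_init`, `stall_update` and the threshold are hypothetical.

```c
#include <stdbool.h>

/* Counts consecutive waits that ended without a frame. */
struct stall_detector {
    unsigned consecutive_timeouts;
    unsigned limit; /* hypothetical threshold chosen by the application */
};

void stall_init(struct stall_detector *d, unsigned limit)
{
    d->consecutive_timeouts = 0;
    d->limit = limit;
}

/*
 * Call once per wait. got_frame is true when a buffer was dequeued,
 * false when the wait timed out. Returns true once the number of
 * consecutive timeouts has reached the limit.
 */
bool stall_update(struct stall_detector *d, bool got_frame)
{
    if (got_frame) {
        d->consecutive_timeouts = 0;
        return false;
    }
    if (d->consecutive_timeouts < d->limit)
        d->consecutive_timeouts++;
    return d->consecutive_timeouts >= d->limit;
}
```

With a limit of 3 and a 100 ms wait, for example, the app would notice a stalled stream after roughly 300 ms and could log it before anyone kills and restarts the app.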

  • Dear Tom

    As discussed, could you please provide the following for further checking?

    1. Please provide the full log for the error case, especially steps 3 to 6.

    2. Please refine the two cases. For example, what happens when the spad is powered off?

    3. Please tell us at which point the system hangs: when configuring the SPAD, when queuing buffers to the driver and waiting, or somewhere else?

    4. Please share the code changes you applied to CSIRX, compared against the latest CSIRX driver sent by Jianzhong, for review.

    thanks a lot!

    yong

  • Dear Tom.

    As discussed, let us downgrade this ticket's priority since you have a workaround. Please provide feedback later so that we can move on.

    As discussed, could you please provide the following for further checking?

    1. Please provide the full log for the error case, especially steps 3 to 6.

    2. Please refine the two cases. For example, what happens when the spad is powered off?

    3. Please tell us at which point the system hangs: when configuring the SPAD, when queuing buffers to the driver and waiting, or somewhere else?

    4. Please share the code changes you applied to CSIRX, compared against the latest CSIRX driver sent by Jianzhong, for review.

    thanks a lot!

    yong