
OMAP3530-ISP dropping fields from TVP5150.

Other Parts Discussed in Thread: OMAP3530, TVP5146, TVP5150, TVP5151

We are working on a custom multimedia platform based on the OMAP3530 EVM Rev. G from Mistral.  It is very similar to the EVM and uses a modified kernel based upon it.

One hardware difference is that we are using a TVP5150 video decoder chip rather than the TVP5146 on the EVM.  We had to modify the 5146 driver as well as the ISP driver to support it.  The ISP driver had to be modified to ignore the lack of HS_VS interrupts and use VD0 interrupts for processing instead.  For some weeks now, we've had the drivers working and capturing video through the V4L2 interface, and encoding H.264 and MP4 using the DSP.  

Unfortunately, we've now noticed that we're not getting an interrupt on every field in the video stream.  This is resulting in corrupted frames and jerky, jittery video.  This is a major problem as we require 29.97 fps, high-quality video, and still-frame captures for our product.  Has anyone else come across this problem?  Are there any suggestions as to what might be causing this problem, and how we might go about fixing it?

  • Dennis,

    A few questions to try and pinpoint possible causes:

    1. What is the total response and processing time for your VD0 interrupt - just want to make sure you are not missing any because of overlapping IRQs. How did you determine you are missing interrupts and not dropping input frame data?

    2. Was this working properly at any time in the past (as you said for weeks you had the drivers working)?

    3. What is the synchronization between the incoming video stream and the display? Are you (double/triple) buffering the input to make sure you are not missing frames due to overwriting buffers (the latency through the CCDC-PRV-RSZ can be up to 3 frames).  Where are you intercepting the input data for encoding (i.e CCDC output, PRV/RSZ output..) and how is the data progressing through the ISP (i.e. video port or buffered)? When you say "jerky video" is this off-line playback, or real-time Preview from the ISP?

    Thanks

  • Hi Tiemen,

    I will answer your questions out of order:

    2.  I don't believe that it was ever working 100% perfectly.  When we thought it was working well, I knew it was dropping frames here and there when a field was occurring out of sequence, but I didn't realize the severity of the problem.  I have some code in the interrupt handler to check that the field that triggered the interrupt is a different field than the last field that triggered the interrupt.  They should come in order, field A, field B, field A, field B, and so on.  Sometimes they come like field A, field B, field B, field A, field B, field A, and so on.  When two field A's or two field B's come in back to back, the interrupt is ignored, a warning message is printed, and the frame is dropped.  That's occurred since the beginning when we got the TVP5150 driver working with the ISP.

    1.  It takes about 30 microseconds to process the first field of a frame, and <1 millisecond to process the second field of the frame.  I determined these timings by enabling CONFIG_PRINTK_TIME in the kernel, adding printk statements to the beginning and end of the handle_level_irq() function in kernel/irq/chip.c, and then running the capture + DSP H.264 encode pipeline with gstreamer:

    gst-launch -v v4l2src always-copy=FALSE num-buffers=900 queue-size=4 ! video/x-raw-yuv,height=480,width=720 ! queue ! TIVidenc1 codecName=h264enc engineName=codecServer byteStream=FALSE numInputBufs=7 numOutputBufs=5 ! mp4mux ! queue ! filesink location=/home/root/sample.mp4

    The timing data shows that when fields are captured properly, the time between 2 successive VD0 interrupts is around 16.67 milliseconds, pretty much exactly what we'd expect for NTSC video.  But when it misses interrupts and therefore misses fields, the time between 2 successive VD0 interrupts is around 33.34 milliseconds, pretty much exactly what we'd expect if a field were skipped or didn't generate an interrupt.  The timing data also shows that the interrupt service routine is not being re-entered and is not handling another interrupt at the time a missed interrupt should occur, so IRQs are not overlapping.  I don't think I have any way to determine the latency between the interrupt and its handling, since the VD0 interrupt is internally generated.

    3.  We are not using the preview engine or the resizer at all, only the CCDC and DSP.  I believe the gstreamer pipeline creates up to 4 mmap'd buffers to pass into the v4l2 driver.  The CCDC is writing directly into the memory buffers, as far as I know.  CPU usage is about 60-80% during capture.  The jerky video occurs when I remove the part of the ISR that ignores a field if it comes out of order & processes it normally.  The video looks as if frames are written with fields from different frames, as you suggest, but also sometimes as if the field is not begun at the start of the frame, but before it.  I have a 32MB H.264/mp4 video that demonstrates it if you have a way for me to get it to you.

     

    Thanks,

    Dennis Estenson

  • Dennis, good food for thought.. thanks.

    From (1), it seems that you are not having a problem of stacking IRQ on top of IRQ and missing one that way, so we can rule that out. Next would be to determine whether the frame drop is caused by the 5150 not providing the field, or by the OMAP not seeing it. There are two ways I can think of to check:

    1. Monitor the 5150 output and see if the syncs are present - of course you cannot see SAV/EAV on the scope, but if you have a logic or protocol analyzer you could look for the pattern, assuming you have access to the data lines. Another thing to look at is how noisy the lines are, SAV/EAV could be misread if there is too much noise on the line.

    2. Check the CCDC to see if it gets the H and VSYNCs it needs to count down the frame (from the SAV info). The way the CCDC works, is that it will count out the number of pixels per line and number of lines per frame to generate the EOF interrupt to the processor starting from the VSYNC. So one thing you could do, is read out the status of the CCDC around the 17 ms timeframe after the previous frame interrupt and see what the counts and sync status tell you. This should tell you if the CCDC is actively receiving a new frame or if it is sitting waiting for the next frame to come in.

    3. Can you check to see what VD1 is doing? I know you are using VD0 as the interrupt, just want to see how VD1 behaves.

    Lastly, the newer versions of the SDK have a TI version of the 5150 driver in it, you may want to check that out.

     

  • Hi Tiemen,

    There is a new data point I discovered late on Friday that will likely take this discussion in a whole new direction.  It appears that the field/frame dropping problem goes away entirely when I record the file to a RAM disk or /dev/null rather than any of: an NFS folder, a USB thumb drive, or onto the on-chip NAND flash.  I would've liked to test it with an SD card as well, since that's where files will be recorded in our product, but there is an unrelated problem with our SD driver right now (the card detect interrupt is "disabled" in software, but we get the hardware interrupt) preventing me from successfully doing that now.

    I also checked the CPU usage as reported by dmaiperf while doing the capture/encode.  ARM load is between 60-65% and DSP load is 85-90% when recording to /dev/null; ARM load is between 65-70% when recording to the /tmp filesystem; and ARM load is between 75-80% when recording to NFS while it's capturing normally, with dmaiperf reporting 75-115% ARM usage when the fields get out of sync.  Clearly the system is being starved for cycles when the interrupts are missed and fields and frames get dropped.  60-65% ARM usage during capture and encode seems high when the DSP should be doing all the work, and a DSP load of 85-90% also seems high.  Do you know when the new GStreamer DMAI plugin that uses DMAI-allocated buffers to reduce the number of memcpys required during recording will be available?  Can you think of any other ways to reduce the load on the ARM & the DSP when recording?  Do 65% and 90% seem normal or high to you?

     

    Thanks,

    Dennis

  • Hi Dennis,

     

    This is quite interesting; I'd like to understand a bit more about your use-case. I am looking for something like:

        - How are the buffers/memory shared between the different components, like the ISP capture driver, the DSP, etc.?

        - Have you made any code changes in the ISP, especially in the way buffers are being mapped or managed? It would be helpful if you could share the changes you made in the ISP code-base.

        - When you say you are hitting 60-65% CPU consumption, that's too high for me, even if you are getting proper frames. Please refer to the datasheet figures we publish for a plain vanilla application on top of the ISP capture driver in USERPTR mode:

    http://processors.wiki.ti.com/index.php/AM35x-OMAP35x-PSP_03.00.01.06_Feature_Performance_Guide#Performance_and_Benchmarks_4

    Are there any memory-intensive operations being carried out? Are you doing any cache-related operations on the buffers?

    Thanks,

    Vaibhav

  • Hi Vaibhav,

    Thanks for your interest.  I am using the following kernel command line:

    console=ttyS0,115200n8 noinitrd rw eth=c0:a0:b0:b0:e0:de ip=192.168.0.127 root=/dev/nfs nfsroot=192.168.0.215:/opt/omap-rfs,nolock mem=99M mpurate=600 omapfb.rotate=1 omapfb.rotate_type=1 omap_vout.vid1_static_vrfb_alloc=y

     

    My loadmodules script contains:

    insmod cmemk.ko phys_start=0x86300000 phys_end=0x87300000 pools=1x5250000,1x1429440,5x1048576,1x256000,4x829440,8x131072,20x4096

     

    I am using revision 652 of gstreamer + TI's DMAI plugin from https://gstreamer.ti.com/svn/gstreamer_ti/trunk/gstreamer_ti.  To compile the kernel, apps and libraries, I'm using Code Sourcery's G++ Lite 2010q1-202 (gcc ver. 4.4.1).  The command line for gstreamer I'm using is:

     

    gst-launch --gst-debug=TIVid*:3 v4l2src always-copy=FALSE num-buffers=300 queue-size=4 ! video/x-raw-yuv,height=480,width=720 ! queue ! TIVidenc1 codecName=h264enc engineName=codecServer byteStream=FALSE numInputBufs=7 numOutputBufs=5 contiguousInputFrame=FALSE ! dmaiperf engine-name=codecServer print-arm-load=true ! mp4mux ! queue ! filesink location=/dev/null

    with /dev/null changed to /tmp/filename.mp4 or /home/root/filename.mp4. Removing the 2 queue plugins from the pipeline does not significantly impact cpu usage (it might reduce it by 1% or so).

     

    The kernel we're using is a customized kernel branched from OMAPPSP_03.00.01.06 at http://arago-project.org.

    I don't know if I'm authorized at this time to release the code for the exact changes I've made to the ISP driver to make it work, but I can tell you that I didn't change anything about the way buffers are used.  I only changed the ISR to ignore the fact that HS_VS interrupts are not occurring, to use VD0 interrupts instead, and to add a printk("OMAP-ISP Warning: fields out of sync\n") when fld_stat == isp->current_field, to alert me when fields have been missed.

     

    At the time of benchmarking, the system is idle with only background processes running, and gst-launch in the foreground.

     

    Thanks,

    Dennis

  •  

    Why would you not get HS_VS interrupts when using the tvp5150 instead of the tvp5146?  Aren't they both used in bt.656 mode, and don't you still get HS_VS interrupts from the embedded sync when using the parallel interface with embedded sync?

    I expected that switching to the tvp5150 wouldn't require any changes in the ISP code. Is that not the case?

    Thanks,

    Chris

     

  • Hi Chris,

    I think you are misunderstanding my last post; let me clarify one more time.

    From the ISP-CCDC point of view it really doesn't matter what is interfaced externally, whether it is the TVP5150 or the TVP5146, since the incoming stream is BT656 compliant.

    The issue I am talking about here is that the ISP-CCDC is not able to generate HS/VS interrupts for BT656 streams (for both the TVP5150 & TVP5146). I do not have any clue right now whether this is expected behavior or a hardware erratum. But as per the TRM, the input sampler block should decode the SAV/EAV sync signals and generate HS/VS interrupts to the CCDC, which is not happening.

    Thanks,

    Vaibhav

  •  

    Vaibhav,

    Yes, after experiments, I do see that the embedded sync of bt.656 does not generate the hs & vs interrupts I expected.  This isn't good, since the isr code in isp.c relies on getting these to capture anything.  Would TI be able to provide a modified isp.c which doesn't require hs_vs interrupts to work?  I tried setting wait_hs_vs to zero and manually flipping isp->current_field on every vd0 interrupt, but it still didn't work.  It seemed to call isp_buf_process() as expected, but then it would always get an error that it timed out waiting for the ccdc to go idle.

    thanks,

    chris

     

  • Chris,

    I've also seen this behavior.  I increased the timeout you mention and that prevents the error message, but it doesn't fix the error condition, and may be related to the problem I've been seeing.

    Vaibhav,

    My supervisor's putting pressure on me to get the field dropping problem resolved this week.  Do you have any other recommendations for me?  I've found that the saMmapLoopback example application takes about 10-15% of CPU for 30fps capture and display, so that seems to provide a baseline for performance on our hardware.  This number seems awfully high when compared to the benchmarking data you provided, but we could live with it.  I've found, additionally, when I enable the composite video out instead of DVI, the calculated framerate drops to 25fps and a large number of fields are missed.

    Could there be something with our hardware or software setup at a very low level, like DDR timings, or cache mis-calibration, or something else systematic that would cause the CPU to use so many cycles doing something fairly simple?

    Thanks,

    Dennis

  •  

    ah!  ok.  You increased the timeout from the 10 msec default. 

    I found that after I get the VD0 interrupt, it takes around 14.9 msec before the CCDC busy flag clears.  At 60hz, this is basically a whole field time!  During that 15 msec, the code is probably spinning in the busy loop waiting for busy flag to clear.  Maybe this is what causes high CPU utilization on the ARM side?

    Very shortly after the busy bit clears (like 1.8 msec later), I get the next VD0 interrupt.  I'm still trying to figure it out, but it's almost like I'm getting VD0 interrupt at the beginning of the field instead of the end of the field.  Not sure if this could cause you dropped frames?

    Chris

     

     

     

  • Hi Dennis,

    Let me break your question down:

     

    >>>My supervisor's putting pressure on me to get the field dropping problem resolved this week.  Do you have any other recommendations for me?

    [Vaibhav] We have to change our ISR routine to stop using the HS/VS interrupts and use only the VD0 & VD1 interrupts; the whole buffer-processing logic will then depend on VD0 and VD1.

     

    >>> I've found that the saMmapLoopback example application takes about 10-15% of CPU for 30fps capture and display, so that seems to provide a baseline for performance on

    >>> our hardware.  This number seems awfully high when compared to the benchmarking data you provided, but we could live with it.

    [Vaibhav] The benchmarking data I shared was captured with a different application (saUserPtrLoopback.c). saMmapLoopback.c does a memcpy from the capture buffer to the display buffer, and that's where you are seeing the high CPU consumption. Use the USERPTR mode of operation to avoid the memcpy and share the same buffer between capture and display.

    Please refer to saUserPtrLoopback.c file for reference.

    >>>I've found, additionally, when I enable the composite video out instead of DVI, the calculated framerate drops to 25fps and a large number of fields are missed.

    [Vaibhav] This is not frame drop; your composite out (TV out) standard must be PAL, which gives 25 FPS. This is expected behavior. Please keep both standards (incoming & outgoing streams) the same, either NTSC or PAL.

     

    >>>Could there be something with our hardware or software setup at a very low level, like DDR timings, or cache mis-calibration, or something else systematic that

    >>>would cause the CPU to use so many cycles doing something fairly simple?

    [Vaibhav] I think I have answered all your questions above, and they describe more-or-less expected behavior, so I don't expect any cache/DDR-related issues. Rather, I would say we currently do not have any data points that would lead to that conclusion.

     

    Thanks,

    Vaibhav


  • Vaibhav Hiremath said:

    [Vaibhav] We have to change our ISR routine to stop using the HS/VS interrupts and use only the VD0 & VD1 interrupts; the whole buffer-processing logic will then depend on VD0 and VD1.

    I've done the same since the beginning.  What do you use VD1 for? I only use VD0.

    Vaibhav Hiremath said:
    [Vaibhav] The benchmarking data I shared was captured with a different application (saUserPtrLoopback.c). saMmapLoopback.c does a memcpy from the capture buffer to the display buffer, and that's where you are seeing the high CPU consumption. Use the USERPTR mode of operation to avoid the memcpy and share the same buffer between capture and display.

    That makes sense.

    Vaibhav Hiremath said:
    >>>I've found, additionally, when I enable the composite video out instead of DVI, the calculated framerate drops to 25fps and a large number of fields are missed.

    [Vaibhav] This is not frame drop; your composite out (TV out) standard must be PAL, which gives 25 FPS. This is expected behavior. Please keep both standards (incoming & outgoing streams) the same, either NTSC or PAL.

    This actually is frame/field drop.  I have my ISR print a message when the number of the field received (0 or 1) is not the one expected, and I get a number of them, perhaps 5 per second, corresponding to a 5 fps reduction in framerate.  That is why I brought up this behavior in the first place: it manifests in the ISR in the same way as the original problem I need to fix.  What would cause my TV-out to be set to PAL?  It's not PAL by default, is it?  I haven't looked at that as a possibility.

    Thanks

    Dennis

  • Chris,

    I also noticed such behavior: the CCDC busy flag stays set for a very long time, and it's almost as if the VD0 interrupt occurs at the beginning of the frame instead of the end.  I wondered if the VD0 interrupt may occur at the end of the frame, but by the time the ISR runs and the software begins processing it, the CCDC may already be receiving the next frame.  There seems to be no way to determine the latency of this interrupt; however, high latency would seem to be a very good candidate for the cause of the problems we're seeing.

    Dennis

  • Dennis,

    I tested a new daughter card which has the 5150 including VS and HS signals...  During normal operation, it doesn't get any "ccdc idle" timeouts.  I say during normal operation because I think I may have some hardware issues causing glitches in the incoming data which causes timeouts and errors in the video.  I haven't checked to see if there are any dropped fields.

    Which level shifter did you end up using?  Did you put terminating resistors between the level shifter and the overo?  If so, what value?  Right now we're using a 74AVC164245 with 100 ohm resistors inline on the overo side, and 33 ohm on the side going towards the tvp5150.  Originally we tried 33 ohm on both sides without good results, and I saw the pixhawk board used 100 ohm, so we're using that now.

    -chris

     

  • For anyone who might be interested in seeing the video that results from our problem, I've attached an H.264 encoded MP4 file which demonstrates the problem.

    4544.z_av_sample0.mp4

  • Hi Vaibhav,

     

    Could you explain this problem with ISP-CCDC and bt656 a bit further?

    We've built a board using the tvp5151 and omap3530.


    I'm using AM35x-OMAP35x-PSP-SDK-03.00.00.05, and have ported the tvp514x-int.c for the 5151, and I'm seeing a very similar issue.

    I expected it to follow the EVM closely, but that does not seem to be the case.  It appears that the EVM is using SYNC format from the tvp5146.  As the 3530 Torpedo does not connect CAM_FLD, I don't believe I can do this.  If that's correct, and the OMAP cannot accept bt.656, I've got a problem...

    Did you work out the issue with generating the HS/VS interrupts for BT656 streams?

     

    Thanks,

    Joel.

     

  • Hi Joel,

    Let me summarize my previous findings (which have been confirmed by a hardware expert) here once again.

    First of all,

    OMAP3 does support the BT656 interface and it does decode the incoming SAV and EAV bit fields without any issues; this has been validated with all PSP releases. On the EVM, all sync signals (HS, VS and FLD) are interfaced to the OMAP3 CCDC, so all along until now we have been configuring the TVP5146 to generate these signals along with the BT656 stream, and the ISP ISR routine is also implemented considering these 4 interrupts (HS, VS, VD0 and VD1).

    Now the issue/finding here is: if you don't connect the HS and VS signals to the OMAP3 CCDC, the input formatter won't generate HS & VS interrupts. It has been confirmed by the hardware team that the input formatter generates these interrupts only from external HS/VS signals. I have already raised the issue to have this important information added to the TRM, which has been accepted and will likely appear in the next version.

    What that means is that the ISP ISR has to be changed to deal only with the VD0 and VD1 interrupts.

     

    I hope the above information clarifies your doubts; please let me know if you have any questions.

    Thanks,

    Vaibhav

     

  • Hello, we are working on the beagleboard-xm platform and are using the TVP5151. Why did you have to modify the 5146 driver rather than a 5150 driver? Does the ISP only support the 5146? We also want to capture video through the V4L2 interface and encode H.264 using the DSP. How did you do it? Could you share your source? Thank you very much!