Other Parts Discussed in Thread: TVP7002
Hi everybody,
I'm working on a DM8168 kit with the EZSDK. I want to build an application that does video capture, image processing and video streaming. Could anyone give me some advice about designing the software architecture?
Requirements:
+ Capture from a TVP7002 (analog HD camera): 1280x720 frames
+ Run image processing on the DSP using some special algorithms
+ Stream the processed frames using GStreamer at 25 fps
Below is my current architecture:
+ On the ARM: 2 applications
Main app: the main application (captures video, communicates with the DSP core and with the streamer process)
LiveStreamer: runs a GStreamer pipeline fed by GstAppSrc
+ On the DSP core (DSP app): runs the image processing algorithms
About communication between the applications:
+ Main app and DSP app: SysLink Notify and a shared region.
I chose the shared region IPC_SR_COMMON (address = 0x9F700000, length = 2 MB). The main app captures a frame into the shared region and notifies the DSP app to run the algorithms; the DSP app runs them and notifies the main app when processing is finished.
+ Main app and LiveStreamer app: in LiveStreamer I memory-map the physical address of the shared region (0x9F700000) and read the frame data directly. LiveStreamer runs a pipeline and pushes the frame data in through GstAppSrc.
I run GStreamer in a separate process because I couldn't make the Qt main loop in the main application and GStreamer work together. It also lets me start and stop GStreamer easily without worrying about memory management.
Currently my system works, but with bad performance: the video streams at only 8 fps and the ARM load is 100%.
There are 3 main problems:
+ In LiveStreamer, GstAppSrc needs a newly created GstBuffer, so I copy the frame data:
GstBuffer *buffer = gst_buffer_new();
uint8_t *tempBuffer = (uint8_t *)malloc(app->length);
memcpy(tempBuffer, app->imgBuf, app->length); /* copy frame data out of the shared region */
GST_BUFFER_MALLOCDATA(buffer) = tempBuffer;   /* buffer now owns tempBuffer and will free it */
GST_BUFFER_SIZE(buffer) = app->length;
GST_BUFFER_DATA(buffer) = GST_BUFFER_MALLOCDATA(buffer);
GstFlowReturn ret = gst_app_src_push_buffer(app->appSrc, buffer);
I can't find a way to push the data without allocating a new buffer.
+ GstAppSrc reads data from the shared region and copies it into the new GstBuffer; memcpy on the ARM is very slow (about 100 ms to copy one 1280x720 frame).
+ The shared region is cached. If I run only the image processing and turn LiveStreamer off, the algorithms on the DSP run faster; if I enable LiveStreamer (with the memory mapping), the algorithms run 6-10 ms slower. Moreover, the IPC_SR_COMMON region is very small; how can I create a larger shared region for the Host, MC-HDVICP2 and MC-HDVPSS?
Could anyone give me some advice on solving these problems, or suggest a different software architecture for this system?