Hi all,
I am working with the TI8168 and EZSDK version 5.03. My code is based on the OMX capture_encode example. I need the A8 to perform some minor calculations on the HD buffer, but these calculations run at a frame rate of only 1 fps because shared region 2, where the video frames are allocated, is not cacheable by the A8. How can I make shared region 2 cacheable for the Cortex-A8?
Thanks, Gabi
Hi Gabi,
Have you tried digging into the OpenMAX code? If you follow the OMX_AllocateBuffer call in the user application, you see that this calls omxproxy_alloc_buffer which is defined in ti-ezsdk_dm816x-evm_5_04_00_11\component-sources\omx_05_02_00_38\src\ti\omx\domx\OmxProxy.c.
This calls OmxRpc_stubAllocBuffer (pComponentPrivate->rpcHandle, (OmxCore_OMX_BUFFERHEADERTYPE *) pBufferHeader, nPortIndex, pAppPrivate, nSizeBytes, (OmxTypes_OMX_U8 **) & pBuffer, (OmxTypes_OMX_U32 *) & pBufferMapped, (OmxCore_OMX_BUFFERHEADERTYPE **) & pBufHeaderRemote, (OmxCore_OMX_ERRORTYPE *) & eError);
Which calls:
OMXRPC_STUB_TEMPLATE (pRpcHndl, pRpcMsg, retVal, OmxRpc_cmdAllocBuffer, allocbuffer, (&pRpcMsg->api.allocBuffer, nPortIndex, pAppPrivate, nSizeBytes), (&pRpcMsg->api.allocBufferResponse, ppBufHeaderRemote, pBufferHdr, &nBufferMapped, *nCmdStatus), nCmdStatus);
omxrpc_msg_marshall_allocbuffer(&pRpcMsg->api.allocBuffer, nPortIndex, pAppPrivate, nSizeBytes)
&pRpcMsg->api.allocBuffer->nSizeBytes = nSizeBytes
pRpcMsg = &rcmMsg->data
OmxRpc_rcmExec ((RcmClient_Handle)pRpcHndl->client.handle, \ rcmMsg, \ fxnId);
This calls RcmClient_exec (rcmHndl, rcmMsg, &rcmMsg);
and then we get into the magical world of IPC. Presumably the media controller reserves the memory over IPC/Syslink. If you can find out how this memory is reserved you might be sorted, but this would involve tapping into the IPC messages from the media controller.
Ultimately, I think you'd have to create a mapping in Linux such that you can mark the reserved memory as cached. I think CMEM might be able to do this so it might be worth looking at its source code to see how you can mark Shared Region 2 as cached.
This probably doesn't help you much, but I thought some response was better than none... :-/
Ralph
Hi,
Buffers from SR2 can be made cacheable on the A8 by modifying board-support/media-controller-utils_2_05_00_17/src/firmware_loader/memsegdef_default.c:
/* Segment 3 */
{
1, /* valid */
"IPC_SR_FRAME_BUFFERS", /* name */
0x0BC00000, /* size */
LDR_SEGMENT_TYPE_DYNAMIC_SHARED_HEAP, /* seg_type */
0, /* flags */
0xB3D00000, /* system_addr */
0xB3D00000, /* slave_virtual_addr */
1, /* master_core_id */
(1 << LDR_CORE_ID_VM3) | (1 << LDR_CORE_ID_DM3) | (1 << LDR_CORE_ID_A8),
/* core_id_mask */
(1 << LDR_CORE_ID_VM3) | (1 << LDR_CORE_ID_DM3), /* cache_enable_mask */
You can change the cache enable mask to also enable caching on the A8: (1 << LDR_CORE_ID_VM3) | (1 << LDR_CORE_ID_DM3) | (1 << LDR_CORE_ID_A8),
After this you need to rebuild firmware_loader (make media-controller-utils). Install the rebuilt firmware loader in the /usr/bin folder of the target filesystem. This should enable caching of the buffers on the A8 side.
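To make the mask change concrete, here is a minimal C sketch of how the two masks differ. The numeric core IDs below are hypothetical placeholders chosen for illustration; the real values come from the firmware loader headers in media-controller-utils, and the helper function names are not part of that code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical core IDs, for illustration only. The real values are
   defined in the firmware loader sources. */
enum {
    LDR_CORE_ID_VM3 = 0,
    LDR_CORE_ID_DM3 = 1,
    LDR_CORE_ID_A8  = 2
};

/* cache_enable_mask as it appears in the segment entry above. */
uint32_t default_cache_enable_mask(void)
{
    return (1u << LDR_CORE_ID_VM3) | (1u << LDR_CORE_ID_DM3);
}

/* Same mask with the A8 bit added. */
uint32_t a8_cache_enable_mask(void)
{
    return default_cache_enable_mask() | (1u << LDR_CORE_ID_A8);
}

/* Returns 1 if the given core has its cache-enable bit set, else 0. */
int core_is_cached(uint32_t mask, unsigned core_id)
{
    assert(core_id < 32u);
    return (int)((mask >> core_id) & 1u);
}
```

The only difference between the two masks is the single A8 bit; the core_id_mask field in the segment entry already includes the A8 and does not need to change.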
Please note that DOMX does not perform cache invalidation on the A8 side, so once the cache is enabled the data may not look correct. DOMX needs to be modified to take care of cache invalidation.
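As a rough illustration of what such a modification has to deal with, cache maintenance works on whole cache lines, so the buffer range must be widened to line boundaries before it is invalidated. The sketch below only computes that range. It assumes a 64-byte line size (the Cortex-A8 line length), the type and function names are hypothetical, and the actual invalidate call is platform specific and deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on the Cortex-A8. */
#define CACHE_LINE_BYTES 64u

/* Hypothetical helper type: a line-aligned address range. */
typedef struct {
    uintptr_t start; /* rounded down to a cache line boundary */
    size_t    len;   /* multiple of CACHE_LINE_BYTES covering the buffer */
} cache_range;

/* Widen [addr, addr + len) so that it starts and ends on line boundaries. */
cache_range cache_align_range(uintptr_t addr, size_t len)
{
    const uintptr_t mask = (uintptr_t)CACHE_LINE_BYTES - 1u;
    cache_range r;
    uintptr_t end;

    assert(len > 0);
    end = (addr + len + mask) & ~mask; /* round the end up */
    r.start = addr & ~mask;            /* round the start down */
    r.len = (size_t)(end - r.start);
    return r;
}
```

Frame buffers that are themselves line-aligned and a multiple of the line size avoid sharing a line with unrelated data, which keeps the invalidation from discarding someone else's writes.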
Regards
Vimal
Hi Vimal,
Thank you very much for your answer. I am doing a loopback through an A8 buffer, based on the capture_encode example (cap->DEI->display). I am allocating a buffer in shared region 2 as explained in this thread:
http://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/717/p/203642/734670.aspx#734670
Now I copy from the DEI output buffer to the A8 buffer, and from the A8 buffer to the display input buffer. The copying is done using EDMA, so the frame rate remains 60 fps. When, instead of a plain loopback, I perform some minor processing on the video frame, the frame rate drops to 1 fps. After making shared region 2 cacheable for the A8 the way you described, the frame rate remains 1 fps. How can that be? Is this the only thing that needs to be done? Is there another way to speed up the A8?
Gabi,
Have you profiled the minor processing? It may not be able to give buffers to the next component on time. You can try providing more buffers, and do not operate on the buffer that is needed immediately; keep the pipeline running so that no component spends time waiting for a buffer. Maybe you are already taking care of that.
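One simple way to profile the processing step is to time it per frame with a monotonic clock. The sketch below is only an illustration: frame_fn, time_frame_ms and invert_bytes are hypothetical names, and invert_bytes stands in for whatever the real per-frame processing is.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical signature for the per-frame processing routine. */
typedef void (*frame_fn)(uint8_t *buf, size_t len);

/* Current monotonic time in milliseconds. */
static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1000.0 + (double)ts.tv_nsec / 1.0e6;
}

/* Run fn once on the buffer and return the elapsed time in milliseconds. */
double time_frame_ms(frame_fn fn, uint8_t *buf, size_t len)
{
    double t0;

    assert(fn != NULL && buf != NULL);
    t0 = now_ms();
    fn(buf, len);
    return now_ms() - t0;
}

/* Placeholder processing: invert every byte in place. */
void invert_bytes(uint8_t *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; ++i)
        buf[i] = (uint8_t)~buf[i];
}
```

Running the same routine once on a frame buffer from shared region 2 and once on an ordinary malloc'd buffer of the same size would show whether the time goes into the computation itself or into the memory accesses. At 60 fps the whole step has to fit in roughly 16.7 ms per frame.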
Thanks for your answer. I am using 5 buffers (the same as in all the OMX examples). I have removed the polling for EDMA transfer completion before using the buffers copied by EDMA, at the risk of corrupted video frames, so nothing stalls the pipeline, but I still get 1 fps. If, instead of processing 1080 rows and (1920<<1) columns, I process 1080 rows and (200<<1) columns, the frame rate is 6 fps.
Gabi
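The two measurements above can be turned into bytes processed per second for comparison. This is a small sketch with hypothetical helper names, assuming the <<1 means two bytes per pixel, so that a row of 1920 pixels is 3840 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes touched per frame: rows x (pixels_per_row << 1).
   Assumption: the shift reflects two bytes per pixel. */
uint32_t bytes_per_frame(uint32_t rows, uint32_t pixels_per_row)
{
    assert(rows <= 4096u && pixels_per_row <= 4096u); /* no overflow */
    return rows * (pixels_per_row << 1);
}

/* Bytes processed per second at the measured frame rate. */
uint32_t bytes_per_second(uint32_t rows, uint32_t pixels_per_row,
                          uint32_t fps)
{
    return bytes_per_frame(rows, pixels_per_row) * fps;
}
```

With the figures from the post, the full frame is 1080 x 3840 = 4,147,200 bytes at 1 fps, and the reduced window is 1080 x 400 = 432,000 bytes at 6 fps, or 2,592,000 bytes per second. Both cases sit at a few megabytes per second, which is consistent with (though it does not prove) a per-byte memory access cost dominating, as would be expected if the A8 reads were still uncached.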