DM816x OpenMAX SDK 5.0.0.8 + 512MBytes DDR3

The DM816x EVM uses 1GByte of DDR3 memory.

I am trying to get the DM816x OpenMAX SDK 5.0.0.8 running in a 512MByte memory footprint, using the Linux A8 host to run my decode-from-file application, which works well with 1GByte of DDR3 memory.

So far I have done these steps.

1) Changed platformMem to 512M in packages/ti/omx/build/config.bld and packages/ti/omx/build/config_ca8.bld.

2) Hard-coded disableTiler to TRUE in packages/ti/omx/omxutils/ducatiTilerMemMgr/DucatiTilerMemoryMgr.xdc to get around a build issue.

3) Modified omx/build/ducatiplatform_512MB.xs and omx/demos/vc3/src/app_cfg.h for the 512MB memory map. The resulting memory map:

         name            origin    length     
----------------------  --------  --------- 

  VECS_CORE_0           00000000   00000400 
  VECS_CORE_1           00000400   00000c00 
  L2_SRAM               00001000   0003f000 
  OCMC_RAM0             40300000   00040000 
  OCMC_RAM1             40400000   00040000
  CONFIG_REGISTER_SPACE 41000000   1f000000 
  TILER_SYSTEM_SPACE    60000000   20000000 
  LINUX_PHYSICAL_ADDRES 80000000   04000000 
  SHARED_CTRL           84000000   01000000 
  EXTMEM_CORE0          85000000   00200000 
  EXTMEM_CORE1          85200000   00200000 
  EVENT_LIST_CORE0      85400000   00a00000 
  EVENT_LIST_CORE1      85e00000   00a00000 
  PRIVATE_CORE0_DATA    86800000   07800000 
  PRIVATE_CORE1_DATA    8e000000   07000000 
  SHARED_CTRL_DUCATI_ON 95000000   00b00000 
  SHARED_DATA           95b00000   00100000 
  HDVPSS_DESCRIPTOR_NON b5c00000   00200000 
  FBDEV_V4L2_Mem        b5e00000   00200000 
  FQMEM_BUFFERS_NON_CAC b6000000   0a000000 

Note that the uncached mappings are still set up in the 0xBxxx.xxxx space, because the Ducati AMMU uses large pages (512MB) to map 0x8000.0000-0x9fff.ffff cached and 0xa000.0000-0xbfff.ffff uncached.

The DDR3 controller will alias these addresses down to 0x9xxx.xxxx, so it will work OK.
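As a sanity check on the map above, here is a minimal sketch in C that folds the uncached 0xBxxx.xxxx addresses back onto the populated 512MB and verifies that the DDR regions are back-to-back and end exactly at 0xA000.0000. It assumes the aliasing behaves as a simple wrap at 512MB, as described above; the names `fold_to_physical`, `check_map` and `ddr_map` are made up for this illustration and are not part of the SDK. The region names are copied from the (truncated) map listing.

```c
#include <stddef.h>
#include <stdint.h>

#define DDR_BASE 0x80000000u
#define DDR_SIZE 0x20000000u /* 512MB populated (assumption from this thread) */

/* Fold an address onto the populated DDR, assuming the controller simply
 * wraps every 512MB (so 0xBxxx.xxxx lands on 0x9xxx.xxxx). */
static uint32_t fold_to_physical(uint32_t addr)
{
    return DDR_BASE + ((addr - DDR_BASE) % DDR_SIZE);
}

struct region {
    const char *name;
    uint32_t origin;
    uint32_t length;
};

/* DDR entries copied from the memory map above, in address order. */
static const struct region ddr_map[] = {
    { "LINUX_PHYSICAL_ADDRES", 0x80000000u, 0x04000000u },
    { "SHARED_CTRL",           0x84000000u, 0x01000000u },
    { "EXTMEM_CORE0",          0x85000000u, 0x00200000u },
    { "EXTMEM_CORE1",          0x85200000u, 0x00200000u },
    { "EVENT_LIST_CORE0",      0x85400000u, 0x00a00000u },
    { "EVENT_LIST_CORE1",      0x85e00000u, 0x00a00000u },
    { "PRIVATE_CORE0_DATA",    0x86800000u, 0x07800000u },
    { "PRIVATE_CORE1_DATA",    0x8e000000u, 0x07000000u },
    { "SHARED_CTRL_DUCATI_ON", 0x95000000u, 0x00b00000u },
    { "SHARED_DATA",           0x95b00000u, 0x00100000u },
    { "HDVPSS_DESCRIPTOR_NON", 0xb5c00000u, 0x00200000u },
    { "FBDEV_V4L2_Mem",        0xb5e00000u, 0x00200000u },
    { "FQMEM_BUFFERS_NON_CAC", 0xb6000000u, 0x0a000000u },
};

/* Return 0 if, after folding, every region starts where the previous one
 * ended and the last one ends exactly at DDR_BASE + DDR_SIZE; -1 otherwise. */
static int check_map(const struct region *map, size_t count)
{
    uint32_t expected = DDR_BASE;
    size_t i;

    for (i = 0; i < count; i++) {
        uint32_t start = fold_to_physical(map[i].origin);
        if (start != expected) {
            return -1; /* gap or overlap before map[i] */
        }
        expected = start + map[i].length;
    }
    return (expected == DDR_BASE + DDR_SIZE) ? 0 : -1;
}
```

Calling `check_map(ddr_map, sizeof ddr_map / sizeof ddr_map[0])` from a small host-side test program is enough to catch a gap or overlap after editing the .xs file by hand.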

4) Now I can build my decode-from-file application (known to work with the 1GByte configuration) and run it.

However, the decode process fills 5 buffers, then hangs.

I get a similar result if I run the 1GByte configuration with g_DisableTiler = TRUE.

What am I missing? Is there some other change needed to disable the tiler in syslink, the HDVPSS, or the OpenMAX stack?

 

 

  • Hi,

     

    I don't think this issue has anything to do with tiler memory allocation, as any wrong configuration there would have manifested as an allocation failure. Your memory map looks fine, except that you are allocating less memory for Linux than the recommended 128MB. But looking at the app you are trying to run, that should not have much impact.

    Since you are setting the platformMem variable to 512MB, the tiler would be disabled automatically (through disableTiler) in the script TilerCfg.cfg. Also, please make sure that you set the memory type for all involved OMX ports to default memory. Please look at the file ti\omx\demos\vs2\src\omx_prop_tunnel_test.c to see how to use the index ‘OMX_TI_IndexParamBuffMemType’ for this.

    As you are seeing this at runtime, let me ask you the following questions:

    1. Are you enabling 'APP_TYPE = vs' in the app_properties.mk for your app?

    2. If so, how many buffers are available to the decoder at the input port or the output port?

    An indicator for this would be the value of nBufferCountActual for the output port connected to the decoder input port, or the nBufferCountActual for the decoder output port.

    If it’s greater than 4 for either of them, then I can see that there would be a problem. The input/output OMX buffer headers are taken from global arrays whose length is 4.

    If, say, the number of headers is set to 5 (from the app), then there would be memory corruption.

    Can you try reducing these to 4 to confirm whether this is the issue?
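
To illustrate the failure mode described here, a minimal sketch in C of the kind of guard that avoids it. Everything in it is hypothetical: `MAX_BUF_HEADERS`, `struct buf_header`, `check_buffer_count` and `header_slot` are stand-ins for illustration, not names from the OMX SDK, and the real buffer header has many more fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit standing in for the length-4 global arrays. */
#define MAX_BUF_HEADERS 4u

/* Simplified, hypothetical buffer header. */
struct buf_header {
    void  *data;
    size_t size;
};

static struct buf_header g_headers[MAX_BUF_HEADERS];

/* Return 0 if the requested buffer count fits the fixed array, -1 otherwise.
 * An application could run this on its intended nBufferCountActual value. */
static int check_buffer_count(uint32_t requested)
{
    return (requested >= 1u && requested <= MAX_BUF_HEADERS) ? 0 : -1;
}

/* Return the header slot for idx, or NULL rather than a pointer past the
 * end of the array (which is where the corruption would come from). */
static struct buf_header *header_slot(uint32_t idx)
{
    return (idx < MAX_BUF_HEADERS) ? &g_headers[idx] : NULL;
}
```

With a check like this, a count of 5 is rejected up front instead of silently writing a fifth header past the end of a four-entry array.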

    Regards,

    Archith

     

  • In addition to Archith's points, can you confirm that you have set mem=64M in the kernel bootargs, since you have reduced the Linux physical memory? Also, please share any changes you have made related to DMM interleaving.

    The changes you have made are correct. To help resolve the issue, could you share the modified files related to the memory map change? We can then recreate the problem locally and debug the issue.

  • Archith,

    > 1. Are you enabling 'APP_TYPE = vs' in the app_properties.mk for your app?

    No, I used the vc3 demo as my starting point, and used its app_properties.mk unchanged:

    # Max resolution supported in this app
    MAX_RESOLUTION = hd

    # IPC Modes:
    #     local: only this core
    #     remote: only intra-ducati (between two M3 cores)
    #     remoteWithHOST: ducati cores and host (A8)
    IPC_MODE = remoteWithHOST

     

    > 2. If so, how many buffers are available to the decoder at the input port or the output port?
    >
    > An indicator for this would be the value of nBufferCountActual for the output port connected to the decoder input port, or the nBufferCountActual for the decoder output port.
    >
    > If it’s greater than 4 for either of them, then I can see that there would be a problem. The input/output OMX buffer headers are taken from global arrays whose length is 4.
    >
    > If, say, the number of headers is set to 5 (from the app), then there would be memory corruption.
    >
    > Can you try reducing these to 4 to confirm whether this is the issue?

    I was already setting the number of VIDDEC input buffers to 4:

            eError = omx_ilclient_utl_test_set_port_params_numbufs (pContext->pnDecHandle[0], 
                                                                    OMX_VIDDEC_INPUT_PORT,
                                                                    4);

    If I take this out, the default number of buffers seems to be 3, but it still hangs.

     

    By attaching Code Composer to the VPSS M3, I was able to get some more interesting debug information on the problem.

    The VPSS M3 is hitting an assert in packages/ti/omx/comp/vfpc/src/omx_vfpc_DEI_if.c, line 192:

        format = &drvState->deiConfig.chParams[chNum].inFmt;
        pitch = pOmxBufHdr->nFramePitch;
        height = pOmxBufHdr->nFrameHeight;
        width = pOmxBufHdr->nFrameWidth;

        Assert_isTrue ( ( pitch == format->pitch[FVID2_YUV_INT_ADDR_IDX] ),
                        Assert_E_assertFailed );

    At this moment, pitch (taken from pOmxBufHdr->nFramePitch) is 2048, and format->pitch[FVID2_YUV_INT_ADDR_IDX] is 1920.

     

    I think the padding is different for tiled vs. raw, maybe it is not taken into account somewhere?

     

    John

  • John,

    You are right. When using tiled memory, the pitch is always 16384 for the 8-bit container. The decoder pads the output buffer, and because of this 1920 becomes 2048. This is not a problem when using the tiler, because the pitch is always fixed. To confirm whether this is the issue, can you try the following change in the IL client (omx_prop_tunnel_test.c in the vc3/src folder):

          vfpcCfg.nFramePitchInput  = (((FRAME_WIDTH + 128) + 127)/ 128) * 128;

    This should be done before the call to omx_ilclient_utl_vfpc_create_config for the DEI connected to the decoder output. For example, if DEI 0 is connected to the decoder output, the code should be as below:

     

        if (0 == i) {
          vfpcCfg.nFrameWidthInput =  FRAME_WIDTH;
          vfpcCfg.nFrameHeightInput = FRAME_HEIGHT;
          vfpcCfg.nFrmStartX = 36;
          vfpcCfg.nFrmStartY = 24;
          vfpcCfg.nFrmCropWidth = vfpcCfg.nFrameWidthInput - vfpcCfg.nFrmStartX;
          vfpcCfg.nFrmCropHeight = vfpcCfg.nFrameHeightInput - vfpcCfg.nFrmStartY;
          vfpcCfg.nFrameWidthOutput0  = g_DisplayWindowWidth;
          vfpcCfg.nFrameHeightOutput0 = g_DisplayWindowHeight;
          vfpcCfg.nFramePitchOutput1  = (FRAME_WIDTH * 2);
          vfpcCfg.bAlgBypass = OMX_TRUE;
          vfpcCfg.nFramePitchInput  = (((FRAME_WIDTH + 128) + 127)/ 128) * 128;
        } else {
          vfpcCfg.nFrameWidthInput =  g_CaptureWidth;
          vfpcCfg.nFrameHeightInput = g_CaptureHeight;
          vfpcCfg.nFrmStartX = 0;
          vfpcCfg.nFrmStartY = 0;
          vfpcCfg.nFrmCropWidth = vfpcCfg.nFrameWidthInput - vfpcCfg.nFrmStartX;
          vfpcCfg.nFrmCropHeight = vfpcCfg.nFrameHeightInput - vfpcCfg.nFrmStartY;
          vfpcCfg.nFrameWidthOutput0  = g_DisplayWindowWidth1;
          vfpcCfg.nFrameHeightOutput0 = g_DisplayWindowHeight1;
          vfpcCfg.nFramePitchOutput1  = FRAME_WIDTH;
    #if VIP1080PCAPTURE
          vfpcCfg.bAlgBypass = OMX_TRUE;
    #else
          vfpcCfg.bAlgBypass = OMX_FALSE;
    #endif
          vfpcCfg.nFramePitchInput  = FRAME_WIDTH;
        }
        vfpcCfg.nFramePitchOutput0  = g_DisplayWindowPitch;
        vfpcCfg.nFrameWidthOutput1  = FRAME_WIDTH;
        vfpcCfg.nFrameHeightOutput1 = FRAME_HEIGHT;
       
        eError = omx_ilclient_utl_vfpc_create_config (&(pContext->hVFPCDE[i]),
                                                        pContext,
                                                        &(pContext->oCallbacks),
                                                        szCompName,
                                                        &(pContext->hvfpcDEStateChangeEvent[i]),
                                                        &vfpcCfg);
        OMX_TEST_BAIL_IF_ERROR ( eError );

  • Yes, I have mem=64M in the kernel bootargs.

    I kept the default DMM interleaving in U-Boot, even though the sizes are set for 1GByte of DDR3.

            /* Program the DMM to for interleaved configuration */
            __raw_writel(0x80640300, DMM_LISA_MAP__0);
            __raw_writel(0xC0640320, DMM_LISA_MAP__1);
            __raw_writel(0x80640300, DMM_LISA_MAP__2);
            __raw_writel(0xC0640320, DMM_LISA_MAP__3);

    I figured that if we don't access the upper memory, it doesn't matter.

    It seems that my problem happens when disabling the tiler, regardless of 1GByte or 512MByte of physical DDR3.

  • It seems I already had the same code

          vfpcCfg.nFramePitchInput  = (((FRAME_WIDTH + 128) + 127)/ 128) * 128;

    which results in a value of 2048 for the pitch.

    Since you indicated that 2048 is the correct buffer pitch, I looked at format->pitch[FVID2_YUV_INT_ADDR_IDX], which was being set in _OMX_VFPCDeiDualOutCreate() to

         VpsUtils_align ( OMX_VFPC_DEFAULT_INPUT_FRAME_WIDTH, 16 );

    which works out to 1920.
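
The two numbers can be reproduced with a few lines of C. This is only a sketch of the arithmetic: `align_up`, `decoder_padded_pitch` and `dei_default_pitch` are names made up here, and it assumes that VpsUtils_align rounds up to the next multiple of its second argument and that OMX_VFPC_DEFAULT_INPUT_FRAME_WIDTH is 1920, which is consistent with the values seen in the debugger.

```c
#include <stdint.h>

/* Round value up to the next multiple of align (align must be non-zero). */
static uint32_t align_up(uint32_t value, uint32_t align)
{
    return ((value + align - 1u) / align) * align;
}

/* Pitch as computed in the IL client: width plus 128 bytes of padding,
 * rounded up to a multiple of 128. */
static uint32_t decoder_padded_pitch(uint32_t frame_width)
{
    return align_up(frame_width + 128u, 128u);
}

/* Pitch the DEI ends up with when it aligns the default frame width to 16
 * (assumed behaviour of VpsUtils_align). */
static uint32_t dei_default_pitch(uint32_t frame_width)
{
    return align_up(frame_width, 16u);
}
```

For a frame width of 1920 the first function gives 2048 and the second gives 1920, which is exactly the mismatch the assert trips on. Aligning the 2048 pitch itself to 16 leaves it at 2048.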

     

    I made a patch to use the user-specified pitch to set pitch[FVID2_YUV_INT_ADDR_IDX] as follows:

    --- a/packages/ti/omx/comp/vfpc/src/omx_vfpc_DEI_if.c
    +++ b/packages/ti/omx/comp/vfpc/src/omx_vfpc_DEI_if.c
    @@ -844,6 +844,7 @@ OMX_ERRORTYPE _OMX_VFPCDeiDualOutCreate ( OMX_PTR pDrvPrivObj,
       // deiConfig->inHeight = DEI_IN_HEIGHT;
       deiConfig->inWidth = drvState->DeiDynamicCfg[0].nInWidth;
       deiConfig->inHeight = drvState->DeiDynamicCfg[0].nInHeight;
    +  deiConfig->inPitch = drvState->DeiDynamicCfg[0].nInPitch;
       deiConfig->FrmStartX = drvState->DeiDynamicCfg[0].nFrmStartX;
       deiConfig->FrmStartY = drvState->DeiDynamicCfg[0].nFrmStartY;
       deiConfig->FrmCropWidth = drvState->DeiDynamicCfg[0].nFrmCropWidth;
    @@ -1299,9 +1300,9 @@ static Void vpsVfpcDeiDualSetDefaultChParams ( omxVfpcDeiDualConfig_t *
         }
         else {
           chParams->inFmt.pitch[FVID2_YUV_SP_Y_ADDR_IDX] =
    -          VpsUtils_align ( OMX_VFPC_DEFAULT_INPUT_FRAME_WIDTH, 16 );
    +          VpsUtils_align ( deiConfig->inPitch, 16 );
           chParams->inFmt.pitch[FVID2_YUV_SP_CBCR_ADDR_IDX] =
    -          VpsUtils_align ( OMX_VFPC_DEFAULT_INPUT_FRAME_WIDTH, 16 );
    +          VpsUtils_align ( deiConfig->inPitch, 16 );
         }
         if ( FVID2_SF_INTERLACED == deiConfig->inScanFmt ) {
           chParams->inFmt.fieldMerged[FVID2_YUV_SP_Y_ADDR_IDX] = TRUE;
    --- a/packages/ti/omx/comp/vfpc/src/omx_vfpc_DEI_if.h
    +++ b/packages/ti/omx/comp/vfpc/src/omx_vfpc_DEI_if.h
    @@ -188,6 +188,9 @@ extern "C"
         /**< Input height. */
         UInt32 inHeight;
     
    +    /**< Input pitch. */
    +    UInt32 inPitch;
    +
         /**< Frame start X offset */
         OMX_U32 FrmStartX;

     

    With this change, the decode now runs fine in the 512MByte configuration.

    Thanks for the pointers.