PROCESSOR-SDK-J721E: TDA4X UDMA transfer errors on C66x

Part Number: PROCESSOR-SDK-J721E

I have implemented a UDMA transfer application that does a DMA copy from DDR to DDR on TDA4X.

Currently I am trying to use ND copy and 2D copy to do a simple memory-copy transfer.

I used app_udma_test.c in vision_apps/utils/udma/ as the reference for the 2D and ND transfers. When the buffer size is below 260KB, the UDMA transfer succeeds, but when the size is above 260KB it fails.

For example, when copying a buffer of 512*1024 using UDMA, I get a MEMORY FAULT (Core Dumped) error. I have tried the same with 2D and ND transfers, but the result is the same.

QUESTIONS :

1) Is there any limitation of size in UDMA transfer?

2) Is there any document available that explains DMA transfers, size limitations, etc.?

3) If there is a size limitation, what would you suggest for doing a memcpy of large buffers (greater than the size limit)?

Can you please guide me on this?

Thank You.

Vipul Kulkarni.

  • Hi Vipul,

    1) Is there any limitation of size in UDMA transfer?

    No, 512KB is not a limit on the UDMA transfer size. In fact, there are drivers which use UDMA to transfer MBs of data. So there is probably something wrong in the way the UDMA is being configured. Can you share the details of the parameters?

    2) Is there any available document, which gives the explanation about DMA transfer, size limitations etc.

    Please refer to the TRM for this. 

    3) If there is a limitation of size then what you would suggest to do a memcpy of buffers with large size ( greater than size limitation )

    Again, what is the size of the buffer that you want to copy?

    Regards,

    Brijesh

  • Hello Brijesh,

    Thanks for the reply.

    Actually, I want to do a memcpy of a 1280*944 YUYV format image, and I am using the ND UDMA copy for this. The UDMA configuration is below. In this configuration I am transferring 1280*2*512 (just to check the size limit).

    static void memcpyC66(uint8_t *pOut, uint8_t *pIn, uint8_t *tmp_buf, int32_t width, int32_t height)
    {
        app_udma_copy_nd_prms_t prms_nd_in;
        app_udma_copy_nd_prms_t prms_nd_out;
        app_udma_ch_handle_t udmaChIn;
        app_udma_ch_handle_t udmaChOut;

        appUdmaCopyNDPrms_Init(&prms_nd_in);
        appUdmaCopyNDPrms_Init(&prms_nd_out);

        prms_nd_in.copy_mode = 2;
        prms_nd_in.src_addr = appMemGetVirt2PhyBufPtr((uint64_t) pIn, APP_MEM_HEAP_DDR);
        prms_nd_in.dest_addr = appMemGetVirt2PhyBufPtr((uint64_t) tmp_buf, APP_MEM_HEAP_DDR);
        prms_nd_in.icnt0 = width;
        prms_nd_in.icnt1 = height;
        prms_nd_in.icnt2 = 32;
        prms_nd_in.icnt3 = 1;
        prms_nd_in.dim1 = width*2;
        prms_nd_in.dim2 = (height * width*2);
        prms_nd_in.dim3 = (height * width*2 * 32);

        prms_nd_in.dicnt0 = prms_nd_in.icnt0;
        prms_nd_in.dicnt1 = prms_nd_in.icnt1;
        prms_nd_in.dicnt2 = 32; /* Ping-pong */
        prms_nd_in.dicnt3 = 1;
        prms_nd_in.ddim1 = width*2;
        prms_nd_in.ddim2 = (height * width*2);
        prms_nd_in.ddim3 = 0;

        prms_nd_out.copy_mode = 2;
        prms_nd_out.src_addr = appMemGetVirt2PhyBufPtr((uint64_t) tmp_buf, APP_MEM_HEAP_DDR);
        prms_nd_out.dest_addr = appMemGetVirt2PhyBufPtr((uint64_t) pOut, APP_MEM_HEAP_DDR);
        prms_nd_out.icnt0 = width;
        prms_nd_out.icnt1 = height;
        prms_nd_out.icnt2 = 32; /* Ping-pong */
        prms_nd_out.icnt3 = 1;
        prms_nd_out.dim1 = width*2;
        prms_nd_out.dim2 = (height * width*2);
        prms_nd_out.dim3 = 0;

        prms_nd_out.dicnt0 = prms_nd_out.icnt0;
        prms_nd_out.dicnt1 = prms_nd_out.icnt1;
        prms_nd_out.dicnt2 = 32;
        prms_nd_out.dicnt3 = 1;
        prms_nd_out.ddim1 = width*2;
        prms_nd_out.ddim2 = (height * width*2);
        prms_nd_out.ddim3 = (height * width*2 * 32);

        udmaChIn = appUdmaCopyNDGetHandle(0);
        udmaChOut = appUdmaCopyNDGetHandle(1);

        appUdmaCopyNDInit(udmaChIn, &prms_nd_in);
        appUdmaCopyNDInit(udmaChOut, &prms_nd_out);

        int i;
        for(i=0; i<32; i++)
        {
            appUdmaCopyNDTrigger(udmaChIn);
            appUdmaCopyNDWait(udmaChIn);
            appUdmaCopyNDTrigger(udmaChOut);
            appUdmaCopyNDWait(udmaChOut);
        }
        appUdmaCopyNDDeinit(udmaChIn);
        appUdmaCopyNDDeinit(udmaChOut);
    }

    With this configuration, I am currently copying 1280 * 512 * 2 bytes (just to check usage). Each transfer moves 1280*2*16 bytes and I run 32 iterations, which is why I set icnt2 to 32. Please take a look and let me know if anything is incorrect.

    Thanks,

    Vipul Kulkarni.
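
The sizes described in the post above can be sanity-checked with a little arithmetic. The following is a minimal sketch and not code from the thread: `nd_span_bytes` is a hypothetical helper, and it assumes that `icnt0` is a byte count and that `dim1` to `dim3` are byte strides, as is usual for strided DMA descriptors. It returns how many bytes one side of a 4-level strided transfer spans, which is the minimum size the corresponding buffer must have.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the app_udma API).
 * Returns the number of bytes spanned by one side (source or destination)
 * of a 4-level strided transfer.
 * Assumption: icnt0 is in bytes and dim1..dim3 are byte strides. */
static uint64_t nd_span_bytes(uint32_t icnt0, uint32_t icnt1,
                              uint32_t icnt2, uint32_t icnt3,
                              uint32_t dim1, uint32_t dim2, uint32_t dim3)
{
    assert(icnt0 > 0 && icnt1 > 0 && icnt2 > 0 && icnt3 > 0);

    /* Offset of the start of the very last row, plus the length of that row. */
    return (uint64_t)(icnt3 - 1u) * dim3
         + (uint64_t)(icnt2 - 1u) * dim2
         + (uint64_t)(icnt1 - 1u) * dim1
         + icnt0;
}
```

With `icnt0 = 1280*2`, `icnt1 = 16`, `icnt2 = 32`, `icnt3 = 1`, `dim1 = 1280*2` and `dim2 = 1280*2*16`, the span is 1,310,720 bytes, which equals the 1280*512*2 bytes being tested. Comparing this value against the allocated buffer size is a cheap check before triggering a transfer.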

  • Hi Vipul,

    But why do you want to copy a small block 32 times? Why can't you copy the entire 1280*944*2 bytes in one shot using the 2D copy function?

    Regards,

    Brijesh

  • Hi Brijesh,

    Actually, I first tried a normal 2D copy of 1280*944*2, but it gave a memory fault (core dumped) error and dumped only 260 KB of the buffer. After that I tried the ND copy.

    Can you tell me what might be causing the memory fault with the 2D copy?

    Thanks,

    Vipul Kulkarni

  • Hi Brijesh,

    This is the configuration for the 2D transfer:

    appUdmaCopy2DPrms_Init(&prms_2d);
    prms_2d.width = width; //1280
    prms_2d.height = height; //944
    prms_2d.dest_pitch = width*2; //1280*2
    prms_2d.src_pitch = width*2; //1280*2
    prms_2d.dest_addr = appMemGetVirt2PhyBufPtr((uint64_t) pOut, APP_MEM_HEAP_DDR);
    prms_2d.src_addr = appMemGetVirt2PhyBufPtr((uint64_t) pIn, APP_MEM_HEAP_DDR);
    appUdmaCopy2D(NULL, &prms_2d, 1);

    With this configuration, transfers of images smaller than 260 KB completed without error, but above that size I got a memory fault error.

    Can you please help with this issue?

    Thanks

  • Hi Vipul,

    I see you are setting the pitch to width x 2, whereas the width field is set to just width. For a DMA transfer, the width must be given in bytes.

    I am not exactly sure if this is the reason for the memory fault, but it's definitely not correct.

    Are your buffer addresses, pIn and pOut, correct and well separated? Can you check them?

    Regards,

    Brijesh 
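
To illustrate the point about width being in bytes, here is a minimal CPU-side sketch. It is not the app_udma implementation: `copy_2d_ref` and `yuyv_row_bytes` are hypothetical names, and the only assumption is that YUYV is a packed format with 2 bytes per pixel. It shows that passing the pixel width where a byte width is expected copies only the first half of every row.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bytes per row of a packed YUYV (YUV 4:2:2) image: 2 bytes per pixel. */
static uint32_t yuyv_row_bytes(uint32_t width_pixels)
{
    return width_pixels * 2u;
}

/* Hypothetical CPU reference for a 2D copy (not the UDMA driver code).
 * Copies `height` rows of `width_bytes` bytes each; consecutive rows are
 * `src_pitch` / `dst_pitch` bytes apart in memory. */
static void copy_2d_ref(uint8_t *dst, const uint8_t *src,
                        uint32_t width_bytes, uint32_t height,
                        uint32_t dst_pitch, uint32_t src_pitch)
{
    /* A row must fit inside its pitch, otherwise rows would overlap. */
    assert(width_bytes <= dst_pitch && width_bytes <= src_pitch);

    for (uint32_t row = 0; row < height; row++) {
        memcpy(dst + (size_t)row * dst_pitch,
               src + (size_t)row * src_pitch,
               width_bytes);
    }
}
```

For a 1280-pixel-wide YUYV image, `yuyv_row_bytes(1280)` gives 2560, which is the value that would go into both the width and the pitch of a tightly packed frame.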

  • Hi Brijesh,

    Yes, Brijesh. I have printed the input and output addresses, and they are within the DDR shared memory range.

    About the pitch, yes, you are right, but for a smaller resolution the image is dumped correctly. I tried 1280*2*64, and for that size the image is dumped correctly with no memory fault error.

    Thanks,

    Vipul

  • Hi Brijesh,

    Can you please tell me if there is any existing example that does a 2D or ND DMA transfer from DDR to DDR?

    Thank You.

    Vipul

  • Hi Vipul,

    Other than the UDMA Test example application, I don't see these APIs being used anywhere in the SDK. Can you please refer to the example API appUdmaTest2DCopy in the file ti-processor-sdk-rtos-j721e-evm-08_06_00_12\vision_apps\utils\udma\src\app_udma_test.c?

    Regards,

    Brijesh

  • Hi Brijesh,

    Yes, I referred to the same example, but I still get a memory fault error with it.

    Thanks,

    Vipul

  • On which core are you seeing this error? Are the buffers getting allocated properly? Are you using the same API to test this feature? 

    Regards,

    Brijesh

  • Hi Brijesh,

    Currently I am trying this on the C66x core, and the buffers are being allocated properly. I printed the addresses of the input and output pointers, and they are as below:

    For input  = 0xAE000000, which is the starting address of the DDR shared memory for C66x

    For output = 0xAE140000, which is 1280*512*2 bytes after the input on C66x.

    And the API I used is appUdmaCopy2D().

    Thanks,

    Vipul.
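
The two addresses above can be checked against the frame sizes mentioned in the thread. This is a small sketch with hypothetical helper names (`yuyv_frame_bytes`, `address_gap`), assuming a tightly packed YUYV frame at 2 bytes per pixel. Whether these exact addresses were also used for the full 1280*944 test is not stated in the thread, so the second comparison only shows what the arithmetic would imply.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a tightly packed YUYV frame (2 bytes per pixel). */
static uint64_t yuyv_frame_bytes(uint32_t width, uint32_t height)
{
    assert(width > 0 && height > 0);
    return (uint64_t)width * height * 2u;
}

/* Distance in bytes between two buffer start addresses. */
static uint64_t address_gap(uint64_t addr_a, uint64_t addr_b)
{
    return (addr_a > addr_b) ? (addr_a - addr_b) : (addr_b - addr_a);
}
```

The gap between 0xAE000000 and 0xAE140000 is 0x140000 bytes, exactly one 1280*512 YUYV frame, so the test-sized buffers sit back to back without overlapping. A full 1280*944 frame is 2,416,640 bytes (0x24E000), which is larger than that gap, so full-size buffers would need start addresses further apart.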

  • Vipul,

    This does not look like the correct location. According to the J721E memory map, this region is used by OpenVX in SDK8.6 and SDK9.0. It should not be used for buffer allocation, as that can corrupt OpenVX memory and affect its functioning.

      

    Regards,

    Brijesh

  • Hi Brijesh,

    So, according to this, where should I allocate the memory for the input and output buffers?

    Thanks

    Vipul

  • Hi Brijesh,

    Actually, I marked this as resolved by mistake.

    But when I checked the memory map, the starting address of the OpenVX memory is 0xAC040000 with a length of 31.62 MB (0x01FA0000), and the DDR memory starts at 0xAE000000.

    So according to this, they should not collide with each other.

    Thanks,

    Vipul.
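
The non-collision claim above is easy to verify numerically. The sketch below uses a hypothetical `mem_region_t` type and helpers that are not part of the SDK, and it takes the base addresses and lengths exactly as quoted in the post (from a customized SDK 7.1 memory map).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a memory-map entry (not an SDK type). */
typedef struct {
    uint64_t base; /* first byte of the region */
    uint64_t size; /* length in bytes */
} mem_region_t;

/* Address one past the last byte of the region. */
static uint64_t region_end(mem_region_t r)
{
    assert(r.base + r.size >= r.base); /* guard against wrap-around */
    return r.base + r.size;
}

/* Returns 1 if the two regions share at least one byte, 0 otherwise. */
static int regions_overlap(mem_region_t a, mem_region_t b)
{
    return (a.base < region_end(b)) && (b.base < region_end(a));
}
```

With a base of 0xAC040000 and a length of 0x01FA0000, the OpenVX region ends at 0xADFE0000, which is below 0xAE000000, so under the quoted numbers the two regions do not overlap.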

  • Hi Vipul,

    It should be allocated from the local heap of C6x, which starts from 0xDC000000 for C6x_1. Looking at the code, it should be allocated from this section when you use appMemAlloc with the APP_UDMA_HEAP_ID heap ID. Can you please check and confirm that you are using this?

    Regards,

    Brijesh

  • but when I checked the memory map, the starting address of the OpenVX memory is 0xAC040000 with a length of 31.62 MB (0x01FA0000), and the DDR memory starts at 0xAE000000

    Ok, which SDK release are you using? Have you rearranged the memory map from the SDK release?

  • Hi Brijesh,

    Thanks for the quick reply.

    Actually, I printed the address after calling appMemGetVirt2PhyBufPtr(pIn, APP_MEM_HEAP_DDR), but the address is the same as before, i.e. 0xAE000000. I think it should start from 0xDC000000 as you mentioned, because we are passing APP_MEM_HEAP_DDR.

    Thanks,

    Vipul.

  • Hi Brijesh,

    The SDK version is 7.1. We made some memory map changes previously.

    Thanks,

    Vipul

  • Oh, that's a very old release. Is it possible to try it on the latest release?

    Even on SDK7.1, can you try using the local heap on C6x, like below?

    Regards,

    Brijesh

  • Hi Brijesh,

    Yes, we will try allocating from the local heap and will let you know.

    Thanks,

    Vipul

  • Thanks, I will move this ticket to the waiting state.

  • Hi Brijesh,

    As you suggested, we tried allocating the buffers in the local heap region, but the issue is still the same. We printed the addresses, and they also look correct.

    Could there be any other reason for this issue?

    Thanks,

    Vipul.

  • Hi Vipul,

    Do you have JTAG + CCS connection on this board? Then, can you connect to C6x core and see where exactly it crashes? Also, can you dump TR memory and share it?

    Regards,

    Brijesh 

  • Hi Brijesh,

    Thank you very much for your time.

    Actually, we added error handling after the DMA transfer to check whether the transfer completed successfully. We found that the DMA transfer was not the issue. The actual problem was in our image reading and dumping logic, caused by a bug in the mapping function.

    Now that issue is resolved.

    Thanks and regards,

    Vipul
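
The resolution above came from checking the transfer result instead of inferring it from the dumped image. One way to do such a check is to compare source and destination row by row after the copy. The sketch below is an assumption-labeled illustration, not the code used in the thread: `first_bad_row` is a hypothetical helper, it takes width and pitch in bytes, and on a real target it assumes the CPU sees the data the DMA wrote (for example after any needed cache invalidation).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical post-transfer check (not part of the app_udma API).
 * Compares `height` rows of `width_bytes` bytes each and returns the index
 * of the first row that differs, or -1 if the copy matches the source. */
static int32_t first_bad_row(const uint8_t *dst, const uint8_t *src,
                             uint32_t width_bytes, uint32_t height,
                             uint32_t dst_pitch, uint32_t src_pitch)
{
    assert(width_bytes <= dst_pitch && width_bytes <= src_pitch);

    for (uint32_t row = 0; row < height; row++) {
        const uint8_t *d = dst + (size_t)row * dst_pitch;
        const uint8_t *s = src + (size_t)row * src_pitch;

        if (memcmp(d, s, width_bytes) != 0) {
            return (int32_t)row; /* first row where the data differs */
        }
    }
    return -1; /* every row matched */
}
```

A result of -1 right after the transfer points away from the DMA itself and toward later stages such as the read, mapping, or file-dump path.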