[TDA4] Deploy big neural net on C7

Hello,

I'm trying to deploy a neural net on the C7 where the memory needed by some layers exceeds DDR_SCRATCH_SIZE (defined in app_cfg_c7x_1.h). This leads to the error

Failed to Allocate memory record 7 @ space = 17 and size = 39642624 !!!

on the target. I could increase DDR_SCRATCH_SIZE to 35 MB, but that is still not sufficient.
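
As a rough sanity check of the numbers above, the size in the error message can be converted to MiB and compared with the 35 MB setting. This is only a sketch: `min_scratch_mib` is a hypothetical helper, and it looks at this single memory record only, so the total scratch requirement may well be larger.

```python
# Size requested in the error message above, in bytes.
REQUESTED_BYTES = 39642624
MIB = 0x100000  # 1 MiB, the unit used by the size macros in app_cfg_c7x_1.h


def min_scratch_mib(requested_bytes: int) -> int:
    """Smallest whole number of MiB that holds requested_bytes (hypothetical helper)."""
    return -(-requested_bytes // MIB)  # ceiling division


print(f"requested: {REQUESTED_BYTES / MIB:.2f} MiB")
print(f"fits in 35 MiB: {REQUESTED_BYTES <= 35 * MIB}")
print(f"minimum whole MiB for this record: {min_scratch_mib(REQUESTED_BYTES)}")
```

The single record alone is about 37.8 MiB, which is why 35 MB is not enough.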

I tried decreasing DDR_HEAP_MEM_SIZE in favor of DDR_SCRATCH_SIZE and the memory could successfully be allocated, but that led to the process function of the tidl node running forever (with no error output).

I also tried to change the respective size in the memory map by following

but this did not work. I expect I have to rebuild Linux using the updated dtsi (which I haven't done)? Is there a tutorial on how to do that? Is this the recommended procedure?

Is there any way to increase the allowed net size to more than 35 MB?

Regards

Dom

  • Hi Dom,

    Yes, you can refer to section 7.2 of the developer notes, "Understanding and updating SDK memory map for j721e", which describes the steps to follow for increasing the scratch memory size for the C7x DSP.

    psdk_rtos_auto_j7_06_01_00/psdk_rtos_auto/docs/user_guide/developer_notes_memory_map.html

    Regards,
    Shyam

  • Hi Dom,


    Here are some NEW steps to help increase the C7x local heap/scratch memory. We've seen this work for bigger models. Can you please try this and let me know?

    Get a fresh setup of psdk_rtos_auto_j7_06_01_00_15 and ti-processor-sdk-linux-automotive-j7-evm-06_01_00_05 installed on an Ubuntu 18.04 machine.

    To change the memory map and allocate more memory to the C7x heap/scratch, follow these NEW steps.

    1. Under ti-processor-sdk-linux-automotive-j7-evm-06_01_00_05/ make sure you have run setup.sh
    2. Navigate to ti-processor-sdk-linux-automotive-j7-evm-06_01_00_05/board-support/linux-4.19.73+gitAUTOINC+0cabba2b47-g0cabba2b47/arch/arm64/boot/dts/ti and open file k3-j721e-vision-apps.dtso
    3. Go to the reserved_memory section and modify it as below:
      &reserved_memory {
          #address-cells = <2>;
          #size-cells = <2>;

          vision_apps_memory_region: vision_apps-dma-memory@b8000000 {
              compatible = "shared-dma-pool";
              reg = <0x00 0xb8000000 0x00 0x02000000>;
              no-map;
          };

          vision_apps_shared_region: vision_apps_shared-memories@bc000000 {
              compatible = "shared-dma-pool";
              reg = <0x00 0xbc000000 0x00 0x10000000>;
          };

          vision_apps_shared_region_1: vision_apps_shared-memories_1@cc000000 {
              compatible = "shared-dma-pool";
              reg = <0x00 0xcc000000 0x00 0x14000000>;
          };
      };
      This reduces vision_apps_shared-memories to 256 MB and carves out a new 320 MB region to be set aside as C7x heap/scratch.
    4. Now navigate back to ti-processor-sdk-linux-automotive-j7-evm-06_01_00_05/ and build the new k3-j721e-vision-apps.dtbo by doing
      ti-processor-sdk-linux-automotive-j7-evm-06_01_00_05 > make linux-dtbs
    5. Upon a successful build, copy the newly generated k3-j721e-vision-apps.dtbo and replace the existing one in the rootfs/boot/ folder on the SD card
    6. Navigate to psdk_rtos_auto_j7_06_01_00_15/vision_apps/apps/basic_demos/app_tirtos/tirtos_linux/c7x_1 and update linker_mem_map.cmd to add a new C7X_SCRATCH_MEM section and reduce DDR_SHARED_MEM, as shown below
          /* Memory for shared memory buffers in DDR [ size 256.00 MB ] */
          DDR_SHARED_MEM                    : ORIGIN = 0xBC000000 , LENGTH = 0x10000000
          /* Memory for C7x heap/scratch buffers in DDR [ size 320.00 MB ] */
          C7X_SCRATCH_MEM                   : ORIGIN = 0xCC000000 , LENGTH = 0x14000000
    7. Update the linker.cmd in the same directory to point to the newly created section as shown below
          .bss:ddr_shared_mem     (NOLOAD) : {} > C7X_SCRATCH_MEM
          .bss:ddr_scratch_mem    (NOLOAD) : {} > C7X_SCRATCH_MEM
    8. Navigate to psdk_rtos_auto_j7_06_01_00_15/vision_apps/apps/basic_demos/app_tirtos/common and, in the file app_cfg_c7x_1.h, update the sizes of the macros below
      #define DDR_HEAP_MEM_SIZE     ((256)*0x100000u) //earlier 80MB
      #define DDR_SCRATCH_SIZE      ((48)*0x100000u) // earlier 16MB
    Now build vision_apps and transfer the binaries to the SD card as per the standard steps.
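
Before rebuilding, it may be worth checking that the numbers in steps 3 to 8 are consistent with each other. The sketch below is a host-side check, not part of the SDK: the helper names are made up for illustration, and it assumes that the two .bss sections from step 7 are the only things placed in C7X_SCRATCH_MEM.

```python
MIB = 0x100000

# (name, base address, size) copied from the reserved_memory node in step 3.
REGIONS = [
    ("vision_apps_memory_region", 0xB8000000, 0x02000000),
    ("vision_apps_shared_region", 0xBC000000, 0x10000000),
    ("vision_apps_shared_region_1", 0xCC000000, 0x14000000),
]

# Values from app_cfg_c7x_1.h in step 8.
DDR_HEAP_MEM_SIZE = 256 * MIB
DDR_SCRATCH_SIZE = 48 * MIB

# LENGTH of C7X_SCRATCH_MEM from linker_mem_map.cmd in step 6.
C7X_SCRATCH_MEM_LENGTH = 0x14000000


def find_overlaps(regions):
    """Return name pairs of address-adjacent regions whose ranges overlap."""
    ordered = sorted(regions, key=lambda r: r[1])
    overlaps = []
    for (name_a, base_a, size_a), (name_b, base_b, _) in zip(ordered, ordered[1:]):
        if base_a + size_a > base_b:
            overlaps.append((name_a, name_b))
    return overlaps


def fits(heap_size, scratch_size, region_length):
    """True if heap and scratch together fit into the linker region."""
    return heap_size + scratch_size <= region_length


for name, base, size in REGIONS:
    print(f"{name}: 0x{base:08X}-0x{base + size:08X} ({size // MIB} MiB)")
print("overlaps:", find_overlaps(REGIONS))
print("heap + scratch (MiB):", (DDR_HEAP_MEM_SIZE + DDR_SCRATCH_SIZE) // MIB)
print("fits in C7X_SCRATCH_MEM:",
      fits(DDR_HEAP_MEM_SIZE, DDR_SCRATCH_SIZE, C7X_SCRATCH_MEM_LENGTH))
```

With the values above, heap plus scratch comes to 304 MiB, which leaves 16 MiB of the 320 MiB region unused under the stated assumption.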
    To confirm the changes have taken effect, check that the newly carved-out section appears at the beginning of the boot log while Linux boots.
    [    0.000000] OF: reserved mem: initialized node vision_apps-dma-memory@b8000000, compatible id shared-dma-pool
    [    0.000000] Reserved memory: created DMA memory pool at 0x00000000bc000000, size 256 MiB
    [    0.000000] OF: reserved mem: initialized node vision_apps_shared-memories@bc000000, compatible id shared-dma-pool
    [    0.000000] Reserved memory: created DMA memory pool at 0x00000000cc000000, size 320 MiB
    [    0.000000] OF: reserved mem: initialized node vision_apps_shared-memories_1@cc000000, compatible id shared-dma-pool
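
If you prefer to check this with a script rather than by eye, the pool lines can be parsed and compared with the expected sizes. This is a minimal sketch: `parse_dma_pools` and `EXPECTED` are illustrative names, and the log text is simply the excerpt above pasted in (on the target it could come from the output of `dmesg` instead).

```python
import re

# Boot log excerpt copied from above.
BOOT_LOG = """\
[    0.000000] OF: reserved mem: initialized node vision_apps-dma-memory@b8000000, compatible id shared-dma-pool
[    0.000000] Reserved memory: created DMA memory pool at 0x00000000bc000000, size 256 MiB
[    0.000000] OF: reserved mem: initialized node vision_apps_shared-memories@bc000000, compatible id shared-dma-pool
[    0.000000] Reserved memory: created DMA memory pool at 0x00000000cc000000, size 320 MiB
[    0.000000] OF: reserved mem: initialized node vision_apps_shared-memories_1@cc000000, compatible id shared-dma-pool
"""

POOL_RE = re.compile(r"created DMA memory pool at (0x[0-9a-fA-F]+), size (\d+) MiB")

# Pools expected after the device tree change in step 3: base address -> size in MiB.
EXPECTED = {0xBC000000: 256, 0xCC000000: 320}


def parse_dma_pools(log_text):
    """Return {base_address: size_in_MiB} for each 'created DMA memory pool' line."""
    return {int(addr, 16): int(size) for addr, size in POOL_RE.findall(log_text)}


pools = parse_dma_pools(BOOT_LOG)
missing = {hex(a): s for a, s in EXPECTED.items() if pools.get(a) != s}
if missing:
    print("missing or wrong size:", missing)
else:
    print("all expected pools present")
```

If the 320 MiB pool at 0xcc000000 is reported as missing, the old k3-j721e-vision-apps.dtbo is most likely still the one on the SD card.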
    Log in and run vision_apps_init:
    /opt/vision_apps> ./vision_apps_init.sh
    This should execute without any errors, and the IPC handshake with all cores should complete.
    Run one of the existing AVP/TIDL demos to confirm the changes do not affect the existing demos.
    Next, try with your custom model/custom application.
    We've seen this work, so please try it and let us know.
    Regards,
    Shyam