TMDSEVM572X: TIDL jDetNet on single DSP

Part Number: TMDSEVM572X
Other Parts Discussed in Thread: AM5728

Hi all,

I have successfully run some example TIDL code on the AM572x EVM board, mainly the CIFAR classifier and jDetNet SSD via the Python API.

Everything works fine when using EVE+DSP for jDetNet, but our actual target hardware (AM5728) only has one DSP.

I tried various CMEM sizes (up to 512 MB) and verified them with the OpenCL platforms example.

However, the memory allocation for the single-DSP case still fails. Shouldn't increasing the CMEM size address this issue? If I am mistaken, please provide an explanation - I would like to learn.

Here is a code excerpt:

config = tidl.Configuration()
config.read_from_file(configFile)
# prints: "Network needs 64.0 + 9.0 mb heap"
print("Network needs", config.network_heap_size / 1024**2, "+",
      config.param_heap_size / 1024**2, "mb heap")

# set all layers to group 1
config.layer_index_to_layer_group_id = {i: 1 for i in range(43)}

# fails with: "TidlError: TIDL Error: [src/execution_object.cpp, Wait, 617]: Memory allocation failed on device"
dsp = tidl.Executor(tidl.DeviceType.DSP, {tidl.DeviceId.ID0}, config, 1)
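
For reference, the heap sizes reported after read_from_file can also be padded and written back explicitly before creating the Executor - a minimal sketch (the 10% safety margin here is an arbitrary assumption, not a documented recommendation):

# After read_from_file, the Configuration exposes the heap sizes the
# network reports needing; pad them and set them back explicitly.
# The 10% margin is a guess, not a documented TI recommendation.
config.network_heap_size = int(config.network_heap_size * 1.1)
config.param_heap_size = int(config.param_heap_size * 1.1)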

  • Hi Niklas,

    Can you try setting the showHeapStats flag (show_heap_stats in the Python API)? Check here for example output:

    http://downloads.ti.com/mctools/esd/docs/tidl-api/using_api.html#sizing-device-side-heaps

    Regards,
    Mike

  • Hi Mike,

    Thanks for the link! I tried it as outlined, with the following result:

    config = tidl.Configuration()
    config.read_from_file(configFile)
    config.show_heap_stats = True
    config.param_heap_size = 2972496        # derived from TIDL output
    config.network_heap_size = 128*1024**2  # still a guess

    config.layer_index_to_layer_group_id = {i: 1 for i in range(43)}
    dsp = tidl.Executor(tidl.DeviceType.DSP, {tidl.DeviceId.ID0}, config, 1)

    Output:

    dsp = tidl.Executor(tidl.DeviceType.DSP, set([tidl.DeviceId.ID0]),config,1)
    Traceback (most recent call last):
    
      File "<ipython-input-10-7b50316b580f>", line 1, in <module>
        dsp = tidl.Executor(tidl.DeviceType.DSP, set([tidl.DeviceId.ID0]),config,1)
    
    TidlError: TIDL Error: [src/execution_object.cpp, Wait, 617]: Memory allocation failed on device
    
    [eve 0]         TIDL Device Trace: PARAM heap: Size 2972496, Free 0, Total requested 2972496

    Note that there is no message concerning the NETWORK heap, as shown in the API doc. When I try a smaller network, both messages appear correctly.
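
    For what it's worth, here is the kind of probe I could use to find the largest NETWORK heap that still allocates on a single DSP (a sketch only; it assumes every failure surfaces as an exception, like the TidlError above):

    import tidl

    def probe_network_heap(config_file, start=128 * 1024**2, step=8 * 1024**2):
        # Shrink the NETWORK heap until Executor creation succeeds on one DSP.
        size = start
        while size > 0:
            config = tidl.Configuration()
            config.read_from_file(config_file)
            config.show_heap_stats = True
            config.network_heap_size = size
            config.layer_index_to_layer_group_id = {i: 1 for i in range(43)}
            try:
                tidl.Executor(tidl.DeviceType.DSP, {tidl.DeviceId.ID0}, config, 1)
                return size  # largest heap size that allocated successfully
            except Exception:
                size -= step  # allocation failed; retry with a smaller heap
        return None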

    Any ideas?

    Bests,

    Niklas

  • Any updates regarding this issue?

  • Hi Niklas,

    Thank you for your patience - the engineer who normally answers TIDL questions on E2E is on vacation.

    I have reached out to the TIDL development team on your issue.

    In the meantime, can you verify that the CMEM driver is loading properly? Please reply with the output of this command:

    cat /proc/cmem

    Regards,
    Mike

  • Hi Mike,

    Thanks for your response.

    Here is the output:

    root@am57xx-evm:~# cat /proc/cmem
    
    Block 0: Pool 0: 1 bufs size 0x20000000 (0x20000000 requested)
    
    Pool 0 busy bufs:
    id 0: phys addr 0xa0000000 (cached)
    
    Pool 0 free bufs:
    

    In this case, I am running with 512 MB of CMEM (Block 0 is a single pool of 0x20000000 bytes = 512 MB).

    Bests,

    Niklas

  • Hi Niklas,

    Our TIDL forum expert will be back in the office on Monday and should be able to get to the bottom of your issue.

    In the meantime, check out this link: https://downloads.ti.com/mctools/esd/docs/tidl-api/using_api.html#sizing-device-side-heaps

    Several example applications that ship with the TI Processor SDK Linux are shown there. If you pass the "-v" option, you will get verbose output from the demo app showing the memory allocation. It would be a good test to run one of the known-working demo applications with your CMEM configuration, to determine whether something is missing in your Python code or there is a CMEM configuration issue. See the Python sketch below for an equivalent cross-check.
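
    On the Python side, a minimal single-DSP cross-check equivalent to "./imagenet -d 1 -e 0" might look like this (a sketch; the configuration file path is an assumption based on the usual SDK install layout - substitute the one shipped with your SDK):

    import tidl

    config = tidl.Configuration()
    # Hypothetical path - adjust to the config file in your SDK install.
    config.read_from_file('/usr/share/ti/tidl/examples/test/testvecs/config/infer/tidl_config_j11_v2.txt')
    config.show_heap_stats = True  # print device-side heap statistics

    # One DSP, all layers in layer group 1 (mirrors ./imagenet -d 1 -e 0).
    executor = tidl.Executor(tidl.DeviceType.DSP, {tidl.DeviceId.ID0}, config, 1)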

    Regards,
    Mike

  • Hi Mike,

    I tried several examples:

    root@am57xx-evm:/usr/share/ti/tidl/examples/imagenet# ./imagenet -d 1 -e 0
    Input: ../test/testvecs/input/objects/cat-pet-animal-domestic-104827.jpeg
    1: tabby,   prob = 52.55%
    2: Egyptian_cat,   prob = 21.18%
    3: tiger_cat,   prob = 17.65%
    Loop total time (including read/write/opencv/print/etc):  209.6ms
    imagenet PASSED
    
    ./mnist -d 1 -e 0
    Input images: ../test/testvecs/input/digits10_images_28x28.y
    Input labels: ../test/testvecs/input/digits10_labels_10x1.y
    0
    1
    2
    3
    4
    5
    6
    7
    8
    9
    Device total time:  45.05ms
    Loop total time (including read/write/print/etc):  46.19ms
    Accuracy:    100%
    mnist PASSED
    

    Note that I explicitly set the number of DSPs to 1 and EVEs to 0 to verify that a single DSP is able to run the sample networks.

    The "ssd_multibox" example works as well, but not out of the box on DSP=1 and EVE=0.

    However, the "classification" example fails:

    root@am57xx-evm:/usr/share/ti/tidl/examples/classification# ./tidl_classification   
    tidl FAILED
    root@am57xx-evm:/usr/share/ti/tidl/examples/classification# ./tidl_classification -d 2 -e 2
    tidl FAILED
    root@am57xx-evm:/usr/share/ti/tidl/examples/classification# ./tidl_classification -v       
    tidl FAILED
    

    Bests

    Niklas

  • Hi Niklas,

    Which version of Processor SDK Linux are you using? There was a bug in earlier versions of the SDK that caused the JDetNet model to fail when running on the DSP alone. This issue was addressed in Processor SDK version 6.0. Let me know if you are saying that SDK 6.0 also fails to run the model on the DSP alone.

    Regards,

    Manisha

  • Hi Manisha,

    Thanks for your reply. We are using SDK v6.0.

    Other networks, like the MobileNet 0.5 detection network, work just fine.

    Bests

    Niklas

  • Hi Niklas,

    Apologies for my delayed response. I would like to correct my earlier answer: the bug fix will be available in Processor SDK 6.1, scheduled for release at the end of 3Q19.