
TDA4VM: app_tidl based

Part Number: TDA4VM


In TDA4, for TIDL use-case applications:
1. Do we have control to change or schedule the tasks running on the C7x that are meant for MMA processing?
2. If yes, please point out the source code files under the SDK and any related documentation.

3. When I run app_tidl_od on the YUV files included in the SDK, I observe detection times ranging from 100 ms to 500 ms or more. According to TI, what is the average time taken to detect a vehicle, say for use in FCW (Forward Collision Warning), with the tidl_od model supplied in the SDK?

  • Hi Komal,

    1. It is best to let TIDL handle the C7x/MMA scheduling. It is possible to change this, but we neither recommend nor support it.

    2. Because we do not recommend doing this, there is very limited documentation, and we release the MMA software only in binary form (no source code).

    3. This all depends on the model, the device, and the other processes running on the device. There is a set of published benchmarks in the model zoo for reference:

    https://github.com/TexasInstruments/edgeai-tensorlab/tree/main/edgeai-modelzoo

    Regards,

    Chris

  • Thank you for your response on all those points.
    My vision models are MTCNN-based: multiple stages (three trained models) connected one after another for a particular use case. In this context, if I don't have control to connect them on the DSP side, I think there will be overhead in coming back to the A72 and moving back to the DSP again, and memory copies may also be involved. Can I seek your suggestion on how to handle this optimally?

  • Hi Komal,

    I cannot comment specifically on how your model is architected, but you are correct that running some layers on the A cores will be slower, and context switches do have some cost. You can minimize the number of context switches with the "deny list" in the configuration: put a contiguous block of layers you want on the A core in this list instead of individual, disparate layers. Memory is less of an issue, because the transitions share common memory; the different cores just get a pointer to that memory, so memory copies are minimized.

    The key here is to strategically use the deny list to minimize fragmentation of the model and context switches.
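    For illustration, a minimal sketch of the idea, assuming the OSRT compile-options interface described later in this thread (the layer names below are hypothetical; use the names shown in Netron for your model):

        # Hypothetical sketch: deny one contiguous block of layers by name.
        # For a linear chain of layers, this splits the model into at most two
        # TIDL subgraphs, so only one extra C7x <-> A72 transition is added.
        compile_options = {
            # ... other TIDL compilation options ...
            'deny_list:layer_name': '/stage2/Conv_0,/stage2/Relu_0,/stage2/MaxPool_0',
        }

    Denying scattered, individual layers instead would fragment the network into many small TIDL subgraphs, with each subgraph boundary adding a context switch.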

    Regards,

    Chris

  • Thank you for your response.
    I need a few more clarifications.
    1. How much total memory is available for running vision models on the C7x/MMA?
    2. > The key here is to strategically use the deny list to minimize fragmentation of the model and context switches.
    Can you please elaborate on this, or point out any documentation/code to help understand it?

    The engineer responsible is currently out of office. Please expect a delay in the response until next week.

    We appreciate your patience.

    Warm regards,
    Christina

  • Thank you for the update.
    As this matter is pending, I would appreciate it if you could ensure a response is prioritized once the engineer is back.
    Looking forward to a resolution as soon as possible.

  • Hi Komal,

    Memory availability is device dependent; it differs for each device variant. Please review the TRM for the device in your application to see the available memory. As for the deny_list, the OSRT instructions are below. You can deny a class of layers, say MaxPool, and all MaxPool layers will run on the ARM core; or you can name one specific layer, say /model/backbone/some_other_path_element/MaxPool_0, and only that MaxPool layer will run on the ARM core.

    The following options force offload of a particular layer to the TIDL DSP or the ARM core. They can be exercised either for debugging or, where desired, for performance improvement by creating an optimal cluster:

    deny_list:layer_type (comma-separated string; model compilation option)
    Forcefully disables offload of particular operators to the TIDL DSP, selected by layer type. Not currently available for TVM; please refer to the deny_list option instead.

    deny_list:layer_name (comma-separated string; model compilation option)
    Forcefully disables offload of particular operators to the TIDL DSP, selected by layer name. Not currently available for TVM; please refer to the deny_list option instead.

    deny_list (comma-separated string; model compilation option)
    Offers the same functionality as deny_list:layer_type. Maintained for backward compatibility; not recommended for the TFLite/ONNX runtimes.

    allow_list:layer_name (comma-separated string; model compilation option)
    Forcefully enables offload of particular operators to the TIDL DSP, selected by layer name. Only the specified layer(s) are accelerated; all others are delegated to the ARM core. Experimental for the TFLite/ONNX runtimes and currently not applicable for TVM.

    Note: The allow_list and deny_list options cannot be enabled simultaneously.

    Examples of usage:
    Specifying layer_type as part of options:

    • TFLite runtime: specify the registration code as defined in the TFLite builtin ops (please refer to the TFLite builtin ops list), e.g. 'deny_list:layer_type':'1, 2' to deny offloading the 'AveragePool2d' and 'Concatenation' operators to TIDL.
    • ONNX runtime: specify the ONNX operator name, e.g. "MaxPool", to deny offloading the max-pooling operator to TIDL.
    • TVM runtime: specify the TVM relay operator name, e.g. "nn.conv2d", to deny offloading the convolution operator to TIDL.

    Specifying layer_name as part of options:

    • Specify the layer name for the layer as observed in Netron.
    • For ONNX models, the layer name may not be present as part of the layer in some cases; in such cases, the output name corresponding to output(0) of the particular layer can be specified in the 'deny_list:layer_name'/'allow_list:layer_name' options.
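    For reference, a minimal sketch of how these options could be passed during OSRT model compilation with the ONNX runtime, along the lines of the edgeai-tidl-tools examples (the model path, tool/artifact paths, and the denied operator below are placeholder assumptions):

        # Sketch of TIDL OSRT compilation with a deny list (placeholder paths).
        import onnxruntime as ort

        compile_options = {
            'tidl_tools_path': '/path/to/tidl_tools',   # assumed tools location
            'artifacts_folder': '/path/to/artifacts',   # output for TIDL artifacts
            # Deny a whole class of layers: every MaxPool runs on the ARM core.
            'deny_list:layer_type': 'MaxPool',
            # Or deny one specific layer by its Netron name, e.g.:
            # 'deny_list:layer_name': '/model/backbone/some_other_path_element/MaxPool_0',
        }

        so = ort.SessionOptions()
        so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL

        # The TIDL compilation provider partitions the graph and generates the
        # offload artifacts; denied layers fall back to the CPU (ARM) provider.
        session = ort.InferenceSession(
            'model.onnx',
            sess_options=so,
            providers=['TIDLCompilationProvider', 'CPUExecutionProvider'],
            provider_options=[compile_options, {}],
        )

    At inference time on the target, the same artifacts folder is used with the TIDL execution provider in place of the compilation provider.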

    Regards,

    Chris