
TDA4VM: app_tidl based

Part Number: TDA4VM


In TDA4, for TIDL use-case applications:
1. Do we have control to change or schedule the tasks running on the C7x that are meant for MMA processing?
2. If yes, please point out the source code files under the SDK and any related documentation.

3. When I run app_tidl_od on the YUV files included in the SDK, I observe detection times ranging from 100 ms to 500 ms or more. According to TI, what is the average time taken to detect a vehicle, say for use in FCW (Forward Collision Warning), with the tidl_od model supplied in the SDK?

  • Hi Komal,

    1. It is best to let TIDL handle the C7x/MMA scheduling. It is possible to change this, but we neither recommend nor support it.

    2. Because we do not recommend doing this, there is very limited documentation, and we release the MMA software only in binary form (no source code).

    3. This all depends on the model, the device, and the other processes running on the device. There is a set of published benchmarks in the model zoo for reference:

    https://github.com/TexasInstruments/edgeai-tensorlab/tree/main/edgeai-modelzoo

    Regards,

    Chris

  • Thank you for your response on all those points.
    My vision models are MTCNN-based: multiple stages (three trained models) connected one after another for a particular use case. In this context, if I don't have control to connect them on the DSP side, I think there will be overhead in coming back to the A72 and moving back to the DSP again, and memory copies may also be involved. Can I seek your suggestion on how to handle this optimally?

  • Hi Komal,

    I cannot comment specifically on how your model is architected, but you are correct that running some layers on the A cores will be slower, and context switches do have some cost. You can minimize the number of context switches with the "deny list" in the configuration: put a contiguous block of layers you want on the A core in this list instead of individual, disparate layers. Memory is less of an issue, because the transitions share common memory; the different cores just get a pointer to that memory, so memory copies are minimized.

    The key here is to strategically use the deny list to minimize fragmentation of the model and context switches.
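    For illustration, a minimal sketch of the idea, assuming the OSRT compile-options interface described later in this thread (the layer names below are hypothetical; use the names shown in Netron for your model):

        # Hypothetical sketch: deny one contiguous block of layers by name.
        # For a linear chain of layers, this splits the model into at most two
        # TIDL subgraphs, so only one extra C7x <-> A72 transition is added.
        compile_options = {
            # ... other TIDL compilation options ...
            'deny_list:layer_name': '/stage2/Conv_0,/stage2/Relu_0,/stage2/MaxPool_0',
        }

    Denying scattered, individual layers instead would fragment the network into many small TIDL subgraphs, with each subgraph boundary adding a context switch.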

    Regards,

    Chris

  • Thank you for your response.
    I need a few more clarifications.
    1. How much total memory is available for running vision models on the C7x/MMA?
    2. > The key here is to strategically use the deny list to minimize fragmentation of the model and context switches.
    Can you please elaborate on this, or point out any documentation/code to help understand it?

    The engineer responsible is currently out of office. Please expect a delay in the response until next week.

    We appreciate your patience.

    Warm regards,
    Christina

  • Thank you for the update.
    As this matter is pending, I would appreciate it if you could ensure a response is prioritized once the engineer is back.
    Looking forward to a resolution as soon as possible.

  • Hi Komal,

    Memory availability is device dependent; it differs for each device variant. Please review the TRM for the device in your application to see the available memory. As for the deny_list, the OSRT instructions are below. You can deny a class of layers, say MaxPool, and all MaxPool layers will run on the ARM core; or you can name one specific layer, say /model/backbone/some_other_path_element/MaxPool_0, and only that MaxPool layer will run on the ARM core.

    The following options force offload of a particular layer to the TIDL DSP or the ARM core. They can be exercised either for debugging or, where desired, for performance improvement by creating an optimal cluster:

    deny_list:layer_type (comma-separated string; model compilation option)
    Forcefully disables offload of particular operators to the TIDL DSP, selected by layer type. Not currently available for TVM; please refer to the deny_list option instead.

    deny_list:layer_name (comma-separated string; model compilation option)
    Forcefully disables offload of particular operators to the TIDL DSP, selected by layer name. Not currently available for TVM; please refer to the deny_list option instead.

    deny_list (comma-separated string; model compilation option)
    Offers the same functionality as deny_list:layer_type. Maintained for backward compatibility; not recommended for the TFLite/ONNX runtimes.

    allow_list:layer_name (comma-separated string; model compilation option)
    Forcefully enables offload of particular operators to the TIDL DSP, selected by layer name. Only the specified layer(s) are accelerated; all others are delegated to the ARM core. Experimental for the TFLite/ONNX runtimes and currently not applicable for TVM.

    Note: The allow_list and deny_list options cannot be enabled simultaneously.

    Examples of usage:
    Specifying layer_type as part of options:

    • TFLite runtime: specify the registration code as defined in the TFLite builtin ops (please refer to the TFLite builtin ops list), e.g. 'deny_list:layer_type':'1, 2' to deny offloading the 'AveragePool2d' and 'Concatenation' operators to TIDL.
    • ONNX runtime: specify the ONNX operator name, e.g. "MaxPool", to deny offloading the max-pooling operator to TIDL.
    • TVM runtime: specify the TVM relay operator name, e.g. "nn.conv2d", to deny offloading the convolution operator to TIDL.

    Specifying layer_name as part of options:

    • Specify the layer name for the layer as observed in Netron.
    • For ONNX models, the layer name may not be present as part of the layer in some cases; in such cases, the output name corresponding to output(0) of the particular layer can be specified in the 'deny_list:layer_name'/'allow_list:layer_name' options.
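    For reference, a minimal sketch of how these options could be passed during OSRT model compilation with the ONNX runtime, along the lines of the edgeai-tidl-tools examples (the model path, tool/artifact paths, and the denied operator below are placeholder assumptions):

        # Sketch of TIDL OSRT compilation with a deny list (placeholder paths).
        import onnxruntime as ort

        compile_options = {
            'tidl_tools_path': '/path/to/tidl_tools',   # assumed tools location
            'artifacts_folder': '/path/to/artifacts',   # output for TIDL artifacts
            # Deny a whole class of layers: every MaxPool runs on the ARM core.
            'deny_list:layer_type': 'MaxPool',
            # Or deny one specific layer by its Netron name, e.g.:
            # 'deny_list:layer_name': '/model/backbone/some_other_path_element/MaxPool_0',
        }

        so = ort.SessionOptions()
        so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL

        # The TIDL compilation provider partitions the graph and generates the
        # offload artifacts; denied layers fall back to the CPU (ARM) provider.
        session = ort.InferenceSession(
            'model.onnx',
            sess_options=so,
            providers=['TIDLCompilationProvider', 'CPUExecutionProvider'],
            provider_options=[compile_options, {}],
        )

    At inference time on the target, the same artifacts folder is used with the TIDL execution provider in place of the compilation provider.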

    Regards,

    Chris