
AM62A3: Execution of compiled models on AM62A board is failing

Part Number: AM62A3


Hi, 

I have used edgeai-tidl-tools to compile the model artifacts from the examples/osrt_python section (both ort and tfl). I then copied the artifacts (models/ and model-artifacts/) to the devboard as outlined in the README.

When I execute the inference scripts (examples/osrt_python/ort/onnxrt_ep.py and the TFLite script) on the AM62A board, I get a segmentation fault.

Running 4 Models - ['cl-tfl-mobilenet_v1_1.0_224', 'ss-tfl-deeplabv3_mnv2_ade20k_float', 'od-tfl-ssd_mobilenet_v2_300_float', 'od-tfl-ssdlite_mobiledet_dsp_320x320_coco']


Running_Model :  cl-tfl-mobilenet_v1_1.0_224

 Number of subgraphs:1 , 34 nodes delegated out of 34 nodes

APP: Init ... !!!
MEM: Init ... !!!
MEM: Initialized DMA HEAP (fd=6) !!!
MEM: Init ... Done !!!
IPC: Init ... !!!
IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
5212122.232075 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
5212122.232439 s:  VX_ZONE_INIT:Enabled
5212122.232479 s:  VX_ZONE_ERROR:Enabled
5212122.232637 s:  VX_ZONE_WARNING:Enabled
5212122.234513 s:  VX_ZONE_INIT:[tivxInitLocal:130] Initialization Done !!!
5212122.235476 s:  VX_ZONE_INIT:[tivxHostInitLocal:101] Initialization Done for HOST !!!
TIDL_RT_OVX: ERROR: Config file size (94616 bytes) does not match size of sTIDL_IOBufDesc_t (37272 bytes)
5212122.237739 s:  VX_ZONE_ERROR:[tivxAddKernelTIDL:269] invalid values for num_input_tensors or num_output_tensors
5212122.249661 s:  VX_ZONE_ERROR:[vxQueryKernel:137] Invalid kernel reference
5212122.249717 s:  VX_ZONE_ERROR:[vxMapUserDataObject:456] Invalid user data object reference
Segmentation fault (core dumped)

What is the issue here? Note that I can run python3 tflrt_delegate.py -d without problems, which, as I understand it, is the mode where it doesn't offload computation, right?
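
For reference, this is roughly how I invoke the scripts on the board (a sketch; paths follow the edgeai-tidl-tools examples layout, and as I understand it -d disables the TIDL offload so everything runs on the Arm cores):

cd edgeai-tidl-tools/examples/osrt_python/tfl
python3 tflrt_delegate.py       # with TIDL offload -> segmentation fault
python3 tflrt_delegate.py -d    # offload disabled (Arm only) -> runs fine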

Some other questions:

- How do I find out which SDK version of edgeai-tidl-tools I need to compile the models? I've found this table: https://github.com/TexasInstruments/edgeai-tidl-tools/blob/master/docs/version_compatibility_table.md, but could not figure out how to find the SDK version installed on my AM62A devboard.

- Why are there no pre-built Docker images? Setting all this up on our own is a bit confusing and tedious.

  • Hello Stefan,

    I see the issue with your model -- the error indicates an SDK version mismatch between what the model was compiled with and what is actually running on the target:

    >TIDL_RT_OVX: ERROR: Config file size (94616 bytes) does not match size of sTIDL_IOBufDesc_t (37272 bytes)

    You can check which SDK is on the target EVM with:

    echo $EDGEAI_SDK_VERSION

    This won't tell you the last number (the bugfix version), but you should be able to tell based on the SDK you downloaded. If you only downloaded/installed the .WIC image, then it should be the SDK release closest to $EDGEAI_SDK_VERSION. The error is saying there's a mismatch in a struct size -- I know this struct has only grown since AM62A released, so I would guess that your installation is version 9.1 or older. Based on that config file size, I believe you must have compiled your model with the latest tools (10.0.2.0 or 10.0.4.0; both are compatible with the recently released 10.0.0.8 SDK).
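
    As a rough sketch of the matching workflow (the tag below is a placeholder; pick the release from the version compatibility table that matches your SDK):

    # on the EVM: identify the installed SDK
    echo $EDGEAI_SDK_VERSION
    cat /etc/os-release   # the VERSION field can also help identify the build

    # on the compilation host: use the tools from the matching release
    git clone https://github.com/TexasInstruments/edgeai-tidl-tools.git
    cd edgeai-tidl-tools
    git checkout <matching_tag>   # then re-run the setup and recompile your models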

    > Why are there no pre-built Docker images? Setting all this up on our own is a bit confusing and tedious.

    I assume the keyword here is 'prebuilt', right? We have Docker images, but they do require some building/setup.

    - https://github.com/TexasInstruments/edgeai-tidl-tools/tree/master/dockers

    As for why they aren't prebuilt, that's a fair question. I'll ask for input from the development team responsible for that side of the tooling and respond back.
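
    In the meantime, building one of those images locally looks roughly like this (a sketch only -- the exact Dockerfile name and location under dockers/ varies between releases, so please check the README in that directory):

    cd edgeai-tidl-tools
    docker build -f dockers/<variant>/Dockerfile -t edgeai-tidl-tools:local .
    docker run -it --rm -v $(pwd):/workspace edgeai-tidl-tools:local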

    BR,
    Reese

  • Hello Reese,

    Thanks for the response. The keyword is indeed "prebuilt". Especially when dealing with company proxies that don't allow downloading arbitrary tar files, the setup can be tedious. It would be great to have some Docker containers on Docker Hub to get started quickly. The same goes for the TIDL integrations of mmdetection / yolox / ..., as it is a bit unclear how to install their dependencies into a single container (for example, the 9.0.0 SDK version of edgeai-tidl-tools is based on Python 3.10, while the respective mmdetection tag says it was tested with Python 3.7).

    This was indeed the issue. I've rebuilt the Docker containers for the correct tags and can now run inference on my devboard. I still have some issues getting the other examples to work (e.g. Jupyter, see here: AM62A3: [edgeai-tidl-tools] Execution of Jupyter examples gives a segmentation fault - Processors forum - TI E2E support forums). It would be great to get some support for this as well.

    Best Regards

    Stefan

  • Hi Stefan,

    I understand. We too have some challenges with company proxies, in particular with Docker Hub. Let me deliver this feedback to our backend teams - I think it is quite relevant. Tools setup can be quite frustrating, especially with immovable barriers like IT policies.

    > (for example, the 9.0.0 SDK version of edgeai-tidl-tools is based on Python 3.10, while the respective mmdetection tag says it was tested with Python 3.7)

    Also understood. It seems our training tools migrate Python versions less frequently than our SDKs. It sounds like you were able to resolve this, yes? Were there dependencies you had to manage manually to resolve version conflicts?

    I expect the training tools can move between Python versions more easily since they do not (if memory serves) ship wheels built for a specific Python version. The standalone model compilation tools under edgeai-tidl-tools do have that constraint (Python wheels compiled against a specific Python version).

    Also, I want to ensure you're using the correct mmdetection and similar repos; those have been merged into the single edgeai-tensorlab repo, which contains the previously separate repos as subdirectories. E.g., our mmdetection fork is now here: https://github.com/TexasInstruments/edgeai-tensorlab/tree/main/edgeai-mmdetection
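
    A quick way to sanity-check that you have the consolidated repo (the previously separate tools now live side by side):

    git clone https://github.com/TexasInstruments/edgeai-tensorlab.git
    ls edgeai-tensorlab   # edgeai-mmdetection and the other former standalone repos appear here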

    > I still have some issues getting the other examples to work (e.g. Jupyter, see here: AM62A3: [edgeai-tidl-tools] Execution of Jupyter examples gives a segmentation fault - Processors forum - TI E2E support forums). It would be great to get some support for this as well.

    I'm looking at that thread next, standby :)

    BR,
    Reese