PROCESSOR-SDK-AM62A: Deeplabv3_resnet101 not compiling

Abhy K S

Hi TI Experts,

I am trying to compile the original DeeplabV3_resnet101 from PyTorch after converting it to ONNX format with opset version 11, and I am getting the error in the attached log. Could you please help me resolve it?

Regards,

-Abhy

Attached log: DeeplabV3_v11_compile_log.txt

abhy@JPN-1CZ247006Z:~/edgeai-tidl-tools/examples/osrt_python/ort$ python onnxrt_ep.py -c
Available execution providers :  ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']

Running 1 Models - ['ss-ort-deeplabv3_v11']


Running_Model :  ss-ort-deeplabv3_v11  


Running shape inference on model ../../../models/public/DeeplabV3_v11.onnx 

Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/abhy/edgeai-tidl-tools/examples/osrt_python/ort/onnxrt_ep.py", line 190, in run_model
    sess = rt.InferenceSession(config['model_path'] ,providers=EP_list, provider_options=[delegate_options, {}], sess_options=so)
  File "/home/abhy/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 283, in __init__
    self._create_inference_session(providers, provider_options)
  File "/home/abhy/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 310, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from ../../../models/public/DeeplabV3_v11.onnx failed:/home/a0496663/work/edgeaitidltools/rel90/onnx/onnxruntime_bit/onnxruntime/onnxruntime/core/graph/model.cc:111 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&) Unknown model file format version.

over 2 years ago

0 Reese Grimsley over 2 years ago

TI__Mastermind 18426 points

Hi Abhy,

May I ask where you got this Deeplabv3 from? I haven't seen any of our optimized models do this. For ONNX models, we support opsets 9 and 11. My first recommendation would be to open the model in ONNX (completely separate from anything TI related) to make sure the export process went well. It looks like shape_inference worked, so I should think the model is correctly formatted.

Viewing in Netron might also tell if there was an export issue or not, but may not be particularly useful for diagnosing an issue.
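A standalone check along these lines could look like the sketch below. It uses the onnx Python package only; the helper names `default_domain_opset` and `inspect_export` are hypothetical, and the supported-opset set simply restates the "opsets 9 and 11" remark above.

```python
SUPPORTED_OPSETS = {9, 11}  # as stated in this reply; not verified against any release notes


def default_domain_opset(opset_imports):
    """Return the opset version of the default ONNX domain, or None if absent.

    opset_imports: iterable of (domain, version) pairs.
    """
    for domain, version in opset_imports:
        if domain in ("", "ai.onnx"):
            return version
    return None


def inspect_export(path):
    """Load an exported model, run the ONNX checker, and report its versions."""
    import onnx  # imported here so the helper above works without the package

    model = onnx.load(path)
    onnx.checker.check_model(model)  # raises if the model is malformed
    opset = default_domain_opset((o.domain, o.version) for o in model.opset_import)
    return {
        "ir_version": model.ir_version,
        "opset": opset,
        "opset_supported": opset in SUPPORTED_OPSETS,
    }
```

Calling `inspect_export("../../../models/public/DeeplabV3_v11.onnx")` on the host would show which IR version and opset the export produced, which is the information the "Unknown model file format version" message is about.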

If you are willing to post the model file here, I can give more support.

Best,
Reese

0 Abhy K S over 2 years ago in reply to Reese Grimsley

Prodigy 20 points

Hi Reese,

I am terribly sorry for the confusion; I shared the wrong log file. Attached is the log I get when running the deeplabV3_resnet101 from PyTorch, converted to ONNX with opset version 11. Could you please point me to where I should focus to correct this error?

Regards,

Abhy K S

Attached log: deeplabv3_resnet101_coco_Q_log.txt

deeplabv3_resnet101_coco_Q


koseki@JPN-1CZ247004K:~/edgeai-tidl-tools/examples/osrt_python/ort$ python onnxrt_ep.py -c
Available execution providers :  ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']

Running 1 Models - ['ss-ort-deeplabv3_resnet101_coco_Q']


Running_Model :  ss-ort-deeplabv3_resnet101_coco_Q  


Running shape inference on model ../../../models/public/deeplabv3_resnet101_coco_Q.onnx 

Error , Node 241 : SIZES Input Tensor is not supported for Resize-11 operator
                As a Workaround user can provide the resize factor using SCALES tensor 
                 instead of using SIZES. As an example instead of using 
                 interpolate(x, size=[W, H], mode='bilinear', align_corners=False)
                 user can use 
                 interpolate(x, scale_factor=(s1,s2), mode='bilinear', align_corners=False) 
                 where s1 and s2 are ratio of resize factor for width and height.

Resize layer delegated to ARM -- '/classifier/classifier.0/convs.4/Resize' 
Error , Node 256 : SIZES Input Tensor is not supported for Resize-11 operator
                As a Workaround user can provide the resize factor using SCALES tensor 
                 instead of using SIZES. As an example instead of using 
                 interpolate(x, size=[W, H], mode='bilinear', align_corners=False)
                 user can use 
                 interpolate(x, scale_factor=(s1,s2), mode='bilinear', align_corners=False) 
                 where s1 and s2 are ratio of resize factor for width and height.

Resize layer delegated to ARM -- '/Resize' 

Preliminary subgraphs created = 2 
Final number of subgraphs created are : 2, - Offloaded Nodes - 254, Total Nodes - 257 
2023-10-24 13:44:11.091633922 [E:onnxruntime:, inference_session.cc:1311 operator()] Exception during initialization: std::bad_alloc
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/koseki/edgeai-tidl-tools/examples/osrt_python/ort/onnxrt_ep.py", line 190, in run_model
    sess = rt.InferenceSession(config['model_path'] ,providers=EP_list, provider_options=[delegate_options, {}], sess_options=so)
  File "/home/koseki/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 283, in __init__
    self._create_inference_session(providers, provider_options)
  File "/home/koseki/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 315, in _create_inference_session
    sess.initialize_session(providers, provider_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: std::bad_alloc

0 Reese Grimsley over 2 years ago in reply to Abhy K S

TI__Mastermind 18426 points

Hello Abhy,

It looks like the Resize nodes use a configuration our runtime does not accept: they specify the resolution to scale to (sizes), whereas they should specify a scaling factor (scales) instead. This can be modified without retraining, either by loading the ONNX model in Python, changing the offending Resize layers, and saving the model, or by using a tool like ONNX-modifier. I have gone through this process manually before and can confirm it can work for fixing these issues with Resize nodes.

As for the std::bad_alloc message, this may be resulting from the resize nodes being detected as requiring arm offload. You can add that layer type to the deny-list and see if that resolves the issue here.

It would also help to compile with debug_level=2 set in the compile options (see common_utils.py in the osrt_python directory). This is also the file where you can set the deny-list layers. This will give more detailed output and give clues on where this error occurred within the compilation functions.
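As a rough illustration of the two settings mentioned above, the sketch below adds them to a dictionary of compile options. The key names `debug_level` and `deny_list`, the comma-separated format, the base option shown, and the helper `with_debug_and_deny` are assumptions for illustration; check common_utils.py in your release for the exact spelling and accepted values.

```python
def with_debug_and_deny(base_options, denied_layer_types, debug_level=2):
    """Return a copy of the compile options with verbose logging and a deny-list.

    Assumption: the deny-list is one comma-separated string of layer types
    stored under the key "deny_list".
    """
    options = dict(base_options)  # leave the caller's dictionary untouched
    options["debug_level"] = debug_level
    options["deny_list"] = ", ".join(denied_layer_types)
    return options


# Example: keep Resize layers off the accelerator and log more detail.
# The base option below is only a placeholder for whatever is already set.
options = with_debug_and_deny({"tensor_bits": 8}, ["Resize"])
print(options)
```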

Best,
Reese

0 Abhy K S over 2 years ago in reply to Reese Grimsley

Prodigy 20 points

Hi Reese,

I am extremely grateful to you for the support you are providing.

I was able to compile a different deeplabv3_resnet101 model, but when I deploy it the output looks like what is shown in the picture. Could you please help me sort this out? The model I used is available here: https://drive.google.com/file/d/1LMFLEw-3bfT67cXxsfenYMg9yX7Scjmf/view?usp=sharing

0 Reese Grimsley over 2 years ago in reply to Abhy K S

TI__Mastermind 18426 points

Hi Abhy,

Hmm, that's an odd output, especially because there is a pattern in the data (all rows are classified as the same). However, the model looks strange to me. What I see is a huge (170 MB onnx file!) encoder / classification network... I'm going to guess resnet101 from earlier comments. The input is 224x224 and the output is 1x1000. I'm surprised this ran as a segmentation model in edgeai-gst-apps, but it explains why the output includes horizontal bars, since each of those vectors would be the same.

For this model, can you add the segmentation head back in? Without that head, the model is behaving (roughly) as I would expect. Note that the previous errors were with the resize/upsample/deconv layers, where they were using a size vector instead of scale. Please be aware of this when creating the full segmentation model.

For sharing models in the future, it helps me to provide the artifacts too, so I can better replicate your use case on my side :)

I will mention that resnet101 is a very large encoder. You can probably get by with a smaller encoder like resnet50 if you are open to changing. The C7x_1 bar in the image shows the accelerator is 44% utilized at 18 fps, and that's without the segmentation head, which is also a compute-heavy set of layers.

Best,
Reese

0 Abhy K S over 2 years ago in reply to Reese Grimsley

Prodigy 20 points

Hi Reese,

I tried several times to edit the Resize layer, but it is not working. Could you please help me with this? Here is the link to the model: drive.google.com/.../view

0 Reese Grimsley over 2 years ago in reply to Abhy K S

TI__Mastermind 18426 points

Hello Abhy,

Could you please describe what the issues are in this effort?

I opened the model and I notice that the tensor for "scales" is not populated.

When I had made modifications for this in the past, I followed this procedure:

1. Load the model in the onnx-modifier graphical tool for editing.
2. Create a new Resize layer for every Resize layer that throws this size vs. scales error.
3. Modify the new Resize layers to take the place of the old ones. This requires naming the inputs and outputs of the new layer to be the same as the one it is replacing.
   - Save a copy! I will note that iterative modifications (replace a node, save, reload, replace a node, etc.) threw issues due to some of the new nodes' default names. I found it best to add all nodes and fix connections in one go, and save the model before modifying parameters.
4. Modify the Resize node parameters to most closely match the old nodes' parameters. Setting tensors may be problematic due to an error with the tool.
   - For any tensors or parameters not easily set within the graphical tool, I used the onnx (not onnxruntime) package in Python to make further modifications.
5. Load the model in ONNX.
6. Using model.graph.node to list out all the nodes, find the Resize nodes.
   - For each of the Resize nodes, set the parameters (strings, tensors, etc.) that could not be edited before.
7. Resave the model.
8. Recompile the model.
9. Try on the target.

I hope this helps! This is an ad hoc process, but I found good results from approaching it this way. It would be similarly possible to develop some python scripts that would handle all of this automatically.
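A minimal sketch of such a script, using only the onnx and numpy packages, is shown below. It is not the procedure I actually followed. It assumes each Resize node's "sizes" input is a constant initializer (for example after constant folding) and that shape inference yields a static shape for the node's input; nodes that do not meet those assumptions are skipped. The function names are hypothetical.

```python
def scales_from_sizes(input_shape, target_sizes):
    """Per-axis scale factors (target / input) equivalent to a sizes tensor."""
    if len(input_shape) != len(target_sizes):
        raise ValueError("rank mismatch between input shape and sizes")
    if any(dim <= 0 for dim in input_shape):
        raise ValueError("input shape must be fully static")
    return [float(t) / float(d) for d, t in zip(input_shape, target_sizes)]


def convert_resize_sizes_to_scales(src_path, dst_path):
    """Rewrite Resize-11 nodes that use 'sizes' so they use 'scales' instead."""
    # Imported here so the helper above stays usable without these packages.
    import numpy as np
    import onnx
    from onnx import numpy_helper, shape_inference

    model = shape_inference.infer_shapes(onnx.load(src_path))
    graph = model.graph
    initializers = {init.name: init for init in graph.initializer}
    shapes = {
        vi.name: [d.dim_value for d in vi.type.tensor_type.shape.dim]
        for vi in list(graph.input) + list(graph.value_info) + list(graph.output)
    }

    converted = 0
    for node in graph.node:
        # Resize-11 inputs are ordered: X, roi, scales, sizes.
        if node.op_type != "Resize" or len(node.input) != 4 or not node.input[3]:
            continue
        x_name, sizes_name = node.input[0], node.input[3]
        in_shape = shapes.get(x_name)
        if sizes_name not in initializers or not in_shape or 0 in in_shape:
            print("skipped (sizes not constant or shape unknown):", node.name)
            continue
        sizes = numpy_helper.to_array(initializers[sizes_name]).tolist()
        scales = np.asarray(scales_from_sizes(in_shape, sizes), dtype=np.float32)
        scales_name = node.output[0] + "_scales"
        graph.initializer.append(numpy_helper.from_array(scales, name=scales_name))
        node.input[2] = scales_name  # point the "scales" input at the new tensor
        del node.input[3]            # drop the "sizes" input
        converted += 1

    onnx.checker.check_model(model)
    onnx.save(model, dst_path)
    return converted
```

The return value is the number of nodes rewritten, so a result of 0 means the "sizes" inputs are computed dynamically in the graph and the manual route is still needed.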

Best,
Reese

0 Abhy K S over 2 years ago in reply to Reese Grimsley

Prodigy 20 points

Hi Reese,

I removed the resize input tensors and the model is still not compiling. Please refer to the attached log file deeplabv3_test15_log_2.txt

0 Reese Grimsley over 2 years ago in reply to Abhy K S

TI__Mastermind 18426 points

Hi Abhy,
Is there a runtimes_visualization.svg file located in the artifacts directory? You can post the model artifacts directory here, and that would help. This doesn't have to be on Google Drive, by the way :)

Looking at the log, it seems that the auto-generated filename for the subgraph may be too long. Those subgraph filenames are generated based on the set of output layers, and it looks like this subgraph has multiple outputs, which would cause that. Does that file name exist in the model-artifacts/deeplabv3-test15/tempDir directory?

File would be named: _classifier_classifier.0_convs.4_convs.4.3_Relu_output_0_classifier_classifier.0_convs.3_convs.3.2_Relu_output_0_classifier_classifier.0_convs.2_convs.2.2_Relu_output_0_classifier_classifier.0_convs.1_convs.1.2_Relu_output_0_classifier_classifier.0_convs.0_convs.0.2_Relu_output_0_tidl_net.bin
- this is based on each of the output names for the subgraph. I have seen cases before where too long of a name caused an error. That may be happening here.
- There are 5 outputs of this subgraph, which is making the name very long
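To put a number on "very long", the sketch below rebuilds that file name and compares its length with 255 bytes, the usual per-filename limit on Linux filesystems such as ext4. The naming rule (each output name with "/" replaced by "_", followed by "_tidl_net.bin") and the output names themselves are inferred from the file name above, not taken from the import tool's source, and `subgraph_artifact_name` is a hypothetical helper.

```python
NAME_MAX = 255  # usual per-filename limit in bytes on Linux filesystems such as ext4


def subgraph_artifact_name(output_names, suffix="_tidl_net.bin"):
    """Join output tensor names into one file name, replacing '/' with '_'."""
    return "".join(name.replace("/", "_") for name in output_names) + suffix


# Output names inferred from the file name quoted above (assumption).
branches = [(4, 3), (3, 2), (2, 2), (1, 2), (0, 2)]
outputs = [
    f"/classifier/classifier.0/convs.{b}/convs.{b}.{k}/Relu_output_0"
    for b, k in branches
]

name = subgraph_artifact_name(outputs)
print(len(name.encode("utf-8")), "bytes; limit is", NAME_MAX)
```

With these five names the result is 293 characters, which is above that limit. Whether this is the limit the import tool actually runs into is not confirmed here.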

To solve this issue for now, I would suggest adding a specific layer to the deny_list -- not an entire layer type like Resize, but a single instance of a layer. The option should be called "deny_list:layer_name".

This is the area of the network where the problems come from:

With the current deny-list for Resize, it is making a subgraph whose boundary is after the bottom-left 3 ReLUs, the top ReLU, and the ReLU in the right-hand path. The 4th Conv from the left is being denied, which is due to the very large dilations and padding in this ASPP section of the deeplabv3 head.

You might try adding the ReLU at the top here to the deny_list. You may need to try a few different deny_list configurations here to find a version that avoids this issue. There will be a performance penalty here when running that convolution and these resize nodes on the CPU.
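Trying several configurations can be organized as in the sketch below, which produces one options dictionary per candidate layer. The option key is the one named above; the layer names are placeholders derived from the subgraph file name and may not match the actual node names in your model, and `deny_list_variants` is a hypothetical helper.

```python
def deny_list_variants(base_options, candidate_layers):
    """Yield one options dict per candidate, each denying a single named layer."""
    for layer_name in candidate_layers:
        options = dict(base_options)  # copy so variants do not affect each other
        options["deny_list:layer_name"] = layer_name
        yield options


# Placeholder layer names; look up the real ones in Netron or in the compile log.
candidates = [
    "/classifier/classifier.0/convs.4/convs.4.3/Relu",
    "/classifier/classifier.0/convs.3/convs.3.2/Relu",
]

for variant in deny_list_variants({"debug_level": 2}, candidates):
    print(variant)  # compile once per variant and compare the resulting subgraphs
```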

The best solution in general will be to fix the Resize nodes. That one convolution layer will still be offloaded to the Arm CPU, though. Otherwise, let me peek into the import tool's source code to see if there is a limit on filename size.

Unrelated topic, can I ask why you chose this resnet101 backbone with deeplabv3 head? We had a deeplabv3 model with mobilenetv2 backbone that's already been optimized to run well on the SoC. That has ~35 fps for 512x512 input resolution. I am concerned resnet101 is too heavy as an encoder network.

Best regards,

Reese

0 Abhy K S over 2 years ago in reply to Reese Grimsley

Prodigy 20 points

Hi Reese,

Let me first convey my heartfelt thanks to you for the effort you are taking in supporting me.

I chose the resnet101 backbone with the deeplabv3 head because I am benchmarking the performance of this model on the TI AM62A board.

I am using the pretrained deeplabv3_resnet101 model available in the torchvision library as is, converting it to ONNX to compile and deploy to the board.

I changed the size argument to a scale factor inside the interpolate function of the architecture and obtained a new model that does not have the "sizes" input tensor inside the Resize node, and the model compiled (model artifacts attached).

But now the inference on the board does not show any segmentation (picture and logs attached for reference). Could you please help me figure out what the issue could be?

Regards,

Abhy

Attached log: deeplabv3_test50_execute_log.txt

root@am62axx-evm:/opt/edgeai-gst-apps/apps_python# ./app_edgeai.py ../configs/sem
mantic_segmentation.yaml
libtidl_onnxrt_EP loaded 0x4131b5d0
Final number of subgraphs created are : 3, - Offloaded Nodes - 257, Total Nodes - 261
APP: Init ... !!!
MEM: Init ... !!!
MEM: Initialized DMA HEAP (fd=5) !!!
MEM: Init ... Done !!!
IPC: Init ... !!!
IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
   719.406071 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
   719.408565 s:  VX_ZONE_INIT:Enabled
   719.408609 s:  VX_ZONE_ERROR:Enabled
   719.408620 s:  VX_ZONE_WARNING:Enabled
   719.410338 s:  VX_ZONE_INIT:[tivxInitLocal:130] Initialization Done !!!
   719.412146 s:  VX_ZONE_INIT:[tivxHostInitLocal:101] Initialization Done for HOST !!!
   719.684840 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
   719.684887 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
   719.684928 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
   719.684941 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
   719.684953 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
   719.684969 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
   719.684988 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
   719.685000 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
TIDL_RT_OVX: ERROR: Verifying TIDL graph ... Failed !!!
TIDL_RT_OVX: ERROR: Verify OpenVX graph failed
==========[INPUT PIPELINE(S)]==========

[PIPE-0]

v4l2src device=/dev/video-usb-cam0 brightness=133 contrast=5 saturation=83 ! capsfilter caps="image/jpeg, width=(int)1280, height=(int)720;" ! jpegdec ! tiovxdlcolorconvert ! capsfilter caps="video/x-raw, format=(string)NV12;" ! tiovxmultiscaler name=split_01
split_01. ! queue ! capsfilter caps="video/x-raw, width=(int)1280, height=(int)720;" ! tiovxdlcolorconvert out-pool-size=4 ! capsfilter caps="video/x-raw, format=(string)RGB;" ! appsink max-buffers=2 drop=True name=sen_0
split_01. ! queue ! capsfilter caps="video/x-raw, width=(int)752, height=(int)472;" ! tiovxmultiscaler target=1 ! capsfilter caps="video/x-raw, width=(int)224, height=(int)224;" ! tiovxdlpreproc out-pool-size=4 ! capsfilter caps="application/x-tensor-tiovx;" ! appsink max-buffers=2 drop=True name=pre_0


==========[OUTPUT PIPELINE]==========

appsrc do-timestamp=True format=3 block=True name=post_0 ! tiovxdlcolorconvert ! capsfilter caps="video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720;" ! queue ! mosaic_0.sink_0

tiovxmosaic target=1 background=/tmp/background_0 name=mosaic_0 src::pool-size=4
sink_0::startx="<320>" sink_0::starty="<150>" sink_0::widths="<1280>" sink_0::heights="<720>"
! capsfilter caps="video/x-raw, format=(string)NV12, width=(int)1920, height=(int)1080;" ! queue ! tiperfoverlay title=Semantic Segmentation ! kmssink sync=False max-lateness=5000000 qos=True processing-deadline=15000000 driver-name=tidss connector-id=40 plane-id=31 force-modesetting=True



 +--------------------------------------------------------------------------+

 | Semantic Segmentation                                                    |

 +--------------------------------------------------------------------------+

 +--------------------------------------------------------------------------+

 | Input Src: /dev/video-usb-cam0                                           |

 | Model Name: ONR-SS-deeplabv3_test50                                      |

 | Model Type: segmentation                                                 |

 +--------------------------------------------------------------------------+

 +--------------------------------------------------------------------------+
   721.629546 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
   721.629594 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
   721.629634 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
   721.629646 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
   721.629658 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
   721.629673 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
   721.629690 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
   721.629701 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
   721.630055 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
   721.630068 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
   721.630079 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!


 +--------------------------------------------------------------------------+

 | Semantic Segmentation                                                    |

 +--------------------------------------------------------------------------+

 +--------------------------------------------------------------------------+

 | Input Src: /dev/video-usb-cam0   :      2676.40 ms   from   1    samples |

 +--------------------------------------------------------------------------+
   724.402417 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
   724.402465 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
   724.402504 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
   724.402516 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
   724.402528 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
   724.402543 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
   724.402563 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
   724.402574 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
   724.402928 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
   724.402941 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
   724.402952 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!


 +--------------------------------------------------------------------------+

 | Semantic Segmentation                                                    |

 +--------------------------------------------------------------------------+

 +--------------------------------------------------------------------------+

 | Input Src: /dev/video-usb-cam0   :      2676.40 ms   from   1    samples |

 +--------------------------------------------------------------------------+

 | Model Type: segmentation                                                 |

 +--------------------------------------------------------------------------+

 | dl-inference







 +--------------------------------------------------------------------------+

 | Semantic Segmentation                                                    |

 +--------------------------------------------------------------------------+

 +--------------------------------------------------------------------------+

 | Input Src: /dev/video-usb-cam0   :      2676.40 ms   from   1    samples |

 +--------------------------------------------------------------------------+
   726.979021 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
   726.979067 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
   726.979108 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
   726.979120 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
   726.979132 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
   726.979147 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
   726.979167 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
   726.979178 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
   726.979539 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
   726.979553 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
   726.979563 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!


 +--------------------------------------------------------------------------+

 | Semantic Segmentation                                                    |

 +--------------------------------------------------------------------------+

 +--------------------------------------------------------------------------+

 | Input Src: /dev/video-usb-cam0   :      2603.45 ms   from   2    samples |
                                    :      2580.51 ms   from   1    samples |
 | total time: ONR-SS-deeplabv3_test:         0.39 fps  from   1    samples |
 | framerate
 +--------------------------------------------------------------------------+

 +--------------------------------------------------------------------------+

 | dl-inference












----------------------------------------------------------------------------------------------------------




root@am62axx-evm:/opt/edgeai-gst-apps/apps_python# ./app_edgeai.py ../configs/sem
mantic_segmentation.yaml -n
libtidl_onnxrt_EP loaded 0x184cdf30
Final number of subgraphs created are : 3, - Offloaded Nodes - 257, Total Nodes - 261
APP: Init ... !!!
MEM: Init ... !!!
MEM: Initialized DMA HEAP (fd=5) !!!
MEM: Init ... Done !!!
IPC: Init ... !!!
IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
   372.761089 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
   372.763673 s:  VX_ZONE_INIT:Enabled
   372.763718 s:  VX_ZONE_ERROR:Enabled
   372.763729 s:  VX_ZONE_WARNING:Enabled
   372.765317 s:  VX_ZONE_INIT:[tivxInitLocal:130] Initialization Done !!!
   372.767101 s:  VX_ZONE_INIT:[tivxHostInitLocal:101] Initialization Done for HOST !!!
   373.042379 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
   373.042425 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
   373.042466 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
   373.042478 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
   373.042491 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
   373.042507 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
   373.042558 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
   373.042571 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
TIDL_RT_OVX: ERROR: Verifying TIDL graph ... Failed !!!
TIDL_RT_OVX: ERROR: Verify OpenVX graph failed
==========[INPUT PIPELINE(S)]==========

[PIPE-0]

v4l2src device=/dev/video-usb-cam0 brightness=133 contrast=5 saturation=83 ! capsfilter caps="image/jpeg, width=(int)1280, height=(int)720;" ! jpegdec ! tiovxdlcolorconvert ! capsfilter caps="video/x-raw, format=(string)NV12;" ! tiovxmultiscaler name=split_01
split_01. ! queue ! capsfilter caps="video/x-raw, width=(int)1280, height=(int)720;" ! tiovxdlcolorconvert out-pool-size=4 ! capsfilter caps="video/x-raw, format=(string)RGB;" ! appsink max-buffers=2 drop=True name=sen_0
split_01. ! queue ! capsfilter caps="video/x-raw, width=(int)752, height=(int)472;" ! tiovxmultiscaler target=1 ! capsfilter caps="video/x-raw, width=(int)224, height=(int)224;" ! tiovxdlpreproc out-pool-size=4 ! capsfilter caps="application/x-tensor-tiovx;" ! appsink max-buffers=2 drop=True name=pre_0


==========[OUTPUT PIPELINE]==========

appsrc do-timestamp=True format=3 block=True name=post_0 ! tiovxdlcolorconvert ! capsfilter caps="video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720;" ! queue ! mosaic_0.sink_0

tiovxmosaic target=1 background=/tmp/background_0 name=mosaic_0 src::pool-size=4
sink_0::startx="<320>" sink_0::starty="<150>" sink_0::widths="<1280>" sink_0::heights="<720>"
! capsfilter caps="video/x-raw, format=(string)NV12, width=(int)1920, height=(int)1080;" ! queue ! tiperfoverlay title=Semantic Segmentation ! kmssink sync=False max-lateness=5000000 qos=True processing-deadline=15000000 driver-name=tidss connector-id=40 plane-id=31 force-modesetting=True

   374.908303 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
   374.908356 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
   374.908395 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
   374.908408 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
   374.908421 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
   374.908438 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
   374.908456 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
   374.908467 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
   374.908823 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
   374.908837 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
   374.908848 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!
   377.700137 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
   377.700185 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
   377.700226 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
   377.700239 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
   377.700252 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
   377.700267 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
   377.700290 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
   377.700302 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
   377.700658 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
   377.700672 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
   377.700683 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!
   380.273830 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
   380.273879 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
   380.273919 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
   380.273932 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
   380.273945 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
   380.273961 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
   380.273978 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
   380.273990 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
   380.274342 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
   380.274356 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
   380.274368 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!

ONR-SS-deeplabv3_test50.zip

0 Reese Grimsley over 2 years ago in reply to Abhy K S

TI__Mastermind 18426 points

Hello Abhy,

You are welcome, I am glad to help. Thank you for including the model's compiled files. These are helpful. I can reproduce the error on my side, but I am not immediately sure what the solution is. I can see that the model is not initializing correctly.

Something I found is that the visualization files (.SVG format) are not present in the artifacts directory. Could you provide the compilation logs? I wonder if a subgraph did not complete correctly. For some reason, the first subgraph says that it requires no memory allocation (which is not possible) and it fails during initialization. This seems to happen for the first subgraph, but not the second or third.
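One way to confirm whether the visualization files were generated is to list the SVG files under the artifacts folder. This is only a minimal sketch: the helper name `find_svg_files` and the example path are hypothetical and not part of edgeai-tidl-tools.

```python
from pathlib import Path


def find_svg_files(artifacts_dir):
    """Return the sorted names of .svg files found anywhere under artifacts_dir."""
    root = Path(artifacts_dir)
    if not root.is_dir():
        # A missing folder is treated the same as "no SVG files".
        return []
    return sorted(p.name for p in root.rglob("*.svg"))


# Hypothetical path; substitute your own artifacts folder.
svgs = find_svg_files("model-artifacts/ss-ort-deeplabv3")
print(svgs if svgs else "no SVG visualization files found")
```

An empty result only shows that the files are absent; it does not say why they were not produced.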

I also note that the resize nodes are given scales now (thank you!), but the values will be problematic. They are both upscaling by a factor of 28, which is not supported; only factors of 2 are. See the document here (line 15 in the table): https://github.com/TexasInstruments/edgeai-tidl-tools/blob/master/docs/supported_ops_rts_versions.md

- The second resize node is creating an odd output resolution. That one should rescale by a factor of 8 to get back to the original 224x224 (width x height). Those resize nodes may still be computed on Arm.
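The scale arithmetic can be checked with a few lines of Python. This sketch takes "factors of 2" to mean power-of-two scales, which is an assumption (the linked table is authoritative). The 28x28 feature-map size follows from 224 / 8. The 1x1 input to the ASPP resize is also an assumption, based on a global-pooling branch being upscaled by 28. Both helper names are hypothetical.

```python
def is_power_of_two(n):
    """True if n is a positive integer power of two (1, 2, 4, 8, ...)."""
    return isinstance(n, int) and n > 0 and (n & (n - 1)) == 0


def resize_scale(in_size, out_size):
    """Integer upscale factor from in_size to out_size, or None if not integral."""
    if in_size <= 0 or out_size % in_size != 0:
        return None
    return out_size // in_size


# Assumed sizes: 224x224 model input, 28x28 feature map before the final resize,
# and a 1x1 map (global pooling) feeding the resize inside the ASPP block.
final_scale = resize_scale(28, 224)
aspp_scale = resize_scale(1, 28)

print(final_scale, is_power_of_two(final_scale))  # 8 True
print(aspp_scale, is_power_of_two(aspp_scale))    # 28 False
```

Under these assumptions the final resize fits the restriction with a scale of 8, while the ASPP resize cannot be expressed as a single power-of-two scale.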

My suggestion to proceed:

- Provide me the compile log

- Modify the scales again to be powers of 2 for the resize nodes. For the one within the ASPP block, I'm not sure of the best approach, since it needs to match the others going into the concat node.

- Somewhere in the middle of the encoder (before the ASPP block I posted above that was problematic), please add another layer to the deny_list:layer_name to break up the first subgraph, since I'm seeing some issue here that I'd like to isolate.
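As an illustration of the deny-list suggestion, the sketch below builds the option entry from a list of node names. It assumes that `deny_list:layer_name` takes a comma-separated string of layer names; please check the edgeai-tidl-tools documentation for the exact format. The helper `build_deny_list_options` and the node names are hypothetical.

```python
def build_deny_list_options(layer_names):
    """Return an options fragment with 'deny_list:layer_name' as a comma-separated string."""
    cleaned = [name.strip() for name in layer_names if name and name.strip()]
    # Drop duplicates while preserving the original order.
    unique = list(dict.fromkeys(cleaned))
    return {"deny_list:layer_name": ",".join(unique)}


# Hypothetical node names, for illustration only; use names from your own ONNX graph.
options = build_deny_list_options(["encoder_mid_conv", "aspp_resize", "encoder_mid_conv"])
print(options)
```

The returned dictionary would be merged into the compile options used for the model. Each denied layer runs on Arm, so a denied layer in the middle of the graph splits the surrounding graph into separate subgraphs.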

I understand the intent with using a pretrained model from torchvision. I'd suggest looking at the models we have already enabled too, specifically mobilenetv2_deeplabv3plus-lite. This version has already been optimized to use our accelerator to the fullest extent.

Best,
Reese

0 Abhy K S over 2 years ago in reply to Reese Grimsley

Prodigy 20 points

ss-ort-deeplabv3_test52.zip ss-ort-deeplabv3_test52_compile_log.txt

deeplabv3_test52_execute_-n_log (1).txt

root@am62axx-evm:/opt/edgeai-gst-apps/apps_python# ./app_edgeai.py ../configs/semantic_segmentation.yaml -n
libtidl_onnxrt_EP loaded 0x1d222540
Final number of subgraphs created are : 3, - Offloaded Nodes - 257, Total Nodes - 261
APP: Init ... !!!
MEM: Init ... !!!
MEM: Initialized DMA HEAP (fd=5) !!!
MEM: Init ... Done !!!
IPC: Init ... !!!
IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
  1976.616219 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
  1976.618809 s:  VX_ZONE_INIT:Enabled
  1976.618854 s:  VX_ZONE_ERROR:Enabled
  1976.618866 s:  VX_ZONE_WARNING:Enabled
  1976.620504 s:  VX_ZONE_INIT:[tivxInitLocal:130] Initialization Done !!!
  1976.622302 s:  VX_ZONE_INIT:[tivxHostInitLocal:101] Initialization Done for HOST !!!
  1976.895565 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
  1976.895706 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
  1976.895721 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
  1976.895734 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
  1976.895746 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
  1976.895763 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
  1976.895782 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
  1976.895794 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
TIDL_RT_OVX: ERROR: Verifying TIDL graph ... Failed !!!
TIDL_RT_OVX: ERROR: Verify OpenVX graph failed
==========[INPUT PIPELINE(S)]==========

[PIPE-0]

v4l2src device=/dev/video-usb-cam0 brightness=133 contrast=5 saturation=83 ! capsfilter caps="image/jpeg, width=(int)1280, height=(int)720;" ! jpegdec ! tiovxdlcolorconvert ! capsfilter caps="video/x-raw, format=(string)NV12;" ! tiovxmultiscaler name=split_01
split_01. ! queue ! capsfilter caps="video/x-raw, width=(int)1280, height=(int)720;" ! tiovxdlcolorconvert out-pool-size=4 ! capsfilter caps="video/x-raw, format=(string)RGB;" ! appsink max-buffers=2 drop=True name=sen_0
split_01. ! queue ! capsfilter caps="video/x-raw, width=(int)752, height=(int)472;" ! tiovxmultiscaler target=1 ! capsfilter caps="video/x-raw, width=(int)224, height=(int)224;" ! tiovxdlpreproc out-pool-size=4 ! capsfilter caps="application/x-tensor-tiovx;" ! appsink max-buffers=2 drop=True name=pre_0


==========[OUTPUT PIPELINE]==========

appsrc do-timestamp=True format=3 block=True name=post_0 ! tiovxdlcolorconvert ! capsfilter caps="video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720;" ! queue ! mosaic_0.sink_0

tiovxmosaic target=1 background=/tmp/background_0 name=mosaic_0 src::pool-size=4
sink_0::startx="<320>" sink_0::starty="<150>" sink_0::widths="<1280>" sink_0::heights="<720>"
! capsfilter caps="video/x-raw, format=(string)NV12, width=(int)1920, height=(int)1080;" ! queue ! tiperfoverlay title=Semantic Segmentation ! kmssink sync=False max-lateness=5000000 qos=True processing-deadline=15000000 driver-name=tidss connector-id=40 plane-id=31 force-modesetting=True

  1978.843807 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
  1978.843860 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
  1978.843899 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
  1978.843911 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
  1978.843924 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
  1978.843940 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
  1978.843958 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
  1978.843970 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
  1978.844326 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
  1978.844340 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
  1978.844352 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!
  1980.608716 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
  1980.608764 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
  1980.608808 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
  1980.608821 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
  1980.608834 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
  1980.608852 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
  1980.608875 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
  1980.608889 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
  1980.609244 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
  1980.609259 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
  1980.609271 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!
  1982.255881 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
  1982.256030 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
  1982.256046 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
  1982.256059 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
  1982.256072 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
  1982.256089 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
  1982.256109 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
  1982.256122 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
  1982.256551 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
  1982.256567 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
  1982.256579 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!
  1983.884031 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
  1983.884176 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
  1983.884192 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
  1983.884205 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
  1983.884217 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
  1983.884233 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
  1983.884256 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
  1983.884268 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
  1983.884698 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
  1983.884714 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
  1983.884726 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!
  1985.516745 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
  1985.516795 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
  1985.516858 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
  1985.516872 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
  1985.516885 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
  1985.516901 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
  1985.516921 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
  1985.516934 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
  1985.517289 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
  1985.517303 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
  1985.517314 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!
  1987.138239 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
  1987.138293 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
  1987.138337 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
  1987.138351 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
  1987.138364 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
  1987.138380 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
  1987.138398 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
  1987.138411 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
  1987.138766 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
  1987.138796 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
  1987.138810 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!
  1988.769243 s:  VX_ZONE_ERROR:[ownContextSendCmd:822] Command ack message returned failure cmd_status: -1
  1988.769299 s:  VX_ZONE_ERROR:[ownContextSendCmd:862] tivxEventWait() failed.
  1988.769315 s:  VX_ZONE_ERROR:[ownNodeKernelInit:527] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
  1988.769346 s:  VX_ZONE_ERROR:[ownNodeKernelInit:528] Please be sure the target callbacks have been registered for this core
  1988.769360 s:  VX_ZONE_ERROR:[ownNodeKernelInit:529] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
  1988.769377 s:  VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:5 ... failed !!!
  1988.769394 s:  VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
  1988.769406 s:  VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
  1988.769853 s:  VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:799] graph is not in a state required to be scheduled
  1988.769900 s:  VX_ZONE_ERROR:[vxProcessGraph:734] schedule graph failed
  1988.769913 s:  VX_ZONE_ERROR:[vxProcessGraph:739] wait graph failed
ERROR: Running TIDL graph ... Failed !!!

Hi Reese,

I tried changing the scale factors, but the segmentation still did not work. I am sharing the compilation files, model artifacts, and runtime logs for your reference. Kindly help.

Regards,
Abhy

0 Reese Grimsley over 2 years ago in reply to Abhy K S

TI__Mastermind 18426 points

Hello Abhy,

Thanks for sharing the files again. The compile log is very helpful.

In the compile log, I see a few strange things. For one, there is a warning about some parameters missing in the .ONNX file for a particular layer. There are also several buffer_overflows being thrown; these look to be in the second subgraph. I notice that there is a warning for subgraph 2 that there will likely be a fault on the target because quantization did not complete. This may be the primary source of your issue. The first and last subgraphs seem to be fine. The one that fails has an output (node 1061) earlier in the model.
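For a long compile log, a small filter can pull out the lines worth reading first. This is a rough sketch: the keywords are assumptions, since the exact wording of TIDL compile messages may differ, and `flag_log_lines` is a hypothetical helper.

```python
import re
import sys

# Assumed keywords; adjust them to match the wording in your compile log.
PATTERN = re.compile(r"warning|overflow|error", re.IGNORECASE)


def flag_log_lines(lines):
    """Return (line_number, text) pairs for lines that match PATTERN."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PATTERN.search(line)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_log.py <compile_log.txt>
    with open(sys.argv[1], errors="replace") as log_file:
        for number, text in flag_log_lines(log_file):
            print(f"{number}: {text}")
```

The line numbers make it easier to see which subgraph a warning belongs to when reading the full log afterwards.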

I notice that you still include the Resize nodes in the deny_list. At least the last Resize nodes before the outputs (node 1061 and node "output") should be possible to offload, since their scales are a multiple of 2. The one in the ASPP block uses scales of 28, so this might not offload well. It is still worth trying. Could you remove the Resize nodes from the deny_list?

I'm also seeing that the visualization SVG files are not being generated. It looks like a dependency fails to load for libcgraph. Does the "graphviz" Python package import on your development PC? You may need to install a few packages through apt, such as graphviz and libgraphviz-dev. These SVG files can be helpful for debugging: they help visualize the graph parsed and compiled by TIDL, which may have some changes from the original ONNX model. This would show where the divisions in the graph are happening.
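A quick check on the development PC could look like the sketch below. It only reports whether a Python module named `graphviz` and the `dot` executable are discoverable. It does not prove that libcgraph itself loads correctly, and `graphviz_status` is a hypothetical helper name.

```python
import importlib.util
import shutil


def graphviz_status():
    """Report whether the 'graphviz' Python module and the 'dot' binary can be found."""
    return {
        "python_module": importlib.util.find_spec("graphviz") is not None,
        "dot_binary": shutil.which("dot") is not None,
    }


print(graphviz_status())
```

If either value is False, installing the apt packages mentioned above and the Python package is a reasonable first step before recompiling.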
