TDA4VL-Q1: Custom semantic segmentation ONNX model using channel-first (NCHW) ordering errors out with onnxrt_ep.py.

Part Number: TDA4VL-Q1

I have a custom ONNX semantic segmentation model that does not end with an 'argmax' operator. It outputs a channel-first (NCHW) tensor of shape 1x4x576x960. When I run it through the edgeai-tidl-tools onnxrt_ep.py tool, the tool fails because the TI code assumes channel-last (NHWC) ordering:

root@088f260a8b34:/home/root/shared_with_docker/edgeai-tidl-tools/examples/osrt_python/ort# python3 ./onnxrt_ep.py -d -m ss-ort-800k-model-f1
Available execution providers :  ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']

Running 1 Models - ['ss-ort-800k-model-f1']


Running_Model :  ss-ort-800k-model-f1

Batch: 1
Debug:
(4, 576, 960)
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/root/shared_with_docker/edgeai-tidl-tools/examples/osrt_python/ort/./onnxrt_ep.py", line 267, in run_model
    classes, image = seg_mask_overlay(output[0][j],imgs[j])
  File "/home/root/shared_with_docker/edgeai-tidl-tools/examples/osrt_python/common_utils.py", line 291, in seg_mask_overlay
    org_image[:,:, 1] = mask_image[:,:, 1]
ValueError: could not broadcast input array from shape (4,576) into shape (576,960)
^CTraceback (most recent call last):
  File "/home/root/shared_with_docker/edgeai-tidl-tools/examples/osrt_python/ort/./onnxrt_ep.py", line 329, in <module>
    nthreads = join_one(nthreads)
  File "/home/root/shared_with_docker/edgeai-tidl-tools/examples/osrt_python/ort/./onnxrt_ep.py", line 311, in join_one
    sem.acquire()
KeyboardInterrupt

It turns out that the seg_mask_overlay() function in common_utils.py applies argmax() when the squeezed output has ndim > 2. It assumes the class axis is index 2 (axis=2), whereas for my channel-first output it should be index 0 (axis=0). With a (4, 576, 960) array, argmax over axis 2 produces a (4, 576) mask, which matches the shape reported in the ValueError.
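The axis mismatch can be reproduced outside the TI tools with a few lines of NumPy. This is only a minimal sketch: the random array is an assumed stand-in for the squeezed model output, and the variable names are illustrative rather than taken from common_utils.py.

```python
import numpy as np

# Stand-in for the squeezed model output (assumption: random scores, not real data).
# Layout is channel-first: 4 classes, 576 rows, 960 columns.
chw_scores = np.random.rand(4, 576, 960).astype(np.float32)

# argmax over axis 2 collapses the width axis, which is wrong for channel-first data.
wrong_mask = np.argmax(chw_scores, axis=2)
print(wrong_mask.shape)  # (4, 576)

# argmax over axis 0 collapses the class axis and gives one class id per pixel.
right_mask = np.argmax(chw_scores, axis=0)
print(right_mask.shape)  # (576, 960)
```

The (4, 576) result is consistent with the "could not broadcast input array from shape (4,576) into shape (576,960)" message in the traceback.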

What's the best way to address this? I can hand-modify common_utils.py, but shouldn't there be a parameter in model_configs.py that lets a model declare its channel ordering? That way, seg_mask_overlay() would not have to make an assumption (which is incorrect in this use case).
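To make the request concrete, here is a rough sketch of what a layout-aware step could look like. Everything in it is hypothetical: to_class_mask() and the channel_order argument do not exist in edgeai-tidl-tools as far as I know, and the "NCHW"/"NHWC" strings are just an assumed convention for a value that a model's entry in model_configs.py could supply.

```python
import numpy as np

def to_class_mask(scores, channel_order="NHWC"):
    """Hypothetical helper (not part of edgeai-tidl-tools).

    Turns a raw segmentation output into a 2-D map of class ids,
    picking the argmax axis from the declared channel ordering.
    """
    # Drop the batch axis (and any other size-1 axes), as the overlay code does.
    scores = np.squeeze(scores)
    if scores.ndim <= 2:
        # The model already applied argmax; nothing to do.
        return scores
    if channel_order == "NCHW":
        class_axis = 0   # squeezed shape is (classes, height, width)
    elif channel_order == "NHWC":
        class_axis = -1  # squeezed shape is (height, width, classes)
    else:
        raise ValueError(f"unsupported channel_order: {channel_order!r}")
    return np.argmax(scores, axis=class_axis)

# Usage with dummy data shaped like the output described above.
nchw_out = np.zeros((1, 4, 576, 960), dtype=np.float32)
print(to_class_mask(nchw_out, channel_order="NCHW").shape)  # (576, 960)
```

Defaulting to "NHWC" in this sketch would keep the current behavior for existing models, so only channel-first models would need to set the extra value.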