TDA4VM: TIDL faulty asymmetric resize implementation

Part Number: TDA4VM

Hi Everyone,

While debugging our custom NN import to TIDL we experienced faulty TIDL execution for asymmetric resize layers. We ran the model and dumped the tensor values as float from our own ONNX PC execution, and compared them with the TIDL PC emulation in float mode for the same input. We calculated the differences from the reference tensor values and found that asymmetric resize layers produce wrong outputs in TIDL.

When running the PC float emulation, the asymmetric resize is executed in a symmetric way, and the asymmetric component is replaced with zero padding. In the picture below you can see the output of the first asymmetric resize layer in our network (the asymmetry is 2:1 in width). The top tensor is the correct output taken from our ONNX run, the middle tensor is the TIDL execution output, and the bottom tensor is the difference plotted as an image. We think this is a bug in the asymmetric resize implementation in TIDL.
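
For reference, the comparison is done roughly as follows (a minimal sketch; the file names, shape and dtype are illustrative, not our exact dump layout):

import numpy as np

# Layer output dumped from our own ONNX PC execution (reference) and from the
# TIDL PC emulation in float mode, both stored as raw float32 binaries.
shape = (1, 64, 128, 256)                      # N, C, H, W of the layer under test
ref  = np.fromfile("onnx_layer_out.bin", dtype=np.float32).reshape(shape)
tidl = np.fromfile("tidl_layer_out.bin", dtype=np.float32).reshape(shape)

diff = tidl - ref
print("max abs diff :", np.abs(diff).max())
print("mean abs diff:", np.abs(diff).mean())

# One channel of the difference can then be saved and plotted as an image
# for visual inspection.
np.save("diff_channel0.npy", diff[0, 0])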

Thanks,
Marton

Importer environment: 
Ubuntu 18.04 LTS 
SDK: 08_00_00_12 

  • Hi,

    When you say asymmetric, does it mean that only one dimension is getting expanded? I guess in your case it is a 2x expansion in the width direction and no expansion in the height direction. If so, this kind of asymmetric resize is currently not tested/targeted; however, we are looking into this and trying to reproduce the issue here.

    If possible, please point us to a network where this kind of resize is used; this will help in testing.

    Regards

    Deepak Poddar

  • Hi Deepak,

    I can't share our whole network, but I created a small one-layer model for you that reproduces the issue. The problem happens in asymmetric downsample cases. It looks to me like half of the tensor is filled continuously with the middle row/column values, producing incorrect output.

    The example network takes a 1x32x32 input and downscales it at a 2:1 ratio in one dimension. I attached the reference output from ONNX execution with nnef_tools and from the TIDL PC emulation below. I also attached the network and the import/inference configuration in a zip file.
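
    A reproducer of this kind can be generated with a short script along the following lines (a sketch only; the tensor names, opset version and NCHW layout are illustrative and may differ from the files in the attached zip):

    import onnx
    from onnx import helper, TensorProto

    # 1x32x32 input, treated here as NCHW 1x1x32x32; 2:1 downscale in height only.
    inp = helper.make_tensor_value_info("input", TensorProto.FLOAT, [1, 1, 32, 32])
    out = helper.make_tensor_value_info("output", TensorProto.FLOAT, [1, 1, 16, 32])

    roi    = helper.make_tensor("roi", TensorProto.FLOAT, [0], [])
    scales = helper.make_tensor("scales", TensorProto.FLOAT, [4], [1.0, 1.0, 0.5, 1.0])

    resize = helper.make_node("Resize", ["input", "roi", "scales"], ["output"],
                              name="Resize0", mode="linear",
                              coordinate_transformation_mode="half_pixel")

    graph = helper.make_graph([resize], "downsample_h", [inp], [out],
                              initializer=[roi, scales])
    model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
    onnx.checker.check_model(model)
    onnx.save(model, "downsample_h.onnx")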

    Input tensor:

    Output of nnef_tools:

    Output of TIDL execution:

    The same happens in width direction as well:

    I would also like to point out that the TIDL model importer only warns that non-symmetric resize is "not optimal", which is quite misleading given that such layers are not supported/validated. This message suggests that they should work, just slowly, which is clearly not the case.

    ****************************************************
    **               TIDL Model Checker               **
    ****************************************************
    SUGGESTION: [TIDL_ResizeLayer] Resize0 Resize Layer with non-symmetric resize ratio across width and height is not optimal.
    

    We hope you can provide a fix for us now that the issue is identified.

    Thanks,
    Marton

    downsample_h.zip

    Regarding asymmetric resize: since TIDL-RT does not natively handle it, we suggest using the open-source runtime interfaces (TFLite RT or ONNX RT), which allow the asymmetric resize layer to run on the ARM core.

    For more details on TFLite RT or ONNX RT usage, please refer to the link below:

    https://github.com/TexasInstruments/edgeai-tidl-tools
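
    As a rough illustration, and assuming the flow shown in the edgeai-tidl-tools examples (the provider name, option keys and paths below follow those examples and may differ between SDK versions), running an ONNX model through ONNX RT with the TIDL execution provider looks roughly like this; layers TIDL does not support fall back to the CPU execution provider on ARM:

    import numpy as np
    import onnxruntime as rt

    # Options for the TIDL delegate; paths are placeholders for the artifacts
    # generated during model compilation.
    delegate_options = {
        "tidl_tools_path": "/path/to/tidl_tools",
        "artifacts_folder": "/path/to/compiled_artifacts",
    }

    sess = rt.InferenceSession(
        "model.onnx",
        providers=["TIDLExecutionProvider", "CPUExecutionProvider"],
        provider_options=[delegate_options, {}],
    )

    x = np.random.rand(1, 1, 32, 32).astype(np.float32)   # example input
    outputs = sess.run(None, {sess.get_inputs()[0].name: x})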

  • Hello Karthik,

    Unfortunately it is not an option for us to offload some of the neural network operations to the ARM CPU.

    The ARM CPU executes our ADAS application, so we cannot spare ARM CPU load for NN execution. We use the NN accelerators to offload the compute-heavy calculations from the CPU in the first place.

    Moreover, there are multiple resize layers inside our network, so that would mean splitting the network into multiple subgraphs and causing back-and-forth communication/data exchange between the C7x and the CPU, which is quite sub-optimal.

    Best regards,

    Gábor

  • Hi Karthik,

    I have been experimenting with the edgeai tools for some time now. Unfortunately, I have not managed to complete a model import with them. Our network takes 6 input tensors as raw binary data, which are the preprocessed YUV input data planes. The Python environment seems to only work with image inputs: I get numpy errors in several places and did not manage to fix them all.
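
    What I am trying to do is essentially the following (a minimal sketch; the file naming, dtype and shapes are placeholders for our actual preprocessing output, and the session setup shown is the plain CPU one):

    import numpy as np
    import onnxruntime as rt

    sess = rt.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

    # Feed each of the six preprocessed YUV planes from a raw binary file whose
    # name matches the model input name, instead of going through the image
    # loading path of the example scripts.
    feeds = {}
    for inp in sess.get_inputs():
        shape = [d if isinstance(d, int) else 1 for d in inp.shape]   # resolve dynamic dims
        data = np.fromfile(inp.name + ".bin", dtype=np.float32)
        feeds[inp.name] = data.reshape(shape)

    outputs = sess.run(None, feeds)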

    My other concern is that our application runs the OpenVX graph in pipelined mode to hit the performance target. I fear that with the layer offloading and the splitting of the NN graph into multiple subgraphs, the execution time won't be sufficient for our end application. I was able to run the runtime visualization portion of the edgeai tools, and the produced graph has 12 groupings, which seems a lot compared to the examples provided.

    Thanks,
    Marton