TDA4VL-Q1: Different inference results between PC emulation mode and board mode

Part Number: TDA4VL-Q1
Other Parts Discussed in Thread: TDA4VL

Hi guys,

I am using ti-processor-sdk-rtos-j721s2-evm-09_02_00_05.tar for my TDA4VL program,

and I use the following parameters to convert my ONNX model to a TIDL bin:

---------------------------------------

modelType = 2
numParamBits = 8
numFeatureBits = 8
quantizationStyle = 2

mixedPrecisionFactor = 1.2

……

--------------------------------------

I set mixedPrecisionFactor because I want the output layer to run in 16-bit.
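
For context, a typical full import config has roughly the shape sketched below; this is only a sketch, the paths are placeholders, and the file-path parameter names follow the standard sample import configs rather than my exact file:

---------------------------------------
# sketch only: placeholder paths, not my real settings
modelType            = 2
inputNetFile         = "/path/to/model.onnx"
outputNetFile        = "/path/to/out/tidl_net.bin"
outputParamsFile     = "/path/to/out/tidl_io_"
inData               = "/path/to/calibration_list.txt"
numParamBits         = 8
numFeatureBits       = 8
quantizationStyle    = 2
mixedPrecisionFactor = 1.2
---------------------------------------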

The model bin gives the correct result in PC emulation mode,

but when I run the model on the board,

the results are totally wrong!

When I simply remove mixedPrecisionFactor = 1.2

and get the normal 8-bit layer output,

I get the correct result on the board too,

but the model has lower precision than the 16-bit model.

By the way, using the same parameters with ti-processor-sdk-rtos-j721s2-evm-08_06_01_03.tar, I can get the correct result for the 16-bit model.

So I am not sure what is different between these two versions.

Can anyone help me fix this problem?

  • Hi,

    Thanks for your question. I have assigned this thread to our analytics expert.

    Please expect a reply from them.

    Thanks 

  • Hello,

    Summarizing the behavior you are observing below:

    - Using TIDL-RT flow with ti-processor-sdk-rtos-j721s2-evm-09_02_00_05

    - With automated mixed precision (mixedPrecisionFactor = 1.2) you get the correct result with inference in host emulation; however, after transferring the model artifacts to the EVM, you get an incorrect result

    - With 8-bit fixed-point precision, this mismatch does not occur (but it is not precise enough for your use case)

    To debug this further, can you provide the layer-level traces (generated with writeTraceLevel = 1) from both host emulation and target inference and compare them to see where the functional mismatch first occurs? The steps to do this are documented here under Steps to Debug Error Scenarios for target (EVM) execution. In addition, if you are able to provide the model and the import configuration file you are using, that will be helpful for us to look at on our end.
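
    If you are running through the standalone TIDL-RT test application, a minimal sketch of the inference config for the trace run would look something like the following; the paths are placeholders and the parameter names are taken from memory of the standard sample inference configs shipped with the SDK, so please double-check them against your own config:

    ---------------------------------------
    # sketch: placeholder paths, trace dump enabled
    netBinFile      = "/path/to/tidl_net.bin"
    ioConfigFile    = "/path/to/tidl_io_1.bin"
    inData          = "/path/to/input_list.txt"
    outData         = "/path/to/output.bin"
    debugTraceLevel = 1
    writeTraceLevel = 1
    ---------------------------------------

    Run the same config once in host emulation and once on the EVM, keep the per-layer trace files from both runs, and compare them layer by layer; the first layer at which they diverge is the one to focus on.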

    Best,

    Asha

  • Hello, I have tried your method.

    When I set the parameter "writeTraceLevel = 1" in HOST EMULATION, I get a segmentation fault at layer 7 for our model,

    inside the function "vxProcessGraph".

    It did generate 7 log files, though,

    and on the board I can get the complete set of log files with "writeTraceLevel = 1".

    But when I compare these two sets of files,

    almost all of them are different.

     

    This is my config file.

    Because our model is confidential, I cannot upload it to the website; sorry about that.

  • Hi,

     When I set the parameter "writeTraceLevel = 1" in HOST EMULATION, I get a segmentation fault at layer 7 for our model.

    I would not expect this to happen - from my understanding, without this flag you are not experiencing this segmentation fault? Can you provide the log for this?

    From the last part, it seems you are saying that every layer has a difference? Can you clarify? What is the first layer you are seeing the difference at?

    Best,

    Asha

  • I would not expect this to happen - from my understanding, without this flag you are not experiencing this segmentation fault? Can you provide the log for this?

    Reply: yes, without this flag the program runs and produces the inference result.

    Fault log:

    The crash happens when the program reaches "status = vxProcessGraph(obj->graph);",

    and these two tmp files cannot be found.

    This is the first layer's model output; the left one is the board result and the right one is the PC result.

  • Hi Asha,

    The same 16-bit model runs fine in SDK 8.6. Could you please tell me how to downgrade the TIDL library version to SDK 8.6?

    Thanks
    Regards
    quanfeng

  • We have tried using SDK 10.0 to convert this model to plain 16-bit (not the mixed-precision way) by setting the import parameters accordingly,

    and we could get the correct inference result.
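
    (For reference, a plain 16-bit import normally just means raising the bit-depth parameters to 16; a minimal sketch, not our exact config file:)

    ---------------------------------------
    # sketch: plain 16-bit, no mixedPrecisionFactor
    modelType = 2
    numParamBits = 16
    numFeatureBits = 16
    quantizationStyle = 2
    ---------------------------------------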

    But when we use mixedPrecisionFactor,

    it aborts with a core dump, as shown in this log information.

    So your SDK 10.0 may have some bugs in mixedPrecisionFactor mode;

    we suggest you fix it!

  • Hi 

    Can you try importing your model with the latest release of edgeai-tidl-tools: https://github.com/TexasInstruments/edgeai-tidl-tools

    We update this tool very frequently and many bugs are fixed in every release.

    Regards,

    Adam

  • Hi 

    And note that the latest release is 10_00_08.

  • We used 10_00_08 to convert our model and got the correct inference result.

    Thanks for your support!