
TDA4VM: How to run custom model on MMA

Part Number: TDA4VM

Hi, 

I am facing an issue while running my custom model on my board. I am using a TDA4VM-EVM with the SDK: PROCESSOR-SDK-LINUX-SK-TDA4VM_08.01.00.02 | TI.com

I am currently developing my own image classification model in TensorFlow. The model is based on Conv2D, MaxPooling2D, Dropout, Flatten and Dense layers with ReLU and softmax activations.

In order to have an environment that is compatible between my computer and my EVM, I am working with edgeai-benchmark: GitHub - TexasInstruments/edgeai-benchmark: EdgeAI Deep Neural Network Models Benchmarking

I succeeded in running my model on ARM with good accuracy (~80%, similar to the accuracy on my computer), but when I run the model on the MMA the accuracy drops to 10%. Why does this happen?

Currently I am training the model on my computer and I have tried different ways to save it (from_saved_model / from_keras_model). Is there a recommended way to save my model, and are there optional parameters to enable while saving it?
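
For reference, here is a minimal sketch of the two converter paths I compared; the tiny network below is only a placeholder, not my actual model:

import tensorflow as tf

# Placeholder model with the same kinds of layers (Conv2D, MaxPooling2D,
# Dropout, Flatten, Dense) as the real one.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Dropout(0.25),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# Path 1: convert directly from the in-memory Keras model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Path 2: export a SavedModel directory first, then convert from disk.
model.save('saved_model_dir')
converter = tf.lite.TFLiteConverter.from_saved_model('saved_model_dir')
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)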

Best regards,

Xavier

  • Hi Xavier,

    If you are using TensorFlow to train the model, you can convert it to the TFLite format.

    edgeai-tidl-tools and edgeai-benchmark understand the TFLite and ONNX formats (and a few others).

    Can you describe the 10% accuracy scenario in more detail - how did you measure the accuracy? When you use edgeai-benchmark for model compilation, it reports accuracy after compiling the model, on the PC itself. What accuracy did it report?

  • Hi Manu, 

    Thank you for your reply.

    Sure - I measure the accuracy of my model using sklearn.metrics.classification_report and scikitplot.metrics.confusion_matrix.
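
    For reference, the measurement is essentially the following (the label and probability arrays below are random placeholders standing in for the real test labels and the softmax outputs collected from the model):

    import numpy as np
    from sklearn.metrics import classification_report, confusion_matrix

    # Placeholders: ground-truth CIFAR-10 labels and per-image softmax outputs.
    y_test = np.random.randint(0, 10, size=1000)
    probabilities = np.random.rand(1000, 10)

    y_pred = np.argmax(probabilities, axis=1)   # predicted class per image
    print(classification_report(y_test, y_pred))
    print(confusion_matrix(y_test, y_pred))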

    Here you can see the results I obtained using the model on PC:

    Here you can see the results I obtained using the model on ARM:

    Here you can see the results I obtained using the model on MMA:

    I saved my model this way in order to generate a .tflite file:

    I compiled the artifacts this way:

    and loaded them into the interpreter this way:
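
    (Roughly along these lines - the delegate library name, option key and paths below are placeholders following the edgeai-tidl-tools examples, not my exact script:)

    import tflite_runtime.interpreter as tflite

    # TIDL delegate pointing at the folder with the compiled artifacts.
    tidl_delegate = tflite.load_delegate(
        'libtidl_tfl_delegate.so',
        {'artifacts_folder': './model-artifacts/my_model'})

    interpreter = tflite.Interpreter(
        model_path='model.tflite',
        experimental_delegates=[tidl_delegate])
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()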

    Best regards,

    Xavier

  • There are a couple of ways to debug this situation:

    Step 1: Disable TIDL offload and run the model in ARM-only mode. This can be achieved by having only 'CPUExecutionProvider' in the ep_list that you use while creating the onnxruntime.InferenceSession. Is this what you meant by the ARM-only mode that you described above?

    Step 2: Run with TIDL offload in float mode. You can do this by setting tensor_bits to 32 (instead of 8). This will run TIDL in float mode (it works only on PC) - it is a simple way to exercise the TIDL flow and make sure there are no obvious bugs.

    If this gives good accuracy, but setting tensor_bits to 8 gives poor accuracy, then it is probably due to quantization error. You can use calibration images from your dataset (50 images are recommended for good accuracy) and also use a good number of calibration iterations (50 recommended). A rough sketch of both steps is given below.
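
    As a rough sketch of both steps (the setting names follow the edgeai-benchmark settings file and the file names are placeholders, so adjust them to your setup):

    import onnxruntime as ort

    # Step 1 (ONNX models): keep only the CPU provider so nothing is offloaded
    # to TIDL/MMA. For a TFLite model, the equivalent is creating the
    # Interpreter without the TIDL delegate.
    sess = ort.InferenceSession('model.onnx', providers=['CPUExecutionProvider'])

    # Step 2: in the edgeai-benchmark settings (e.g. settings_base.yaml),
    # compile with TIDL in float mode first, then with calibration for 8-bit:
    #   tensor_bits: 32              # float mode, PC only
    #   tensor_bits: 8               # quantized mode
    #   calibration_frames: 50       # ~50 calibration images recommended
    #   calibration_iterations: 50   # ~50 iterations recommended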

  • Have you tried this? Is the issue resolved?

  • Hi Manu,

    I've tried your suggestions but nothing changes in my MMA predictions: the same confusion matrix and an equivalent accuracy show up.

    I aim to run my model on the MMA, not on ARM; I only ran it on ARM to check that everything was OK on the EVM.

    I am using the CIFAR-10 dataset to train and test my model. Could this error come from my preprocessing? I set my parameters as follows (how they are applied is sketched after the list):

    size = [32, 32]

    mean = [110.024414, 110.76758, 104.36621]

    scale = [0.00908889185, 0.00902791232, 0.00958164525]

    layout = 'NHWC'

    reverse_channels = False
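
    For reference, this is how I expect these parameters to be applied to each CIFAR-10 image before inference (assuming the model takes a float NHWC input; the image below is a random placeholder):

    import numpy as np

    img = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)  # placeholder 32x32x3 image

    mean = np.array([110.024414, 110.76758, 104.36621], dtype=np.float32)
    scale = np.array([0.00908889185, 0.00902791232, 0.00958164525], dtype=np.float32)

    x = (img.astype(np.float32) - mean) * scale   # per-channel normalization
    x = np.expand_dims(x, axis=0)                 # NHWC batch of 1: (1, 32, 32, 3)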

    And should I use converter attributes such as:

    converter.target_spec.experimental_select_user_tf_ops = ['']
    converter.target_spec.supported_ops = [
        tf.lite.OpsSet.TFLITE_BUILTINS,
        tf.lite.OpsSet.SELECT_TF_OPS
    ]
    converter.allow_custom_ops = True

    Best regards,

    Xavier

  • Hi Xavier,

    Are you saying that the accuracy is still poor when you set tensor_bits to 32? That is just float operation, so quantization error doesn't come into play. The reason for this accuracy degradation is therefore not quantization but something else.

    Is it possible to share the tflite model?

  • Hi Manu,

    Yes, exactly - the accuracy remains poor. Here is the tflite file used for these results:

    converted_model_2D_W9_keras.zip

    Best regards, 

    Xavier

  • The model is rather simple and I couldn't find anything special in it that would cause an issue. We have validated several models with these kinds of layers. I have forwarded it to my colleagues - let me see what they have to say.

  • My colleague confirmed that there is indeed an issue with the model - we need to investigate further to try and fix it.

  • Hi Xavier,

    We are investigating this further. JIRA id (for internal reference) : https://jira.itg.ti.com/browse/TIDL-2515

    Regards,

    Anand 

  • Hi,

    I succeeded in solving this issue on the MMA by replacing the Flatten layer with a GlobalAveragePooling layer, as sketched below. Please let me know if you have more information about the Flatten layer and why this error occurs; I am really interested in using the Flatten layer in my next models.
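
    For reference, the change was essentially the following (the convolutional part is a placeholder, not my actual network):

    import tensorflow as tf

    inputs = tf.keras.Input(shape=(32, 32, 3))
    x = tf.keras.layers.Conv2D(32, 3, activation='relu')(inputs)
    x = tf.keras.layers.MaxPooling2D()(x)

    # Original head (gave ~10% accuracy on the MMA with this SDK):
    # x = tf.keras.layers.Flatten()(x)

    # Workaround that runs correctly on the MMA:
    x = tf.keras.layers.GlobalAveragePooling2D()(x)

    outputs = tf.keras.layers.Dense(10, activation='softmax')(x)
    model = tf.keras.Model(inputs, outputs)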

    Best Regards, 

    Xavier

  • Hi Xavier, thanks for sharing the information. I will check this issue and get back with an update.

    Regards,

    Anand

  • Hi Xavier,

    An update on this thread: currently the TIDL Flatten layer is supported only in the NCHW layout format. The model you are referring to (TFLite) does it in the NHWC layout format, which is not supported by TIDL and hence needs to be delegated to ARM instead of the DSP. This will be resolved as part of the upcoming SDK 8.5 release.
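
    As a small standalone illustration of why the layout matters for Flatten (plain numpy, not TIDL code): the same tensor flattened in NHWC order and in NCHW order gives differently ordered vectors, so a Flatten implemented for NCHW cannot simply be reused for an NHWC (TFLite) model.

    import numpy as np

    x_nhwc = np.arange(1 * 2 * 2 * 3).reshape(1, 2, 2, 3)   # NHWC tensor
    x_nchw = np.transpose(x_nhwc, (0, 3, 1, 2))             # same data in NCHW layout

    print(x_nhwc.reshape(1, -1))   # NHWC flatten: channel index varies fastest
    print(x_nchw.reshape(1, -1))   # NCHW flatten: spatial indices vary fastest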

    Please let me know if there are any further questions.

    Regards,

    Anand