
AM69A: Getting float model not int8

Part Number: AM69A


Hi

I am trying a custom instance segmentation model. I have exported it to ONNX and was able to compile it successfully with the edgeai-tidl tools, version 09_02_07.

After compiling, I ran it on the host machine inside the TIDL docker, both with and without TIDL offload, and it ran fine every time; I can visualize the results too.

Then I tried the same model on the AM69A board using the compiled model artifacts and found that I am not getting any detections. When I dug in further, I found this in the preprocessing:

def infer_image(sess, image_files, config):
  # Query the input tensor description from the ONNX Runtime session
  input_details = sess.get_inputs()
  input_name = input_details[0].name
  floating_model = (input_details[0].type == 'tensor(float)')
  # NCHW layout: shape = [batch, channel, height, width]
  height = input_details[0].shape[2]
  width  = input_details[0].shape[3]
  channel = input_details[0].shape[1]
  batch  = input_details[0].shape[0]
  print("height, width, channel, batch, floating_model: ", height, width, channel, batch, floating_model)
  # height, width, channel, batch, floating_model:  550 550 3 1 True
  

The model is showing up as a floating-point model, not int8, even though I set tensor_bits=8 while compiling. Not sure if I am missing anything else. I am attaching the model here too. It is trained on the MS COCO dataset.

My settings in common_utils.py

tensor_bits = 8
debug_level = 0
max_num_subgraphs = 16
accuracy_level = 1
calibration_frames = 2
calibration_iterations = 5
output_feature_16bit_names_list = ""#"conv1_2, fire9/concat_1"
params_16bit_names_list = "" #"fire3/squeeze1x1_2"
mixed_precision_factor = -1
quantization_scale_type = 0
high_resolution_optimization = 0
pre_batchnorm_fold = 1
inference_mode = 0
num_cores = 1
ti_internal_nc_flag = 1601
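
For reference, these values get passed to the compile session roughly like this (a minimal sketch following the edgeai-tidl-tools ONNX example; the exact option keys and the artifacts path are my approximation, not verified against this tool version):

import os
import onnxruntime as rt

# Sketch only: option keys follow the edgeai-tidl-tools ONNX runtime
# example and may differ slightly in this tool version.
compile_options = {
    'tidl_tools_path': os.environ.get('TIDL_TOOLS_PATH', ''),
    'artifacts_folder': './model-artifacts/yolact',  # hypothetical output path
    'tensor_bits': 8,
    'accuracy_level': 1,
    'debug_level': 0,
    'advanced_options:calibration_frames': 2,
    'advanced_options:calibration_iterations': 5,
}

sess = rt.InferenceSession(
    'yolact_resnet18_54_400000_v2.onnx',
    providers=['TIDLCompilationProvider', 'CPUExecutionProvider'],
    provider_options=[compile_options, {}],
    sess_options=rt.SessionOptions(),
)
# Feeding calibration_frames preprocessed inputs through sess.run(...)
# produces the quantized artifacts in artifacts_folder.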

Model - 8182.yolact_resnet18_54_400000_v2.zip

Thanks

Akhilesh

  • Hi Akhilesh,

    First of all, I would highly recommend trying out the latest import tools, version 9.2.9.0, for this experiment.

    The model is showing up as a floating-point model, not int8, even though I set tensor_bits=8 while compiling. Not sure if I am missing anything else. I am attaching the model here too. It is trained on the MS COCO dataset.

    In the shared folder I don't see the SVG file attached. It is worth checking it to understand the quantization style. Can you please attach the SVG files so I can take a look?

    Then I tried the same model on the AM69A board using the compiled model artifacts and found that I am not getting any detections. When I dug in further, I found this in the preprocessing:

    Also, can you explain what exactly you wanted to convey with the above code snippet? It seems like the first layer of the model takes float values; ideally there should be a data-convert layer following it (since the SVG is not shared, I am limited in what I can check).

  • Hi Pratik,

    I am attaching the compiled model too. 

    Thanks

    4857.yolact_resnet18_54_400000_v2.zip

  • Akhilesh,

    Due to limited bandwidth, I will not be able to look into this issue at the moment.

    I will try to prioritize it this coming week.

    Thanks 

  • Hi Pratik, any update on this?

    Thanks

  • Hi Akhilesh,

    I have checked the artifacts folder, and from the SVG file I can confirm that the model is quantized to 8 bit. So I am wondering how you came to the conclusion that it is not quantized.

    As a matter of fact, the input tensor of the quantized model will be float32, since we have data-convert layers.
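
    As a quick illustration (a rough sketch only; the artifacts path and runtime option keys are my assumption, based on the usual edgeai-tidl-tools inference example), you can see the same thing directly on the compiled artifacts:

    import onnxruntime as rt

    # Sketch: runtime option keys and paths are assumptions, adjust to your setup.
    runtime_options = {
        'artifacts_folder': './model-artifacts/yolact',  # hypothetical artifacts path
    }
    sess = rt.InferenceSession(
        'yolact_resnet18_54_400000_v2.onnx',
        providers=['TIDLExecutionProvider', 'CPUExecutionProvider'],
        provider_options=[runtime_options, {}],
    )
    # The data-convert layer at the subgraph boundary accepts float32, so this
    # prints 'tensor(float)' even though the layers run in 8-bit fixed point.
    # Dropping 'TIDLExecutionProvider' from the providers list gives the
    # pure-CPU (non-offloaded) run for comparison.
    print(sess.get_inputs()[0].type)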

    Can you share the debug_level 2 logs on target? Also, please make sure you run vision_apps_init.sh before invoking the inference session.

    Thanks.

  • The model is showing up as a floating-point model, not int8

    I can check again and will let you know. But please refer to my code snippet above.

    Thanks

    Akhilesh

  • Taking a glance at the infer_image method, it seems like you are passing image_files (presumably numpy arrays) as input,

    floating_model = (input_details[0].type == 'tensor(float)')

    The above line tells me that the numpy array you are passing is float, and I think the first layer of the model expects the same, followed by a data convert to quantized, with the remaining operations carried out in fixed-point arithmetic.

    Let me know if I am missing anything here; with the shared code snippets, this is my understanding so far.

    Thanks 

  • Hi Pratik,

    I have not passed the image yet; that is just the filename. Also, if you look at this:

      input_details = sess.get_inputs()
      input_name = input_details[0].name
      floating_model = (input_details[0].type == 'tensor(float)')
      height = input_details[0].shape[2]
      width  = input_details[0].shape[3]
      channel = input_details[0].shape[1]
      batch  = input_details[0].shape[0]
      print("height, width, channel, batch, floating_model: ", height, width, channel, batch, floating_model)

    I am checking from input_details. There is no image file involved, so the floating_model flag comes purely from the session. My question is: why is this flag reporting float32 when I compiled the model for int8?

  • As explained,

    floating_model = (input_details[0].type == 'tensor(float)')

    The above line tells me that the numpy array you are passing is float, and I think the first layer of the model expects the same, followed by a data convert to quantized, with the remaining operations carried out in fixed-point arithmetic.

    I think this still holds true for the shared model.

    To verify the model is executing in fixed point, you can enable the debug_level 3 traces; you should see .y files (fixed-point layer outputs) along with .bin files (dequantized layer-level outputs). Can you try doing this?
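
    A quick way to check the dump afterwards (a sketch; the trace directory below is hypothetical, point it at wherever the traces land on your setup):

    import glob, os

    trace_dir = '/opt/edgeai-traces'  # hypothetical: directory where the debug_level 3 traces are written
    y_files   = glob.glob(os.path.join(trace_dir, '*.y'))    # fixed-point layer outputs
    bin_files = glob.glob(os.path.join(trace_dir, '*.bin'))  # dequantized (float) layer outputs

    # Seeing .y dumps alongside the .bin files indicates the layers ran in the
    # fixed-point flow.
    print(len(y_files), '.y files,', len(bin_files), '.bin files')
    for f in sorted(y_files)[:10]:
        print(os.path.basename(f))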

  • Hi Pratik,

    I tried setting debug_level 3 and the device hung at this point:

    Compute on node : TIDLExecutionProvider_TIDL_0_0
    ************ in TIDL_subgraphRtCreate ************
     APP: Init ... !!!
    MEM: Init ... !!!
    MEM: Initialized DMA HEAP (fd=5) !!!
    MEM: Init ... Done !!!
    IPC: Init ... !!!
    IPC: Init ... Done !!!
    REMOTE_SERVICE: Init ... !!!
    REMOTE_SERVICE: Init ... Done !!!
    172889.301517 s: GTC Frequency = 200 MHz
    APP: Init ... Done !!!
    172889.301583 s:  VX_ZONE_INIT:Enabled
    172889.301596 s:  VX_ZONE_ERROR:Enabled
    172889.301605 s:  VX_ZONE_WARNING:Enabled
    172889.302150 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:116] Added target MPU-0
    172889.302269 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:116] Added target MPU-1
    172889.302353 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:116] Added target MPU-2
    172889.302456 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:116] Added target MPU-3
    172889.302470 s:  VX_ZONE_INIT:[tivxInitLocal:136] Initialization Done !!!
    172889.302949 s:  VX_ZONE_INIT:[tivxHostInitLocal:101] Initialization Done for HOST !!!
    

    Even after restarting and setting the level back to 2, it is not helping. I had to reflash the SD card. Is there anything I can do to avoid this error?

  • This is strange.

    Can you please reflash the SD card and run vision_apps_init.sh, followed by the model inference?

    Secondly,

    We can also check the debug_level 2 or 3 logs on the host; typically, .y files mean the dump is in fixed point and the layer executed in the fixed-point flow.