TDA4VM: Inference with Custom Artifacts Kills Kernel in Edge AI Studio

Part Number: TDA4VM

I'm attempting to run calibration inference on the pre-trained YOLOX human pose estimation model from the Model Zoo and then write a custom post-processing function that adds the keypoint and skeleton drawing for judging the pose itself.

I'm using the Edge AI Cloud tool, following the custom ONNX model example but using the Human Pose Estimation notebook as a reference. However, after I run the calibration inference, I can't seem to run inference again with the custom artifacts generated by the calibration run. Even in a totally new notebook, whenever I point to my custom artifacts in the rt.InferenceSession() call, the kernel dies every time I run it.
I'm not sure if I just don't have the directory structure set up right, or if I missed something in the backend. I've attached the Python code from the notebook I created in the Edge AI Cloud tool, along with my log files. The error log just shows a double free or corruption error, but I couldn't figure out the specific issue causing it.
#!/usr/bin/env python
# coding: utf-8
# In[1]:
import os
import re
import sys
import cv2
import tqdm
import onnx
import math
import copy
import shutil
import platform
import itertools
import numpy as np
import onnxruntime as rt
import ipywidgets as widgets
 
custon-model-onnx_err.log
double free or corruption (!prev)
2821.custon-model-onnx_out.log
  • Hi Knitter, probably what is missing is to add "meta_arch_type" and "meta_layers_names_list" to "compile_options". Example below:

    # stdout and stderr saved to a *.log file.
    with loggerWritter("logs/custon-model-onnx"):
        compile_options = {
            'tidl_tools_path' : os.environ['TIDL_TOOLS_PATH'],
            'artifacts_folder' : output_dir,
            'tensor_bits' : 8,
            'accuracy_level' : 1,
            'advanced_options:calibration_frames' : len(calib_images),
            'advanced_options:calibration_iterations' : 3, # used if accuracy_level = 1
            'object_detection:meta_arch_type': 6,
            'object_detection:meta_layers_names_list': f'CustomModels/Yolop/yolop_640_ti_lite_metaarch.prototxt',
        }

    You can find *.prototxt files inside model-zoo. Example path: /home/root/notebooks/model-zoo/modelartifacts/8bits/kd-7060_onnxrt_coco_edgeai-yolox_yolox_s_pose_ti_lite_49p5_78p0_onnx/model/yolox_s_pose_ti_lite_metaarch.prototxt

    Also, you can comment out (or delete) 'deny_list' : "MaxPool". That option was added only as an example of creating subgraphs, as a possible debugging tool for denying layers which could have an issue.
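    As a reference, here is a minimal sketch of how compile_options is typically passed to onnxruntime for the calibration/compilation run in the custom-model-onnx flow (onnx_model_path, output_dir, calib_images and the preprocess() helper are placeholders, not taken from your attached notebook):

    import os
    import onnxruntime as rt

    # make sure the folder referenced by 'artifacts_folder' exists
    os.makedirs(output_dir, exist_ok=True)

    so = rt.SessionOptions()
    EP_list = ['TIDLCompilationProvider', 'CPUExecutionProvider']

    # the compilation provider reads compile_options and writes the artifacts into
    # 'artifacts_folder' while the calibration frames are pushed through the model
    sess = rt.InferenceSession(onnx_model_path, providers=EP_list,
                               provider_options=[compile_options, {}], sess_options=so)

    input_name = sess.get_inputs()[0].name
    for image_path in calib_images:
        input_data = preprocess(image_path)  # hypothetical helper returning an NCHW float32 array
        sess.run(None, {input_name: input_data})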

    thank you,

    Paula

  • Hi Paula!

    Thanks for your reply! I updated my compile options to:

        compile_options = {
            'tidl_tools_path' : os.environ['TIDL_TOOLS_PATH'],
            'artifacts_folder' : output_dir,
            'tensor_bits' : num_bits,
            'accuracy_level' : accuracy,
            'advanced_options:calibration_frames' : len(calib_images),
            'advanced_options:calibration_iterations' : 3, # used if accuracy_level = 1 
            'object_detection:meta_arch_type': 6,
            'object_detection:meta_layers_names_list': f'/home/root/notebooks/prebuilt-models/8bits/kd-7060_onnxrt_coco_edgeai-yolox_yolox_s_pose_ti_lite_49p5_78p0_onnx/model/yolox_s_pose_ti_lite_metaarch.prototxt'
        }

    The kernel still dies after I try to run inference with my custom artifacts. Should I also be adding the "meta_arch_type" and "meta_layers_names_list" options to the delegate options I pass for the inference run with my custom artifacts?:

    delegate_options = {
        'artifacts_folder': './custom-artifacts/onnx/yolox_s_pose_ti_lite_49p5_78p0.onnx'
    }

    so0 = rt.SessionOptions()
    EP_list = ['TIDLCompilationProvider','CPUExecutionProvider']

    sess0 = rt.InferenceSession(onnx_model_path_EdgeAIcloud, providers=EP_list, provider_options=[delegate_options, {}], sess_options=so0)

  • Hi Whitney, yes, please use the same compilation options as delegate options. Let me share with you a notebook example for yolox_s_lite that I have, in case it helps.
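
    As a minimal sketch, "same options" could look like this (assuming the compile_options dictionary from the compilation step is still in scope, and output_dir is the artifacts folder written during calibration):

    # reuse the compilation options as the delegate options for the inference run,
    # with 'artifacts_folder' pointing at the generated artifacts directory
    # (a folder, not the .onnx model file itself)
    delegate_options = dict(compile_options)
    delegate_options['artifacts_folder'] = output_dir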

    thank you,

    Paula

    custom-model-onnx-yolovx.ipynb

  • Also, for "yolox_s_pose_ti_lite" model I see we use mix precision to improve accuracy. Example below

    'advanced_options:output_feature_16bit_names_list': '513, 758, 883, 1008, 756, 753, 878, 881, 1003, 1006',
    'object_detection:meta_arch_type': 6,
    'object_detection:meta_layers_names_list': f'prebuilt-models/8bits/kd-7060_onnxrt_coco_edgeai-yolox_yolox_s_pose_ti_lite_49p5_78p0_onnx/model/yolox_s_pose_ti_lite_metaarch.prototxt',

    thank you,

    Paula

  • Hi Paula!

    Thank you for providing the example notebook. I tried running it as-is (pointing at a different sample image, since the one specified didn't exist for me), but it also causes the kernel to die when calling the sess = rt.InferenceSession() function.

  • Hi Whitney, you can run evm-console-log.ipynb and see if there are any errors or messages there that could give us a clue about the issue.

    However, my guess is that you probably haven't unzipped the model. That is probably also true for the pose estimation model you were trying. I will take a note to see if we can have the models unzipped in users' workspaces by default.

    For now, you can extract all the models by running extract.sh inside notebooks/prebuilt-models/8bits/

    Or you can extract a particular model. Example below:

    user@1c26caa7764f:/home/root/notebooks/prebuilt-models/8bits$ find . -name "*od-8220_onnxrt_coco_edgeai-mmdet_yolox_s_lite_640x640_20220221_model_onnx.tar.gz" -exec tar --one-top-level -zxvf "{}" \;

    Another trick, for custom models, is that you can convert the notebooks to Python scripts and run them from the terminal to get more error information, instead of just a dead kernel. Example below:

    In the terminal you can type bash, then:

    user@1c26caa7764f:/home/root/notebooks$ jupyter nbconvert --to script custom-model-onnx-yolox.ipynb
    [NbConvertApp] Converting notebook custom-model-onnx-yolox.ipynb to script
    [NbConvertApp] Writing 7765 bytes to custom-model-onnx-yolox.py
    user@1c26caa7764f:/home/root/notebooks$ python3 custom-model-onnx-yolox.py

    Thank you,

    Paula

  • Hi Paula,

    You are right, I forgot the models need to be unzipped every time a new EVM session is started.

    I did also happen to find the issue causing my original notebook to hang and the kernel to die. I had a copy+paste error when setting the EP_list variable for the execution inference run after the calibration inference run. So I was trying to run calibration inference twice and recompile the model without realizing it, which causes some sort of memory allocation issue in Jupyter and kills the kernel. 

    So for the second inference run, the EP_list variable should have been: EP_list = ['TIDLExecutionProvider','CPUExecutionProvider']

    But I still had it set to: EP_list = ['TIDLCompilationProvider','CPUExecutionProvider']
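
    For anyone else who hits this, the two runs side by side (a sketch, with onnx_model_path, compile_options and delegate_options as defined earlier in the notebook):

    # Run 1 - compilation/calibration: generates the artifacts in 'artifacts_folder'
    EP_list = ['TIDLCompilationProvider', 'CPUExecutionProvider']
    sess = rt.InferenceSession(onnx_model_path, providers=EP_list,
                               provider_options=[compile_options, {}], sess_options=rt.SessionOptions())

    # Run 2 - inference with the generated artifacts: must use the execution provider
    EP_list = ['TIDLExecutionProvider', 'CPUExecutionProvider']
    sess = rt.InferenceSession(onnx_model_path, providers=EP_list,
                               provider_options=[delegate_options, {}], sess_options=rt.SessionOptions())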

    I have hit a new issue, though. The skeleton keypoints are in the wrong place in the image (not on the person). I know the human pose YOLOX model can only handle one subject in the frame at a time, so I'm thinking the spot in the background the keypoints are being placed on must look like a person somehow (it's the ceiling rafters in my basement, haha). I'm going to try cropping that spot out of the image and rerunning the model in the morning.

  • Hi Whitney, I am glad to hear that you are making progress =). I am checking with my team whether we can host uncompressed model artifacts in the next release to avoid this issue.

    One thing: if the accuracy is not OK, you can first try running the model at 16-bit (num_bits = 16) and see if that helps. If so, you can increase calibration_frames, maybe to 15 or 25, and a calibration_iterations of 25 is probably OK; we sometimes use 50.
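
    For example (a sketch only, adjusting the compile_options used for the compilation run; the exact values depend on your calibration set):

    compile_options['tensor_bits'] = 16                               # i.e. num_bits = 16
    compile_options['advanced_options:calibration_frames'] = 25      # e.g. 15-25 frames
    compile_options['advanced_options:calibration_iterations'] = 25  # 25 is usually OK, sometimes 50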

    thank you,

    Paula

  • Hi Paula!

    I reran calibration with the model at 16-bit and more sample images. The skeleton keypoints of the pose now match the subject in the photo, but instead of being placed over the subject, the skeleton ends up in the upper left-hand corner of the image and is much smaller (see attached screenshot). Adding more calibration images doesn't seem to have any impact on this result. What would you recommend?

  • Hi Whitney,

    You can try commenting out the lines below.

    # plot the output using matplotlib
    #plt.rcParams["figure.figsize"]=20,20
    #plt.rcParams['figure.dpi'] = 200 # 200 e.g. is really fine, but slower

    If that doesn't help, please take a look at single_img_visualise() inside the workspace's notebooks/scripts/utils.py.

    thank you,

    Paula

  • Hi Whitney, I saw your post on Hackster.io (Practicing Yoga with AI: Human Pose Estimation on the TDA4VM - Hackster.io). Congratulations, it is pretty nice! And OK, your issue with post-processing was due to the image size =). I will make a note of it in case this issue arises again for someone else.
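
    For future readers: one common way an image-size mismatch shows up is keypoints left in model-input coordinates and drawn on the full-size image. A rough sketch of the rescaling that kind of mismatch needs (assuming a 640x640 model input and keypoints returned as (x, y) pairs; the function name and shapes are placeholders, not from the actual notebook):

    import numpy as np

    def rescale_keypoints(keypoints, orig_w, orig_h, model_size=640):
        """Map (x, y) keypoints from model-input coordinates back to the original image."""
        kp = np.array(keypoints, dtype=np.float32).reshape(-1, 2)
        kp[:, 0] *= orig_w / model_size  # scale x back to the original width
        kp[:, 1] *= orig_h / model_size  # scale y back to the original height
        return kp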

    I will mark this thread as "TI Thinks Resolved".

    thank you,

    Paula