SK-AM62A-LP: Problem running inference script after model artifact generation

Pragya Kapoor

Part Number: SK-AM62A-LP

Tool/software:

Hello, I generated model artifacts for an ONNX model(patchcore implementation) which uses resnet 50.

Now, whenever I run the following infernece script on the board, trying to use the TIDL accelaration:

I get this error:
2024-09-05 03:56:32.346482125 [W:onnxruntime:, execution_frame.cc:835 VerifyOutputSizes] Expected shape from model of {1,1024,14,14} does not match actual shape of {1,1,1,1024,14,14} for output input.332

I don't know why this happening because I know these extra dimensions are added when model artifacts are generated. I have a tflite model running with similar dimensions for a different application, I don't know what the issue is here.

import os
import cv2
import numpy as np
import torch
from torchvision import transforms
import onnxruntime as ort
import faiss
from PIL import Image
from scipy.ndimage import gaussian_filter
import gi
import time

gi.require_version('Gst', '1.0')
from gi.repository import Gst

# Import necessary components from train.py
from train import embedding_concat, reshape_embedding, min_max_norm, cvt2heatmap, heatmap_on_image, get_args

# Define transforms (ensure these match those used in train.py)
data_transforms = transforms.Compose([
    transforms.Resize((224, 224), Image.LANCZOS),  # Reduced resolution
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
inv_normalize = transforms.Normalize(
    mean=[-0.485 / 0.229, -0.456 / 0.224, -0.406 / 0.225],
    std=[1 / 0.229, 1 / 0.224, 1 / 0.225]
)

def gstreamer_pipeline():
    return (
        'v4l2src device=/dev/video3 io-mode=dmabuf-import ! '
        'video/x-bayer, width=640, height=480, framerate=15/1, format=rggb10 ! '
        'tiovxisp sink_0::device=/dev/v4l-subdev2 sensor-name="SENSOR_SONY_IMX219_RPI" '
        'dcc-isp-file=/opt/imaging/imx219/linear/dcc_viss_10b_640x480.bin sink_0::dcc-2a-file=/opt/imaging/imx219/linear/dcc_2a_10b_640x480.bin '
        '! video/x-raw, format=NV12, width=640, height=480, framerate=15/1 ! '
        'videoconvert ! video/x-raw, format=BGR ! appsink'
    )

def heatmap_on_image(heatmap, image, alpha=0.5, colormap=cv2.COLORMAP_JET):
    if heatmap.shape != image.shape:
        heatmap = cv2.resize(heatmap, (image.shape[1], image.shape[0]))
    heatmap = cv2.applyColorMap(np.uint8(heatmap), colormap)
    overlay = cv2.addWeighted(heatmap, alpha, image, 1 - alpha, 0)
    return overlay

def main():
    # Initialize GStreamer
    Gst.init(None)

    # Timing model load
    start_time = time.time()
    
    # Path to the ONNX model file
    onnx_model_path = '/opt/edgeai-tidl-artifacts/cl-ort-patchcore/patchcore_model.onnx'

    options = {
        'artifacts_folder': '/opt/edgeai-tidl-artifacts/cl-ort-patchcore'
    }

    so = ort.SessionOptions()
    
    # Specify execution providers with TIDL configuration
    ep_list = ['TIDLExecutionProvider', 'CPUExecutionProvider']
    
    # Load the ONNX model with TIDL acceleration
    ort_session = ort.InferenceSession(onnx_model_path, providers=ep_list, provider_options=[options, {}], sess_options=so)

    model_load_time = time.time() - start_time
    print(f"Model loading time: {model_load_time:.4f} seconds")

    # Get input and output details
    input_name = ort_session.get_inputs()[0].name
    output_names = [output.name for output in ort_session.get_outputs()]

    # Get arguments
    args = get_args()

    # Update the dataset path to your actual path on the board
    args.dataset_path = '/opt/edgeai-gst-apps/PatchCore_anomaly_detection'
    args.category = 'bottle'  # Ensure this is set to the correct category

    # Load the FAISS index
    start_time = time.time()
    index_path = os.path.join(args.dataset_path, 'embeddings', args.category, 'index.faiss')
    index = faiss.read_index(index_path)
    if torch.cuda.is_available():
        res = faiss.StandardGpuResources()
        index = faiss.index_cpu_to_gpu(res, 0, index)
    faiss_load_time = time.time() - start_time
    print(f"FAISS index loading time: {faiss_load_time:.4f} seconds")

    # Function to run inference on ONNX model
    def run_onnx_inference(ort_session, input_data):
        start_time = time.time()
        outputs = ort_session.run(output_names, {input_name: input_data})
        inference_time = time.time() - start_time
        print(f"Model inference time: {inference_time:.4f} seconds")
        return outputs

    # Initialize video capture with GStreamer pipeline
    cap = cv2.VideoCapture(gstreamer_pipeline(), cv2.CAP_GSTREAMER)

    if not cap.isOpened():
        print("Error: Unable to open video source.")
        return

    frame_count = 0
    total_processing_time = 0

    while cap.isOpened():
        start_time = time.time()
        ret, frame = cap.read()
        if not ret:
            break

        frame_read_time = time.time() - start_time
        print(f"Frame read time: {frame_read_time:.4f} seconds")

        # Preprocess frame
        start_time = time.time()
        pil_img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        input_tensor = data_transforms(pil_img).unsqueeze(0).numpy().astype(np.float32)
        preprocessing_time = time.time() - start_time
        print(f"Frame preprocessing time: {preprocessing_time:.4f} seconds")

        # Run ONNX inference
        start_time = time.time()
        features = run_onnx_inference(ort_session, input_tensor)

        inference_time = time.time() - start_time

        # Convert features to tensors
        start_time = time.time()
        features = [torch.tensor(f) for f in features]

        # Extract embeddings and perform the same steps as in the test_step
        embeddings = []
        for feature in features:
            m = torch.nn.AvgPool2d(3, 1, 1)
            embeddings.append(m(feature))
        embedding_ = embedding_concat(embeddings[0], embeddings[1])
        embedding_test = np.array(reshape_embedding(np.array(embedding_)))
        feature_extraction_time = time.time() - start_time
        print(f"Feature extraction time: {feature_extraction_time:.4f} seconds")

        # Search the FAISS index
        start_time = time.time()
        score_patches, _ = index.search(embedding_test, k=args.n_neighbors)
        faiss_search_time = time.time() - start_time
        print(f"FAISS search time: {faiss_search_time:.4f} seconds")

        # Postprocess anomaly map
        start_time = time.time()
        anomaly_map = score_patches[:, 0].reshape((28, 28))
        N_b = score_patches[np.argmax(score_patches[:, 0])]
        w = (1 - (np.max(np.exp(N_b)) / np.sum(np.exp(N_b))))
        score = w * max(score_patches[:, 0])  # Image-level score

        anomaly_map_resized = cv2.resize(anomaly_map, (224, 224))
        anomaly_map_resized_blur = gaussian_filter(anomaly_map_resized, sigma=2)  # Reduced sigma for faster processing

        anomaly_map_norm = min_max_norm(anomaly_map_resized_blur)
        anomaly_map_norm_hm = cvt2heatmap(anomaly_map_norm * 255)
        anomaly_map_norm_hm_resized = cv2.resize(anomaly_map_norm_hm, (frame.shape[1], frame.shape[0]))
        heatmap_overlay_time = time.time() - start_time
        print(f"Heatmap overlay time: {heatmap_overlay_time:.4f} seconds")

        hm_on_img = heatmap_on_image(anomaly_map_norm_hm_resized, frame, alpha=0.3)  # More transparent overlay
	        # Display result
        start_time = time.time()
        cv2.imshow('Anomaly Detection', hm_on_img)
        display_time = time.time() - start_time
        print(f"Display time: {display_time:.4f} seconds")

        frame_processing_time = time.time() - start_time
        total_processing_time += frame_processing_time
        frame_count += 1
        print(f"Frame processing time: {frame_processing_time:.4f} seconds")

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()

    average_processing_time = total_processing_time / frame_count if frame_count else 0
    print(f"Average frame processing time: {average_processing_time:.4f} seconds")

if __name__ == '__main__':
    main()

over 1 year ago

0 Reese Grimsley over 1 year ago

TI__Genius 16716 points

Hello,

I understand the warning here (which in this case, can be safely ignored but in some cases will produce error). Please set TIDL_RT_ONNX_VARDIM=1 in the calling Linux environment.

export TIDL_RT_ONNX_VARDIM=1

TIDL internally uses 6-dimensions on all tensors. This often implies that singleton axes are added to the underlying data object. ONNX complains about this. TFLite does not, for whatever reason. The environment variable I noted above will force the runtime to remove some of these singleton axes when passing back to ONNX.

This has used to be in the documentation for the osrt_python pages but has been removed. Let me see about fixing that to avoid confusion in the future

https://github.com/TexasInstruments/edgeai-tidl-tools/tree/master/examples/osrt_python
It's noted in osrt_cpp README but clearly isn't easy enough to find

BR,

Reese

0 Pragya Kapoor over 1 year ago in reply to Reese Grimsley

Prodigy 40 points

Hello,

Thanks a lot for your reply :)

I also wanted to ask that the model artifact generation script that is available on github, does it only generate artifacts for one output(final node)?

Because my script contains two outputs, one of them is from an intermediate layer.
So, when I run this line of code :

output_names = [output.name for output in ort_session.get_outputs()]

the output_names I get are ['input.128', 'input.332']

but when I run this line of code :

outputs = ort_session.run(output_names, {input_name: input_data})

The outputs I get are [None, array(valid)].

How can I generetae it for both these outputs? Thank you

0 Reese Grimsley over 1 year ago in reply to Pragya Kapoor

TI__Genius 16716 points

Hello,

Glad to help!

Pragya Kapoor said:
Because my script contains two outputs, one of them is from an intermediate layer.

I have seen issues with intermediate nodes being used as outputs. Is there any warning or indication of error during model import / compilation? you can get more verbose output logs during compilation by passing 'debug_level=2' within the set of delegate_options passed to the onnruntime.InferenceSession during initialization. If you are using edgeai-tidl-tools, you can also change in examples/osrt_python/common_utils.py

One workaround for this is to add a basic node (e.g. Identity) that effectively buffers the output you want from the intermediate node.

Hope this helps!

BR,

Reese

0 Pragya Kapoor over 1 year ago in reply to Reese Grimsley

Prodigy 40 points

Hello, thanks for the reply.

No, I don't get any error during model import/compilation.

I was trying to figure out where this problem is arising from and could it be from the onnx_model_opt.py file?

Thank you

0 Reese Grimsley over 1 year ago in reply to Pragya Kapoor

TI__Genius 16716 points

Hello,

No, this issue is more likely internal to the what the model compilation/import tool is doing. If you are okay with it, please share your artifacts directory.

You can check how many outputs TIDL is using from the onnxrtMetaData.txt. You can also visualize how the model was parsed by looking at the SVG files with the artifacts/tempDir subdirectory. If you open in a browser, hovering your mouse over nodes will show additional information. I would suggest doing this for the intermediate node you also want treated as an output.

Edit: this topic has been forked into the following thread. I am closing this topic and further responses will be in the new thread

-e2e.ti.com/.../sk-am62a-lp-problem-with-onnx-model-artifact-generation

-Reese

0 Pragya Kapoor over 1 year ago in reply to Reese Grimsley

Prodigy 40 points

Hello, thank you for your reply.

I have already checked the onnxrtMetaData.txt. and it is only taking one output for some reason.

Here is the link to the model artifacts folder : https://drive.google.com/drive/folders/18FYxpObW7Rp6MInwUvXbfr0VWB9e-RBj?usp=sharing

Thank you :)

0 Reese Grimsley over 1 year ago in reply to Pragya Kapoor

TI__Genius 16716 points

Response posted in following thread: https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1412653/sk-am62a-lp-problem-with-onnx-model-artifact-generation

Processors

Processors forum

SK-AM62A-LP: Problem running inference script after model artifact generation