SK-AM62A-LP: Problem with onnx model artifact generation

Pragya Kapoor

Prodigy 40 points

Part Number: SK-AM62A-LP

Hello,

Does the model artifact generation script available on GitHub (onnxrt_ep.py) only generate artifacts for one output (the final node)?

My model has two outputs, one of which comes from an intermediate layer.
When I run this line of code:

output_names = [output.name for output in ort_session.get_outputs()]

the output_names I get are ['output', 'input.332']

but when I run this line of code :

outputs = ort_session.run(output_names, {input_name: input_data})

The outputs I get are [None, array(valid)].

How can I generate artifacts for both of these outputs?

I have also attached an image to show the intermediate output.

Thank you
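
One way to make this symptom explicit is to pair each requested output name with what `run` returned and list the names that came back as `None`. The following is only a minimal sketch: `find_missing_outputs` is a hypothetical helper (it is not part of onnxrt_ep.py or onnxruntime), and the values below are stand-ins for the real tensors.

```python
def find_missing_outputs(output_names, outputs):
    """Return the names whose corresponding entry in `outputs` is None."""
    if len(output_names) != len(outputs):
        raise ValueError("expected exactly one output per requested name")
    return [name for name, value in zip(output_names, outputs) if value is None]


# Stand-in for the result described in the post: the first output is missing.
names = ["output", "input.332"]
results = [None, [0.1, 0.2, 0.3]]  # a real session would return numpy arrays
missing = find_missing_outputs(names, results)
print("outputs without data:", missing)
```

Failing early on a non-empty list avoids passing `None` into later postprocessing.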

0 Reese Grimsley 2 months ago

TI__Genius 9316 points

Hi Pragya,

This is the same topic as we are discussing in the following thread, correct?

- https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1411919/sk-am62a-lp-problem-running-inference-script-after-model-artifact-generation

Let's shift the discussion of that to this thread since you have created a title directly related to the issue at hand. For convenience, I'm pasting my response below so we can continue the support topic here. I'll mark the other thread as solved.

"""

Hello,

No, this issue is more likely internal to what the model compilation/import tool is doing. If you are okay with it, please share your artifacts directory.

You can check how many outputs TIDL is using from the onnxrtMetaData.txt. You can also visualize how the model was parsed by looking at the SVG files within the artifacts/tempDir subdirectory. If you open one in a browser, hovering your mouse over a node shows additional information. I would suggest doing this for the intermediate node you also want treated as an output.

"""

BR,

Reese

0 Reese Grimsley 2 months ago in reply to Reese Grimsley

TI__Genius 9316 points

Porting a follow-up note from Pragya on previous thread into this issue for better consistency:

"""

Hello, thank you for your reply.

I have already checked the onnxrtMetaData.txt, and it is only taking one output for some reason.

Here is the link to the model artifacts folder : https://drive.google.com/drive/folders/18FYxpObW7Rp6MInwUvXbfr0VWB9e-RBj?usp=sharing

Thank you :)

"""

0 Reese Grimsley 2 months ago in reply to Reese Grimsley

TI__Genius 9316 points

Thanks for supplying the model artifacts. That is helpful, and the issue is clearer to me now.

Important: could you please include the SDK version you are using here? It should be clear from the git version tag in the edgeai-tidl-tools repo (which I assume you are using to compile). I estimate it is either 9.0 or 9.1.

I would consider this behavior a bug. In the runtimes_visualization.svg, I can see that it has parsed the output from an intermediate node in the graph. However, it is not actually a part of the subgraph that will run on the accelerator, so there's some disconnect.

Here's what I would suggest at this stage:

- Upgrade to the latest SDK version for your edgeai-tidl-tools.
- As a workaround, make a slight modification to the model so that a node acts as a buffer between the intermediate output and an actual output.
  - onnx-modifier is a decent tool for doing this graphically. Otherwise, ONNX graphs are fairly easy to modify in Python too.
    - https://github.com/ZhangGe6/onnx-modifier
  - The buffer node should be something that is effectively a no-op on the data. For example: adding 0, multiplying by 1, an identity node, data shuffling with parameters that make no impact, or perhaps a data-type conversion.
    - See the supported ops page: github.com/.../supported_ops_rts_versions.md
- I will log this as a bug for the development teams. I need confirmation of the SDK version first.
- I can also look into adding a model optimization rule to tidl_onnx_model_optimizer that will automatically implement the workaround above.
  - https://github.com/TexasInstruments/edgeai-tidl-tools/tree/master/scripts/osrt_model_tools/onnx_tools/tidl_onnx_model_optimizer

BR,
Reese

0 Pragya Kapoor 2 months ago in reply to Reese Grimsley

Prodigy 40 points

Hello, thank you for your reply.

The SDK version I am using is indeed 9.1.
Does this problem not occur with 9.2? I would be grateful if the rule could be added to tidl_onnx_model_optimizer.

In the meantime, I will try to implement the workaround myself.
One question: when I add this buffer node between the two output nodes, should I then expect only a single final output instead of two? I ask because I hit another issue when I try to add an Identity node with input input.184 and output input.332:

onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] (op_type:Identity, node name: custom_added_Identity0): [ShapeInferenceError] Inferred shape and existing shape differ in dimension 1: (512) vs (1024)

This is because input.184 has shape (1,512,28,28) and input.332 has shape (1,1024,14,14).

Thanks

0 Qutaiba Saleh 2 months ago in reply to Pragya Kapoor

TI__Intellectual 2650 points

Hi Pragya,

Reese is out of office this week. Please expect a delay in the response.

Best regards,

Qutaiba

0 Reese Grimsley 1 month ago in reply to Pragya Kapoor

TI__Genius 9316 points

Hi Pragya,

Thanks for the patience.

I would recommend upgrading to the latest SDK 10.0 since there are many bugfixes and improvements to robustness. I suggest trying your compilation again with the 10.0.0.6 tools.

Your final model should have the number of outputs you originally wanted (2). The workaround is to buffer the actual outputs so that they are not *also* the input to another node (inputs to the overall graph should remain untouched). I think we had a miscommunication -- let me describe it for your scenario.

You should be adding some buffer-style node with:

input="input.184" and
output="input.184.BUFFERED" (the name itself doesn't matter).
Ensure that the graph outputs are now "input.332" and "input.184.BUFFERED" (same name as above)

You can leave input.332 untouched. It is not being affected, so far as I can tell.

We're trying to work around the fact that this accelerator is very intentional with memory and storing tensors. Any input/output must be mapped to DDR somewhere, but internal tensors for the graph are not restricted this way. If a tensor can exist in internal memory like cache, it will, for performance benefit. By adding a buffer node so that the output is not also an intermediate tensor, the output can be explicitly mapped as an output.

BR,

Reese
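
The buffer node described above can be sketched with the onnx Python package. This is a minimal sketch under assumptions, not a confirmed procedure: `plan_graph_outputs` and `add_identity_buffer` are hypothetical helper names, Identity is only one of the no-op candidates listed earlier in the thread (whether TIDL offloads it should be checked against the supported ops page), and the buffered output is assumed to be float32 if it has to be declared from scratch.

```python
def plan_graph_outputs(current_outputs, tensor_name, buffered_name):
    """Return the graph output names after exposing tensor_name via its buffered copy."""
    if tensor_name in current_outputs:
        return [buffered_name if n == tensor_name else n for n in current_outputs]
    return list(current_outputs) + [buffered_name]


def add_identity_buffer(model_path, out_path, tensor_name, buffered_name):
    """Append an Identity node after tensor_name and make its output a graph output."""
    import onnx  # imported lazily so the planning helper above runs without onnx
    from onnx import TensorProto, helper

    model = onnx.load(model_path)
    graph = model.graph
    # The new node consumes the intermediate tensor and produces the buffered copy.
    graph.node.append(
        helper.make_node(
            "Identity",
            inputs=[tensor_name],
            outputs=[buffered_name],
            name=buffered_name + "_node",
        )
    )
    for out in graph.output:
        if out.name == tensor_name:
            out.name = buffered_name  # reuse the existing type and shape info
            break
    else:
        # Assumption: the tensor is float32; the shape is left unspecified.
        graph.output.append(
            helper.make_tensor_value_info(buffered_name, TensorProto.FLOAT, None)
        )
    onnx.checker.check_model(model)
    onnx.save(model, out_path)
    return [o.name for o in graph.output]


planned = plan_graph_outputs(
    ["input.332", "input.184"], "input.184", "input.184.BUFFERED"
)
print(planned)
```

With the names from this thread, the call would look like `add_identity_buffer("patchcore_model.onnx", "patchcore_model_buffered.onnx", "input.184", "input.184.BUFFERED")`, where the output file name is made up for illustration. The input.332 output is left untouched.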

0 Pragya Kapoor 1 month ago in reply to Reese Grimsley

Prodigy 40 points

Hello, thanks for your reply. I did try to compile artifacts with SDK 9.2, and it gave me two outputs and correct model artifacts, but I was unable to use them on my board (I got an error) because the SDK there is 9.1.

Wouldn't I run into the same problem with SDK 10.0.0.6?

0 Reese Grimsley 1 month ago in reply to Pragya Kapoor

TI__Genius 9316 points

Hi Pragya,

Pragya Kapoor said:
I did try to compile artifacts with SDK 9.2 and it gave me two outputs and correct model artifacts

Does this run accurately enough in host emulation? You can run the same command as compilation, but without the -c tag. You may want to provide your own images and check the visualized outputs (or even insert some of your own postprocessing/accuracy checks, depending on your model). I want to be sure that the outputs in this 9.2 version are also correct.

- If this is true, then 9.2 and 10.0 do not require the workaround we discussed.

It is true that the artifacts are locked to the SDK version. The artifacts you generated with 9.1 tidl_tools will only work for 9.1 SDK. Is it acceptable to reflash your SD card / EVM with the 9.2 SDK? I would more strongly suggest upgrading all the way to 10.0 if possible.

BR,
Reese

0 Pragya Kapoor 1 month ago in reply to Reese Grimsley

Prodigy 40 points

Hi Reese, I just tried this workaround you suggested :

"You should be adding some buffer-style node with:

input="input.184" and
output="input.184.BUFFERED" (name here doesn't matter here).
Ensure that the graph outputs are now "input.332" and "input.184.BUFFERED" (same name as above)"

and while this does solve the multiple output issue and generates artifacts for both outputs, it still does not work with my inference script. This is what I get:
(venv) root@am62axx-evm:/opt/edgeai-gst-apps/PatchCore_anomaly_detection# python live_detection_onnx_runtime
/opt/edgeai-gst-apps/PatchCore_anomaly_detection/live_detection_onnx_runtime:9: UserWarning: A NumPy version >=1.23.5 and <2.3.0 is required for this version of SciPy (detected version 1.23.0)
from scipy.ndimage import gaussian_filter
libtidl_onnxrt_EP loaded 0x47fa2130
Final number of subgraphs created are : 1, - Offloaded Nodes - 98, Total Nodes - 98
APP: Init ... !!!
MEM: Init ... !!!
MEM: Initialized DMA HEAP (fd=5) !!!
MEM: Init ... Done !!!
IPC: Init ... !!!
IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
8476.313055 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
8476.313197 s: VX_ZONE_INIT:Enabled
8476.313215 s: VX_ZONE_ERROR:Enabled
8476.313225 s: VX_ZONE_WARNING:Enabled
8476.314721 s: VX_ZONE_INIT:[tivxInitLocal:130] Initialization Done !!!
8476.314922 s: VX_ZONE_INIT:[tivxHostInitLocal:101] Initialization Done for HOST !!!
8476.357660 s: VX_ZONE_ERROR:[ownContextSendCmd:868] Command ack message returned failure cmd_status: -1
8476.357733 s: VX_ZONE_ERROR:[ownNodeKernelInit:584] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
8476.357745 s: VX_ZONE_ERROR:[ownNodeKernelInit:585] Please be sure the target callbacks have been registered for this core
8476.357757 s: VX_ZONE_ERROR:[ownNodeKernelInit:586] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
8476.357771 s: VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:2 ... failed !!!
8476.357790 s: VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
8476.357801 s: VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
TIDL_RT_OVX: ERROR: Verifying TIDL graph ... Failed !!!
TIDL_RT_OVX: ERROR: Verify OpenVX graph failed
Model loading time: 1.2428 seconds
FAISS index loading time: 0.0117 seconds
[ WARN:0@30.237] global /usr/src/debug/opencv/4.5.5-r0/git/modules/videoio/src/cap_gstreamer.cpp (1405) open OpenCV | GStreamer warning: Cannot query video position: status=0, value=-1, duration=-1
Frame read time: 0.0186 seconds
Frame preprocessing time: 0.0737 seconds
['input.332', 'add_output']
8476.695579 s: VX_ZONE_ERROR:[ownContextSendCmd:868] Command ack message returned failure cmd_status: -1
8476.695641 s: VX_ZONE_ERROR:[ownNodeKernelInit:584] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
8476.695654 s: VX_ZONE_ERROR:[ownNodeKernelInit:585] Please be sure the target callbacks have been registered for this core
[ 8470.724416] audit: type=1701 audit(1725550658.412:35): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=2326 comm="pt_main_thread" exe="/usr/bin/python3.10" sig=4 res=1
8476.695666 s: VX_ZONE_ERROR:[ownNodeKernelInit:586] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
8476.695680 s: VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:2 ... failed !!!
8476.695704 s: VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
[ 8470.773017] audit: type=1334 audit(1725550658.460:36): prog-id=21 op=LOAD
8476.695715 s: VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
[ 8470.784088] audit: type=1334 audit(1725550658.472:37): prog-id=22 op=LOAD
8476.695836 s: VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:812] graph is not in a state required to be scheduled
8476.695849 s: VX_ZONE_ERROR:[vxProcessGraph:747] schedule graph failed
8476.695859 s: VX_ZONE_ERROR:[vxProcessGraph:752] wait graph failed
ERROR: Running TIDL graph ... Failed !!!
Model inference time: 0.0091 seconds
Shape of feature before pooling: torch.Size([1, 1024, 14, 14])
Shape of feature before pooling: torch.Size([1, 512, 28, 28])
[ 8596.690516] audit: type=1334 audit(1725550784.380:38): prog-id=22 op=UNLOAD
[ 8596.697572] audit: type=1334 audit(1725550784.380:39): prog-id=21 op=UNLOAD
Illegal instruction (core dumped)

I don't know why it says illegal instruction.
Here are the new generated artifacts : drive.google.com/.../18FYxpObW7Rp6MInwUvXbfr0VWB9e-RBj

and here is the inference script again for your reference :

import os
import cv2
import numpy as np
import torch
from torchvision import transforms
import onnxruntime as ort
import faiss
from PIL import Image
from scipy.ndimage import gaussian_filter
import gi
import time

gi.require_version('Gst', '1.0')
from gi.repository import Gst

# Import necessary components from train.py
from train import embedding_concat, reshape_embedding, min_max_norm, cvt2heatmap, heatmap_on_image, get_args

# Define transforms (ensure these match those used in train.py)
data_transforms = transforms.Compose([
    transforms.Resize((224, 224), Image.LANCZOS),  # Reduced resolution
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
inv_normalize = transforms.Normalize(
    mean=[-0.485 / 0.229, -0.456 / 0.224, -0.406 / 0.225],
    std=[1 / 0.229, 1 / 0.224, 1 / 0.225]
)

def gstreamer_pipeline():
    return (
        'v4l2src device=/dev/video3 io-mode=dmabuf-import ! '
        'video/x-bayer, width=640, height=480, framerate=15/1, format=rggb10 ! '
        'tiovxisp sink_0::device=/dev/v4l-subdev2 sensor-name="SENSOR_SONY_IMX219_RPI" '
        'dcc-isp-file=/opt/imaging/imx219/linear/dcc_viss_10b_640x480.bin sink_0::dcc-2a-file=/opt/imaging/imx219/linear/dcc_2a_10b_640x480.bin '
        '! video/x-raw, format=NV12, width=640, height=480, framerate=15/1 ! '
        'videoconvert ! video/x-raw, format=BGR ! appsink'
    )

def heatmap_on_image(heatmap, image, alpha=0.5, colormap=cv2.COLORMAP_JET):
    if heatmap.shape != image.shape:
        heatmap = cv2.resize(heatmap, (image.shape[1], image.shape[0]))
    heatmap = cv2.applyColorMap(np.uint8(heatmap), colormap)
    overlay = cv2.addWeighted(heatmap, alpha, image, 1 - alpha, 0)
    return overlay

def main():
    # Initialize GStreamer
    Gst.init(None)

    # Timing model load
    start_time = time.time()
    
    # Path to the ONNX model file
    onnx_model_path = '/opt/edgeai-tidl-artifacts/cl-ort-patchcore/patchcore_model.onnx'

    options = {
        'artifacts_folder': '/opt/edgeai-tidl-artifacts/cl-ort-patchcore'
    }

    so = ort.SessionOptions()
    
    # Specify execution providers with TIDL configuration
    ep_list = ['TIDLExecutionProvider', 'CPUExecutionProvider']
    
    # Load the ONNX model with TIDL acceleration
    ort_session = ort.InferenceSession(onnx_model_path, providers=ep_list, provider_options=[options, {}], sess_options=so)

    model_load_time = time.time() - start_time
    print(f"Model loading time: {model_load_time:.4f} seconds")

    # Get input and output details
    input_name = ort_session.get_inputs()[0].name
    output_names = [output.name for output in ort_session.get_outputs()]

    # Get arguments
    args = get_args()

    # Update the dataset path to your actual path on the board
    args.dataset_path = '/opt/edgeai-gst-apps/PatchCore_anomaly_detection'
    args.category = 'bottle'  # Ensure this is set to the correct category

    # Load the FAISS index
    start_time = time.time()
    index_path = os.path.join(args.dataset_path, 'embeddings', args.category, 'index.faiss')
    index = faiss.read_index(index_path)
    if torch.cuda.is_available():
        res = faiss.StandardGpuResources()
        index = faiss.index_cpu_to_gpu(res, 0, index)
    faiss_load_time = time.time() - start_time
    print(f"FAISS index loading time: {faiss_load_time:.4f} seconds")

    # Function to run inference on ONNX model
    def run_onnx_inference(ort_session, input_data):
        start_time = time.time()
        outputs = ort_session.run(output_names, {input_name: input_data})
        inference_time = time.time() - start_time
        print(f"Model inference time: {inference_time:.4f} seconds")
        return outputs

    # Initialize video capture with GStreamer pipeline
    cap = cv2.VideoCapture(gstreamer_pipeline(), cv2.CAP_GSTREAMER)

    if not cap.isOpened():
        print("Error: Unable to open video source.")
        return

    frame_count = 0
    total_processing_time = 0

    while cap.isOpened():
        start_time = time.time()
        ret, frame = cap.read()
        if not ret:
            break

        frame_read_time = time.time() - start_time
        print(f"Frame read time: {frame_read_time:.4f} seconds")

        # Preprocess frame
        start_time = time.time()
        pil_img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        input_tensor = data_transforms(pil_img).unsqueeze(0).numpy().astype(np.float32)
        preprocessing_time = time.time() - start_time
        print(f"Frame preprocessing time: {preprocessing_time:.4f} seconds")

        # Run ONNX inference
        start_time = time.time()
        features = run_onnx_inference(ort_session, input_tensor)

        inference_time = time.time() - start_time

        # Convert features to tensors
        start_time = time.time()
        features = [torch.tensor(f) for f in features]

        # Extract embeddings and perform the same steps as in the test_step
        embeddings = []
        for feature in features:
            m = torch.nn.AvgPool2d(3, 1, 1)
            embeddings.append(m(feature))
        embedding_ = embedding_concat(embeddings[0], embeddings[1])
        embedding_test = np.array(reshape_embedding(np.array(embedding_)))
        feature_extraction_time = time.time() - start_time
        print(f"Feature extraction time: {feature_extraction_time:.4f} seconds")

        # Search the FAISS index
        start_time = time.time()
        score_patches, _ = index.search(embedding_test, k=args.n_neighbors)
        faiss_search_time = time.time() - start_time
        print(f"FAISS search time: {faiss_search_time:.4f} seconds")

        # Postprocess anomaly map
        start_time = time.time()
        anomaly_map = score_patches[:, 0].reshape((28, 28))
        N_b = score_patches[np.argmax(score_patches[:, 0])]
        w = (1 - (np.max(np.exp(N_b)) / np.sum(np.exp(N_b))))
        score = w * max(score_patches[:, 0])  # Image-level score

        anomaly_map_resized = cv2.resize(anomaly_map, (224, 224))
        anomaly_map_resized_blur = gaussian_filter(anomaly_map_resized, sigma=2)  # Reduced sigma for faster processing

        anomaly_map_norm = min_max_norm(anomaly_map_resized_blur)
        anomaly_map_norm_hm = cvt2heatmap(anomaly_map_norm * 255)
        anomaly_map_norm_hm_resized = cv2.resize(anomaly_map_norm_hm, (frame.shape[1], frame.shape[0]))
        heatmap_overlay_time = time.time() - start_time
        print(f"Heatmap overlay time: {heatmap_overlay_time:.4f} seconds")

        hm_on_img = heatmap_on_image(anomaly_map_norm_hm_resized, frame, alpha=0.3)  # More transparent overlay

        # Display result
        start_time = time.time()
        cv2.imshow('Anomaly Detection', hm_on_img)
        display_time = time.time() - start_time
        print(f"Display time: {display_time:.4f} seconds")

        frame_processing_time = time.time() - start_time
        total_processing_time += frame_processing_time
        frame_count += 1
        print(f"Frame processing time: {frame_processing_time:.4f} seconds")

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()

    average_processing_time = total_processing_time / frame_count if frame_count else 0
    print(f"Average frame processing time: {average_processing_time:.4f} seconds")

if __name__ == '__main__':
    main()
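
The image-level weight in the script, `w = 1 - max(exp(N_b)) / sum(exp(N_b))`, can overflow inside `exp` when the distances in `N_b` are large. Below is a minimal sketch of a mathematically equivalent form that subtracts the maximum first. `image_level_weight` is a hypothetical helper name, and plain Python floats are used instead of NumPy only to keep the example self-contained.

```python
import math


def image_level_weight(distances):
    """Compute 1 - max(softmax(distances)) without overflowing exp()."""
    shift = max(distances)
    # After shifting, the largest term is exp(0) == 1, so max/sum equals 1/total.
    total = sum(math.exp(d - shift) for d in distances)
    return 1.0 - 1.0 / total


# exp(1000) would overflow a float, but the shifted form stays finite.
weight = image_level_weight([1000.0, 1000.0])
print(weight)  # 0.5
```

This does not guard against NaN or infinite inputs, so it will not hide the case where the model output itself is invalid.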

0 Reese Grimsley 1 month ago in reply to Pragya Kapoor

TI__Genius 9316 points

Hi Pragya,

Pragya Kapoor said:
8476.357660 s: VX_ZONE_ERROR:[ownContextSendCmd:868] Command ack message returned failure cmd_status: -1
8476.357733 s: VX_ZONE_ERROR:[ownNodeKernelInit:584] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
8476.357745 s: VX_ZONE_ERROR:[ownNodeKernelInit:585] Please be sure the target callbacks have been registered for this core
8476.357757 s: VX_ZONE_ERROR:[ownNodeKernelInit:586] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
8476.357771 s: VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:2 ... failed !!!
8476.357790 s: VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
8476.357801 s: VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed

Okay, so the model failed to initialize. I would first suggest resetting the EVM. If there was a previous failure, then some of the remote cores may be in an unstable state.

If it persists, please run `/opt/vx_app_arm_remote_log.out &` in the background, and retry your application. I'd like to see the log in that case.

But if it's fine to upgrade SDKs, then I'd suggest we table the model workaround and go off of a software version that has resolved this issue already.

BR,

Reese

0 Pragya Kapoor 1 month ago in reply to Reese Grimsley

Prodigy 40 points

These are the logs I got:
(venv) root@am62axx-evm:/opt/edgeai-gst-apps/PatchCore_anomaly_detection# python live_detection_onnx_runtime
/opt/edgeai-gst-apps/PatchCore_anomaly_detection/live_detection_onnx_runtime:9: UserWarning: A NumPy version >=1.23.5 and <2.3.0 is required for this version of SciPy (detected version 1.23.0)
from scipy.ndimage import gaussian_filter
libtidl_onnxrt_EP loaded 0x149a8970
Final number of subgraphs created are : 1, - Offloaded Nodes - 98, Total Nodes - 98
APP: Init ... !!!
MEM: Init ... !!!
MEM: Initialized DMA HEAP (fd=5) !!!
MEM: Init ... Done !!!
IPC: Init ... !!!
IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
966.866539 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
966.866718 s: VX_ZONE_INIT:Enabled
966.866731 s: VX_ZONE_ERROR:Enabled
966.866741 s: VX_ZONE_WARNING:Enabled
966.868125 s: VX_ZONE_INIT:[tivxInitLocal:130] Initialization Done !!!
966.868318 s: VX_ZONE_INIT:[tivxHostInitLocal:101] Initialization Done for HOST !!!
966.912100 s: VX_ZONE_ERROR:[ownContextSendCmd:868] Command ack message returned failure cmd_status: -1
966.912174 s: VX_ZONE_ERROR:[ownNodeKernelInit:584] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
966.912187 s: VX_ZONE_ERROR:[ownNodeKernelInit:585] Please be sure the target callbacks have been registered for this core
966.912199 s: VX_ZONE_ERROR:[ownNodeKernelInit:586] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
966.912215 s: VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:2 ... failed !!!
966.912233 s: VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
966.912244 s: VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
TIDL_RT_OVX: ERROR: Verifying TIDL graph ... Failed !!!
TIDL_RT_OVX: ERROR: Verify OpenVX graph failed
Model loading time: 1.2135 seconds
[C7x_1 ] 966.911834 s: VX_ZONE_ERROR:[tivxAlgiVisionAllocMem:194] Failed to Allocate memory record 5 @ space = 17 and size = 26315776 !!!
[C7x_1 ] 966.911870 s: VX_ZONE_ERROR:[tivxAlgiVisionCreate:358] tivxAlgiVisionAllocMem Failed
[C7x_1 ] 966.911901 s: VX_ZONE_ERROR:[tivxKernelTIDLCreate:912] tivxAlgiVisionCreate returned NULL
FAISS index loading time: 0.0094 seconds
[ WARN:0@30.969] global /usr/src/debug/opencv/4.5.5-r0/git/modules/videoio/src/cap_gstreamer.cpp (1405) open OpenCV | GStreamer warning: Cannot query video position: status=0, value=-1, duration=-1
Frame read time: 0.0199 seconds
Frame preprocessing time: 0.0794 seconds
967.244701 s: VX_ZONE_ERROR:[ownContextSendCmd:868] Command ack message returned failure cmd_status: -1
967.244765 s: VX_ZONE_ERROR:[ownNodeKernelInit:584] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
967.244778 s: VX_ZONE_ERROR:[ownNodeKernelInit:585] Please be sure the target callbacks have been registered for this core
[ 961.278137] audit: type=1701 audit(1725554704.148:30): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=1591 comm="pt_main_thread" exe="/usr/bin/python3.10" sig=4 res=1
967.244790 s: VX_ZONE_ERROR:[ownNodeKernelInit:586] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
967.244805 s: VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:2 ... failed !!!
967.244824 s: VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
[ 961.327236] audit: type=1334 audit(1725554704.196:31): prog-id=19 op=LOAD
967.244835 s: VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
[ 961.337610] audit: type=1334 audit(1725554704.208:32): prog-id=20 op=LOAD
967.244952 s: VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:812] graph is not in a state required to be scheduled
967.244964 s: VX_ZONE_ERROR:[vxProcessGraph:747] schedule graph failed
967.244974 s: VX_ZONE_ERROR:[vxProcessGraph:752] wait graph failed
ERROR: Running TIDL graph ... Failed !!!
Model inference time: 0.0090 seconds
[C7x_1 ] 967.244438 s: VX_ZONE_ERROR:[tivxAlgiVisionAllocMem:194] Failed to Allocate memory record 5 @ space = 17 and size = 26315776 !!!
[C7x_1 ] 967.244474 s: VX_ZONE_ERROR:[tivxAlgiVisionCreate:358] tivxAlgiVisionAllocMem Failed
[C7x_1 ] 967.244516 s: VX_ZONE_ERROR:[tivxKernelTIDLCreate:912] tivxAlgiVisionCreate returned NULL
Shape of feature before pooling: torch.Size([1, 1024, 14, 14])
Shape of feature before pooling: torch.Size([1, 512, 28, 28])
[ 1093.981781] audit: type=1334 audit(1725554836.852:33): prog-id=20 op=UNLOAD
[ 1093.988863] audit: type=1334 audit(1725554836.852:34): prog-id=19 op=UNLOAD
Illegal instruction (core dumped)

And no, I don't think upgrading the software would be the best choice for me right now. Thank you.
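
The `[C7x_1 ]` lines are the new information in this log: the remote core reports a failed memory allocation during TIDL node creation. A small sketch for pulling those records out of a saved log is shown below. `find_alloc_failures` is a hypothetical helper, the pattern is written only against the message format seen in this thread, and no meaning is assumed for the record or space numbers.

```python
import re

# Matches the allocation failure message exactly as it appears in the log above.
ALLOC_FAILURE = re.compile(
    r"Failed to Allocate memory record (\d+) @ space = (\d+) and size = (\d+)"
)


def find_alloc_failures(log_text):
    """Return (record, space, size_bytes) for each allocation failure in the log."""
    return [tuple(int(g) for g in m.groups()) for m in ALLOC_FAILURE.finditer(log_text)]


sample = (
    "[C7x_1 ] 966.911834 s: VX_ZONE_ERROR:[tivxAlgiVisionAllocMem:194] "
    "Failed to Allocate memory record 5 @ space = 17 and size = 26315776 !!!"
)
failures = find_alloc_failures(sample)
for record, space, size in failures:
    print(f"record {record}, space {space}: {size} bytes ({size / 2**20:.1f} MiB)")
```

Here the failed request is 26315776 bytes, roughly 25 MiB.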

0 Pragya Kapoor 1 month ago in reply to Pragya Kapoor

Prodigy 40 points

Hello, I think I figured out what the issue was. The features were returned in the order [input.332, input.184.buffered], but my script expected [input.184.buffered, input.332]. After swapping the feature positions, the artifacts now work with the workaround you suggested.

However, while model inference time has dropped now that the model runs on the accelerator, the search algorithm I am using (FAISS) is quite slow and is the bottleneck in my inference script. Is there any way to deploy FAISS on the accelerator as well? Thank you for all your help :)
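
To avoid depending on the position in which the session returns its outputs, the features can be looked up by name before they reach the postprocessing. This is a minimal sketch: `order_outputs` is a hypothetical helper, and the strings below stand in for the real tensors.

```python
def order_outputs(output_names, outputs, wanted_order):
    """Return outputs rearranged to follow wanted_order, looked up by name."""
    by_name = dict(zip(output_names, outputs))
    missing = [name for name in wanted_order if name not in by_name]
    if missing:
        raise KeyError(f"outputs not produced by the session: {missing}")
    return [by_name[name] for name in wanted_order]


# Names as reported in this thread; the values are placeholders for tensors.
session_names = ["input.332", "input.184.BUFFERED"]
session_outputs = ["feat_1024x14x14", "feat_512x28x28"]
features = order_outputs(
    session_names, session_outputs, ["input.184.BUFFERED", "input.332"]
)
print(features)
```

In the inference script, `session_names` would be the `output_names` list and `session_outputs` the result of `ort_session.run`.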

0 Reese Grimsley 1 month ago in reply to Pragya Kapoor

TI__Genius 9316 points

Hi Pragya,

To be clear, you resolved the error that was showing up in the previous message, correct?

Pragya Kapoor said:
[C7x_1 ] 966.911834 s: VX_ZONE_ERROR:[tivxAlgiVisionAllocMem:194] Failed to Allocate memory record 5 @ space = 17 and size = 26315776 !!!
[C7x_1 ] 966.911870 s: VX_ZONE_ERROR:[tivxAlgiVisionCreate:358] tivxAlgiVisionAllocMem Failed
[C7x_1 ] 966.911901 s: VX_ZONE_ERROR:[tivxKernelTIDLCreate:912] tivxAlgiVisionCreate returned NULL

If that's fixed with your model update, then great progress!

Pragya Kapoor said:
search algo I am using that is the FAISS

I'm not familiar with this algorithm, so it's hard to say. We do not currently support general purpose programming of the C7x DSP, so if you wanted an algorithm like this to run on the accelerator, it would need to be composed of supported NN operators. I estimate that is not true, or at least nontrivial. Depending on how much speedup you need, this may require a creative solution.

Best Regards,
Reese

0 Pragya Kapoor 1 month ago in reply to Reese Grimsley

Prodigy 40 points

Yes, I resolved that issue. But I don't know why, when I use the model with the artifacts (that is, with TIDLExecutionProvider to deploy it on the accelerator), it gives me the same feature vector as output for every frame. That is odd, because the model in isolation (running on just CPUExecutionProvider) does not do that.

This makes me feel that the generated artifacts could still be problematic.

These are my artifacts : drive.google.com/.../18FYxpObW7Rp6MInwUvXbfr0VWB9e-RBj

0 Reese Grimsley 1 month ago in reply to Pragya Kapoor

TI__Genius 9316 points

Hi Pragya,

Pragya Kapoor said:
it gives me the same feature vector as output for every frame

That is strange. This makes me think that a fault happened earlier, and that the model isn't truly running. I have seen this before where a model seems to run without issue but the accelerator has actually fallen into an unstable state from a previous error. In this case, a chunk of data placed at the output's location in memory is being read every time, and is not impacted by the network (seemingly) running.

I assume you are changing the input -- otherwise the same input should always produce the same output.

I'll take a look at the artifacts too. I'd suggest:

- Restart the EVM so the software is in a fresh state.
- Run `/opt/vx_app_arm_remote_log.out &` in the background.
- Run `export TIDL_RT_DEBUG=1`.
- Run your script.
- Save the output and share the log here.

At that point, is the error still persisting?

BR,
Reese
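
The "set the variable, run the script, save the output" part of these steps can be sketched in Python as follows. This is only an illustration: `run_with_debug_log` is a hypothetical helper, it captures only the child process's stdout and stderr (not kernel messages or the output of the background remote log tool), and the demo runs a harmless child process instead of the real inference script.

```python
import os
import subprocess
import sys
import tempfile


def run_with_debug_log(command, log_path, extra_env=None):
    """Run command with extra environment variables, saving stdout and stderr to log_path."""
    env = dict(os.environ)
    env.update(extra_env or {})
    with open(log_path, "w") as log_file:
        completed = subprocess.run(
            command, env=env, stdout=log_file, stderr=subprocess.STDOUT
        )
    return completed.returncode


# Demo: the child process just echoes the variable it received.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "tidl_debug_log.txt")
    code = run_with_debug_log(
        [sys.executable, "-c", "import os; print(os.environ['TIDL_RT_DEBUG'])"],
        log_path,
        {"TIDL_RT_DEBUG": "1"},
    )
    with open(log_path) as f:
        captured = f.read()
print(code, captured.strip())
```

On the EVM, the command would instead be the inference script, for example `[sys.executable, "live_detection_onnx_runtime"]`, with a log path outside a temporary directory.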

+1 Pragya Kapoor 1 month ago in reply to Reese Grimsley

Prodigy 40 points

Hello, please find attached the log file: tidl_debug_log.txt

For some reason, I also got these errors while running my script (not on every run, only on some):
Final number of subgraphs created are : 1, - Offloaded Nodes - 98, Total Nodes - 98
APP: Init ... !!!
MEM: Init ... !!!
MEM: Initialized DMA HEAP (fd=5) !!!
MEM: Init ... Done !!!
IPC: Init ... !!!
IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
1883.029103 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
1883.029326 s: VX_ZONE_INIT:Enabled
1883.029355 s: VX_ZONE_ERROR:Enabled
1883.029365 s: VX_ZONE_WARNING:Enabled
1883.030861 s: VX_ZONE_INIT:[tivxInitLocal:130] Initialization Done !!!
1883.031096 s: VX_ZONE_INIT:[tivxHostInitLocal:101] Initialization Done for HOST !!!
1883.074069 s: VX_ZONE_ERROR:[ownContextSendCmd:868] Command ack message returned failure cmd_status: -1
1883.074115 s: VX_ZONE_ERROR:[ownNodeKernelInit:584] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
1883.074139 s: VX_ZONE_ERROR:[ownNodeKernelInit:585] Please be sure the target callbacks have been registered for this core
1883.074150 s: VX_ZONE_ERROR:[ownNodeKernelInit:586] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
1883.074165 s: VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:2 ... failed !!!
1883.074183 s: VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
1883.074194 s: VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
TIDL_RT_OVX: ERROR: Verifying TIDL graph ... Failed !!!
TIDL_RT_OVX: ERROR: Verify OpenVX graph failed
[ WARN:0@30.965] global /usr/src/debug/opencv/4.5.5-r0/git/modules/videoio/src/cap_gstreamer.cpp (1405) open OpenCV | GStreamer warning: Cannot query video position: status=0, value=-1, duration=-1
1883.629457 s: VX_ZONE_ERROR:[ownContextSendCmd:868] Command ack message returned failure cmd_status: -1
1883.629502 s: VX_ZONE_ERROR:[ownNodeKernelInit:584] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
1883.629526 s: VX_ZONE_ERROR:[ownNodeKernelInit:585] Please be sure the target callbacks have been registered for this core
1883.629538 s: VX_ZONE_ERROR:[ownNodeKernelInit:586] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
1883.629553 s: VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:2 ... failed !!!
1883.629572 s: VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
1883.629583 s: VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
1883.629701 s: VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:812] graph is not in a state required to be scheduled
1883.629713 s: VX_ZONE_ERROR:[vxProcessGraph:747] schedule graph failed
1883.629724 s: VX_ZONE_ERROR:[vxProcessGraph:752] wait graph failed
ERROR: Running TIDL graph ... Failed !!!
/opt/edgeai-gst-apps/PatchCore_anomaly_detection/live_detection_gui_train:323: RuntimeWarning: overflow encountered in exp
w = (1 - (np.max(np.exp(N_b)) / np.sum(np.exp(N_b))))
/opt/edgeai-gst-apps/PatchCore_anomaly_detection/live_detection_gui_train:323: RuntimeWarning: invalid value encountered in float_scalars
w = (1 - (np.max(np.exp(N_b)) / np.sum(np.exp(N_b))))
/opt/edgeai-gst-apps/PatchCore_anomaly_detection/train.py:181: RuntimeWarning: invalid value encountered in divide
return (image-a_min)/(a_max - a_min)
1883.821143 s: VX_ZONE_ERROR:[ownContextSendCmd:868] Command ack message returned failure cmd_status: -1
1883.821219 s: VX_ZONE_ERROR:[ownNodeKernelInit:584] Target kernel, TIVX_CMD_NODE_CREATE failed for node TIDLNode
1883.821262 s: VX_ZONE_ERROR:[ownNodeKernelInit:585] Please be sure the target callbacks have been registered for this core
1883.821275 s: VX_ZONE_ERROR:[ownNodeKernelInit:586] If the target callbacks have been registered, please ensure no errors are occurring within the create callback of this kernel
1883.821289 s: VX_ZONE_ERROR:[ownGraphNodeKernelInit:583] kernel init for node 0, kernel com.ti.tidl:1:2 ... failed !!!
1883.821309 s: VX_ZONE_ERROR:[vxVerifyGraph:2055] Node kernel init failed
1883.821320 s: VX_ZONE_ERROR:[vxVerifyGraph:2109] Graph verify failed
1883.821439 s: VX_ZONE_ERROR:[ownGraphScheduleGraphWrapper:812] graph is not in a state required to be scheduled
1883.821451 s: VX_ZONE_ERROR:[vxProcessGraph:747] schedule graph failed
1883.821462 s: VX_ZONE_ERROR:[vxProcessGraph:752] wait graph failed
ERROR: Running TIDL graph ... Failed !!!
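
Side note on the `overflow encountered in exp` warning in the log above: it points at `np.exp(N_b)`, which overflows once the distances in `N_b` get large. A minimal sketch of the usual remedy, subtracting the maximum before exponentiating, is below. The ratio is unchanged because the common factor cancels in numerator and denominator. The function name `reweight_factor` and the sample values are illustrative assumptions, not taken from the application.

```python
import numpy as np

def reweight_factor(n_b):
    """Return 1 - max(softmax(n_b)) without overflowing exp()."""
    n_b = np.asarray(n_b, dtype=np.float64)
    shifted = n_b - np.max(n_b)      # largest exponent becomes 0
    exp_shifted = np.exp(shifted)    # every value now lies in (0, 1]
    return 1.0 - np.max(exp_shifted) / np.sum(exp_shifted)

# Values this large overflow exp() in the unshifted form.
large = [1000.0, 999.0, 998.0]
print(reweight_factor(large))

# For small values the shifted and unshifted forms agree.
small = [1.0, 2.0, 3.0]
naive = 1.0 - np.max(np.exp(small)) / np.sum(np.exp(small))
print(np.isclose(reweight_factor(small), naive))
```

This only addresses the NumPy warning; it is unrelated to the VX_ZONE_ERROR lines, which come from the accelerator failing to create the TIDL node.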

0 Reese Grimsley 1 month ago in reply to Pragya Kapoor

TI__Genius 9316 points

Hi Pragya,

I'm going to load up the model artifacts and see what I can learn.

First reaction on seeing the logs is that there's some fail-silent behavior going on -- the model initialized the first time and seems to run well, but there is actually some fault that causes the model to not truly complete, and it just keeps replaying the same output tensor. Then when it tries to initialize the model the second time the application is run, the error is no longer silent, and it fails to set up the model on the accelerator. I have seen this behavior in the past, and I'll see if that's what we're running into here.

Do you happen to have a version of your compilation log saved? Sometimes that completes but displays warnings / errors indicating that there may be a failure later on the device. This seems more likely to happen when there are fairly large tensors (relative to the size of the 224kB L2 cache). If you don't have the log on hand, could you rerun compilation with debug_level=2?

Edit: I think I have an older version of the artifacts. I can't reproduce the error with these, but I also notice the model isn't giving output for both network outputs. This network shows 97 nodes as opposed to the 98 shown in your situation.

NB: this was still informative. The output is not static between different frames and iterations, nor is the model initialization unstable on this version of the model. I am worried that the workaround to force input.184 to be output squashed one bug and produced another. What type of node did you use to buffer this intermediate output from a network output? Identity?

Edit 2: Forgive me, I downloaded an older set of artifacts from an earlier link in this thread. I pulled one that has the add_output included as well, which I see is just adding 0 to the previous tensor.

This is working on my side -- I'm not seeing errors related to starting the model nor do I see static output feature maps. The model runs consistently in ~75ms. Outputs are deterministic for the same input and different for different inputs. Could you share the snippet of your application where you are configuring the runtime via ONNX and then calling it?

BR,
Reese

0 Pragya Kapoor 1 month ago in reply to Reese Grimsley

Prodigy 40 points

Sure, here is the snippet of my application: (I have also attached the entire script code)

import cv2
import numpy as np
import os
import torch
from torchvision import transforms
import onnxruntime
import faiss
from PIL import Image
from scipy.ndimage import gaussian_filter
import time

# Import necessary components from train.py
from train import embedding_concat, reshape_embedding, min_max_norm, cvt2heatmap, heatmap_on_image, get_args

# Define transforms (ensure these match those used in train.py)
data_transforms = transforms.Compose([
    transforms.Resize((256, 256), Image.LANCZOS),
    transforms.ToTensor(),
    transforms.CenterCrop(224),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

inv_normalize = transforms.Normalize(
    mean=[-0.485 / 0.229, -0.456 / 0.224, -0.406 / 0.225],
    std=[1 / 0.229, 1 / 0.224, 1 / 0.225]
)

# GStreamer pipeline function for video capture
def gstreamer_pipeline():
    return (
        'v4l2src device=/dev/video3 io-mode=dmabuf-import ! '
        'video/x-bayer, width=640, height=480, framerate=15/1, format=rggb10 ! '
        'tiovxisp sink_0::device=/dev/v4l-subdev2 sensor-name="SENSOR_SONY_IMX219_RPI" '
        'dcc-isp-file=/opt/imaging/imx219/linear/dcc_viss_10b_640x480.bin sink_0::dcc-2a-file=/opt/imaging/imx219/linear/dcc_2a_10b_640x480.bin format-msb=9 ! '
        'video/x-raw, format=NV12, width=640, height=480, framerate=15/1 ! videoconvert ! video/x-raw, format=BGR ! appsink'
    )


def main():
    # Start timing the model loading process
    start_time = time.time()

    # Load ONNX model
    onnx_model_path = '/opt/edgeai-tidl-artifacts/cl-ort-patchcore/patchcore_model_buffered.onnx'
    options = {
        'artifacts_folder': '/opt/edgeai-tidl-artifacts/cl-ort-patchcore'
    }
    so = onnxruntime.SessionOptions()
    onnx_session = onnxruntime.InferenceSession(onnx_model_path, providers=['TIDLExecutionProvider','CPUExecutionProvider'],provider_options=[options, {}], sess_options=so)

    model_load_time = time.time() - start_time
    print(f"Model loading time: {model_load_time:.4f} seconds")

    # Get arguments
    args = get_args()

    # Update the dataset path to your actual path
    args.dataset_path = r'C:\Users\Pragya Kapoor\Documents\mvtec\data'

    # Load the FAISS index
    start_time = time.time()
    index_path = os.path.join('./embeddings', args.category, 'index.faiss')
    index = faiss.read_index(index_path)
    if torch.cuda.is_available():
        res = faiss.StandardGpuResources()
        index = faiss.index_cpu_to_gpu(res, 0, index)
    faiss_load_time = time.time() - start_time
    print(f"FAISS index loading time: {faiss_load_time:.4f} seconds")

    # Function to run inference on ONNX model
    def run_onnx_inference(onnx_session, input_data):
        start_time = time.time()
        input_name = onnx_session.get_inputs()[0].name
        outputs = onnx_session.run(None, {input_name: input_data})
        print(len(outputs))
        inference_time = time.time() - start_time
        print(f"Model inference time: {inference_time:.4f} seconds")
        return outputs

    # Initialize video capture using GStreamer pipeline
    cap = cv2.VideoCapture(gstreamer_pipeline(), cv2.CAP_GSTREAMER)

    if not cap.isOpened():
        print("Error: Unable to open video source.")
        return

    frame_count = 0
    total_processing_time = 0

    while cap.isOpened():
        start_time = time.time()
        ret, frame = cap.read()
        if not ret:
            break

        frame_read_time = time.time() - start_time
        print(f"Frame read time: {frame_read_time:.4f} seconds")

        # Preprocess frame
        start_time = time.time()
        pil_img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        input_tensor = data_transforms(pil_img).unsqueeze(0).numpy()
        preprocessing_time = time.time() - start_time
        print(f"Frame preprocessing time: {preprocessing_time:.4f} seconds")

        # Run ONNX inference
        start_time = time.time()
        features = run_onnx_inference(onnx_session, input_tensor)
        features = features[1:]
        features = [features[1],features[0]]
        inference_time = time.time() - start_time

        # Convert features to tensors
        start_time = time.time()
        features = [torch.tensor(f) for f in features]
        print(features[0].shape, features[1].shape)

        # Extract embeddings and perform the same steps as in the test_step
        embeddings = []
        for feature in features:
            m = torch.nn.AvgPool2d(3, 1, 1)
            embeddings.append(m(feature))
        embedding_ = embedding_concat(embeddings[0], embeddings[1])
        embedding_test = np.array(reshape_embedding(np.array(embedding_)))
        feature_extraction_time = time.time() - start_time
        print(f"Feature extraction time: {feature_extraction_time:.4f} seconds")

        # Search the FAISS index
        start_time = time.time()
        score_patches, _ = index.search(embedding_test, k=args.n_neighbors)
        print(f"Score patches: {score_patches}")

        faiss_search_time = time.time() - start_time
        print(f"FAISS search time: {faiss_search_time:.4f} seconds")

        start_time = time.time()
        anomaly_map = score_patches[:, 0].reshape((28, 28))
        N_b = score_patches[np.argmax(score_patches[:, 0])]
        N_b_max = np.max(N_b)
        w = (1 - (np.max(np.exp(N_b - N_b_max)) / np.sum(np.exp(N_b - N_b_max))))

        score = w * max(score_patches[:, 0])  # Image-level score

        # Postprocess anomaly map
        anomaly_map_resized = cv2.resize(anomaly_map, (args.input_size, args.input_size))
        anomaly_map_resized_blur = gaussian_filter(anomaly_map_resized, sigma=4)

        anomaly_map_norm = min_max_norm(anomaly_map_resized_blur)
        anomaly_map_norm_hm = cvt2heatmap(anomaly_map_norm * 255)

        # Resize heatmap to match frame size
        anomaly_map_norm_hm_resized = cv2.resize(anomaly_map_norm_hm, (frame.shape[1], frame.shape[0]))
        heatmap_overlay_time = time.time() - start_time
        print(f"Heatmap overlay time: {heatmap_overlay_time:.4f} seconds")

        hm_on_img = heatmap_on_image(anomaly_map_norm_hm_resized, frame)

        # Display the anomaly score on the frame
        start_time = time.time()
        cv2.putText(hm_on_img, f'Anomaly Score: {score:.2f}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2,
                    cv2.LINE_AA)

        # Display result
        cv2.imshow('Anomaly Detection', hm_on_img)
        display_time = time.time() - start_time
        print(f"Display time: {display_time:.4f} seconds")

        frame_processing_time = time.time() - start_time
        total_processing_time += frame_processing_time
        frame_count += 1
        print(f"Frame processing time: {frame_processing_time:.4f} seconds")

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()

    average_processing_time = total_processing_time / frame_count if frame_count else 0
    print(f"Average frame processing time: {average_processing_time:.4f} seconds")


if __name__ == '__main__':
    main()

0 Reese Grimsley 1 month ago in reply to Pragya Kapoor

TI__Genius 9316 points

Hi Pragya,

I see you marked one of the previous responses as resolved -- was this intentional? If so, please let me know what the resolution was! This is helpful for others who may find this thread in the future.

I don't see anything suspicious in your source code -- it all looks ordinary to me. I'm not sure why you're running into this issue.

Just to be sure I'm looking at the same artifacts that are giving you an error, here are the md5sums of the artifacts I tested:

root@am62axx-evm:/PATH/TO/ARTIFACTS# md5sum ./* 

1d3b8df05a64f33767cab9e047018b9d  ./input.332_tidl_io_1.bin #paired to patchcore_model.onnx
bfba552f3a84c0a45391bd7c52afeb95  ./input.332_tidl_net.bin  #paired to patchcore_model.onnx
0132a5c35d26e1b96532a9872bbd2f35  ./input.332add_output_tidl_io_1.bin #paired to patchcore_model_buffered.onnx
fd006769ac21f6d5513463ff4d1d40d1  ./input.332add_output_tidl_net.bin #paired to patchcore_model_buffered.onnx
9c4a416304bc47eb5b19658aa1dcf952  ./onnxrtMetaData.txt
06b0dba6ba38a0d1df8a7a06992664eb  ./patchcore_model.onnx
fb013a94347d88222d2a0d1b062d2f60  ./patchcore_model_buffered.onnx
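
If `md5sum` is not convenient (for example, when comparing against a copy on a Windows host), the same check can be sketched in Python with only the standard library. The helper names below are illustrative assumptions; the digests to compare against would be the ones listed above.

```python
import hashlib
from pathlib import Path

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def md5_of_directory(directory):
    """Map each regular file name in `directory` to its MD5 digest."""
    return {
        entry.name: md5_of_file(entry)
        for entry in sorted(Path(directory).iterdir())
        if entry.is_file()
    }

def differing_files(local, reference):
    """List reference file names whose digest is missing or different in `local`."""
    return sorted(
        name for name, digest in reference.items()
        if local.get(name) != digest
    )
```

Calling `differing_files(md5_of_directory("/PATH/TO/ARTIFACTS"), reference)` with `reference` built from the list above returns the names that do not match; an empty list means the artifacts are identical.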

Can you do a quick version check on the SDK? It should look like the following:

root@am62axx-evm:~/model-test# echo $EDGEAI_SDK_VERSION 
09_01_00
root@am62axx-evm:~/model-test# echo $EDGEAI_VERSION     
9.1

Is this running on the SK-AM62A-LP EVM board? Has there been any change to memory maps (DDR regions designated for specific purposes within the software stack)? I assume not for the memory map, since this is a rather invasive change.

BR,
Reese

0 Pragya Kapoor 1 month ago in reply to Reese Grimsley

Prodigy 40 points

Hello, I think I marked it as resolved by mistake.

The versions are the same; I just checked.
I am also using the same artifacts. Which model are you using: patchcore_model or patchcore_model_buffered?

Could you please also check these artifacts: drive.google.com/.../1Q2eeYcSXPrTiuhl4a11NkiXNDSeIcwOg
They contain only one model.
Yes, it is running on the SK-AM62A-LP EVM board, and no changes have been made to the memory maps.
And yes, the problem still persists; I don't know why.

0 Reese Grimsley 1 month ago in reply to Pragya Kapoor

TI__Genius 9316 points

Hi Pragya, no problem.

I checked that this is using the patchcore_model_buffered version. I get 3 outputs from this model, but one of them resolves as "None" through the Python APIs. It looks like the buffered output is present, and this 'None' output is the original input.184.
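
Since one entry comes back as `None`, indexing the output list by position is fragile. A minimal sketch of pairing outputs with their names and dropping the `None` entries is below. The helper name `named_valid_outputs`, the ordering of the names, and the stand-in arrays are assumptions for illustration; in the application, the names would come from `onnx_session.get_outputs()` and the tensors from `onnx_session.run(None, ...)`.

```python
import numpy as np

def named_valid_outputs(output_names, outputs):
    """Pair each output name with its tensor, skipping entries that are None."""
    if len(output_names) != len(outputs):
        raise ValueError("names and outputs must have the same length")
    return {
        name: tensor
        for name, tensor in zip(output_names, outputs)
        if tensor is not None
    }

# Stand-in data shaped like the case in this thread: three outputs, first is None.
names = ["input.184", "output", "input.332"]   # hypothetical names and ordering
outputs = [None, np.zeros((1, 4, 2, 2)), np.ones((1, 8, 1, 1))]

valid = named_valid_outputs(names, outputs)
for name, tensor in valid.items():
    print(name, tensor.shape)
```

Looking tensors up by name afterwards avoids depending on how many `None` entries precede the ones you need.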

Let me see if I can replicate with this model -- I needed to request access, so you should see an email. I will also pass along my model testing script in a ZIP file. I will update this later today.

0 Reese Grimsley 1 month ago in reply to Reese Grimsley

TI__Genius 9316 points

Quick update:

I'm seeing the same behavior as before on my side, at least with my typical model test scripts. Have you tried running on static image files first? I find this to be a helpful early step if the model is giving issues. The rest of the pipeline can be added once we're sure the model is running consistently.

Please find those scripts within the attachment - you should untar these within the target filesystem. You may need to run the param-yaml-fixer.py script (single arg pointed at the directory with your param.yaml file) before using the model_speed_test.py script. This fixes some format inconsistencies that I've run into across a few different SDK versions. I run the tester script as follows:

python3 model_speed_test.py /PATH/TO/ARTIFACTS/ -n 2 -d 2 -p

This runs 2 iterations (-n) with debug_level 2 (-d), and prints the output (-p) of each inference. It uses a static image file -- I modify the "infile" variable within the script a few times to make sure it's not giving the same output for different inputs.

/cfs-file/__key/communityserver-discussions-components-files/791/model_5F00_tester.tar.gz

This is a strange issue -- it's usually easy to reproduce something like this.

Edit: I see in your logs/application that you're printing the score patches but not the output feature maps from the model. Can you verify that those are static as well? Let's also make sure the inputs to the model are not the same either; said another way, ensure the frame captured by cv2 looks correct / isn't the same frame each time. It looks like you added a few other libraries like torch and scipy. It would also be good to assert that those functions produce correct outputs from a known input.
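
One way to run that check inside the frame loop is to fingerprint the input tensor and one output feature map on every frame, and flag the case where the input changed but the output did not. This is only a sketch under my own assumptions: `tensor_fingerprint` and `StaticOutputDetector` are hypothetical helpers, not part of the attached scripts.

```python
import hashlib
import numpy as np

def tensor_fingerprint(tensor):
    """Short hash of a tensor's dtype, shape and raw bytes."""
    array = np.ascontiguousarray(tensor)
    digest = hashlib.sha1()
    digest.update(str(array.dtype).encode())
    digest.update(str(array.shape).encode())
    digest.update(array.tobytes())
    return digest.hexdigest()[:12]

class StaticOutputDetector:
    """Flags frames whose input changed while the output stayed identical."""

    def __init__(self):
        self.prev_input = None
        self.prev_output = None

    def update(self, input_tensor, output_tensor):
        in_fp = tensor_fingerprint(input_tensor)
        out_fp = tensor_fingerprint(output_tensor)
        suspicious = (
            self.prev_input is not None
            and in_fp != self.prev_input
            and out_fp == self.prev_output
        )
        self.prev_input, self.prev_output = in_fp, out_fp
        return suspicious
```

Calling `detector.update(input_tensor, features[0])` once per frame and printing a warning whenever it returns `True` would show whether the replayed output starts on the first frame or only after some number of inferences.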

BR,
Reese

0 Pragya Kapoor 1 month ago in reply to Reese Grimsley

Prodigy 40 points

Hello, I ran your script and have attached the output log, model_output.txt.

I also ran my script live_detection.py, the code for which I provided earlier, and have attached the log files (with input_tensor, feature_vector and score_patches) for runs both with and without the accelerator.

script_without_acc.txt script_with_acc.txt

model_output.txt

(1, 3, 224, 224)
[None, array([[[[ 69.809425,  85.32263 ,  69.809425, ..., 201.67168 ,
          209.42827 , 108.59244 ],
         [ 46.539616,  77.566025,  62.052822, ..., 232.69809 ,
          232.69809 ,  93.07923 ],
         [ 15.513206,  62.052822,  77.566025, ..., 201.67168 ,
          217.18488 , 100.83584 ],
         ...,
         [ 62.052822, 193.91507 , 209.42827 , ..., 255.9679  ,
          232.69809 ,  69.809425],
         [ 77.566025, 224.94148 , 240.45468 , ..., 217.18488 ,
          201.67168 ,  54.29622 ],
         [ 15.513206,  85.32263 , 100.83584 , ...,  93.07923 ,
          100.83584 ,  38.783012]],

        [[  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,  15.513206],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,  15.513206],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,  46.539616],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   7.756603],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,  93.07923 ],
         [ 54.29622 ,  23.269808,  46.539616, ...,  69.809425,
          124.105644, 162.88866 ]],

        [[  7.756603,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  7.756603,   0.      ,   7.756603, ...,   0.      ,
            0.      ,   0.      ],
         [  7.756603,   7.756603,  15.513206, ...,   7.756603,
            0.      ,   0.      ],
         ...,
         [  7.756603,   7.756603,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  7.756603,   7.756603,   7.756603, ...,   7.756603,
            7.756603,   0.      ],
         [  7.756603,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]],

        ...,

        [[  7.756603,   0.      ,   0.      , ...,   7.756603,
            7.756603,   7.756603],
         [  7.756603,   0.      ,   0.      , ...,   7.756603,
            7.756603,   7.756603],
         [  0.      ,   0.      ,   0.      , ...,   7.756603,
            7.756603,   7.756603],
         ...,
         [  7.756603,   0.      ,   7.756603, ...,   7.756603,
            7.756603,   7.756603],
         [  7.756603,   0.      ,   7.756603, ...,   7.756603,
            7.756603,   7.756603],
         [  7.756603,   7.756603,   7.756603, ...,   7.756603,
            7.756603,   7.756603]],

        [[  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,  23.269808, ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]],

        [[100.83584 ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   7.756603],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]]]], dtype=float32), array([[[[  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]],

        [[  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]],

        [[  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]],

        ...,

        [[ 35.35034 ,  35.35034 ,  35.35034 , ...,  35.35034 ,
           35.35034 ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [ 35.35034 ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,  35.35034 ,
           35.35034 ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]],

        [[353.50342 , 282.80273 , 247.4524  , ..., 318.15308 ,
          282.80273 , 318.15308 ],
         [388.85376 , 212.10205 , 106.051025, ..., 247.4524  ,
          176.75171 , 247.4524  ],
         [282.80273 ,  70.70068 ,  35.35034 , ..., 212.10205 ,
          212.10205 , 247.4524  ],
         ...,
         [282.80273 , 318.15308 , 247.4524  , ..., 212.10205 ,
          212.10205 , 247.4524  ],
         [318.15308 , 282.80273 , 282.80273 , ..., 282.80273 ,
          141.40137 , 247.4524  ],
         [388.85376 , 212.10205 , 247.4524  , ..., 212.10205 ,
          282.80273 , 318.15308 ]],

        [[  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]]]], dtype=float32)]
Iteration 1, Elapsed Time: 1.4226746559143066
[None, array([[[[ 69.809425,  85.32263 ,  69.809425, ..., 201.67168 ,
          209.42827 , 108.59244 ],
         [ 46.539616,  77.566025,  62.052822, ..., 232.69809 ,
          232.69809 ,  93.07923 ],
         [ 15.513206,  62.052822,  77.566025, ..., 201.67168 ,
          217.18488 , 100.83584 ],
         ...,
         [ 62.052822, 193.91507 , 209.42827 , ..., 255.9679  ,
          232.69809 ,  69.809425],
         [ 77.566025, 224.94148 , 240.45468 , ..., 217.18488 ,
          201.67168 ,  54.29622 ],
         [ 15.513206,  85.32263 , 100.83584 , ...,  93.07923 ,
          100.83584 ,  38.783012]],

        [[  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,  15.513206],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,  15.513206],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,  46.539616],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   7.756603],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,  93.07923 ],
         [ 54.29622 ,  23.269808,  46.539616, ...,  69.809425,
          124.105644, 162.88866 ]],

        [[  7.756603,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  7.756603,   0.      ,   7.756603, ...,   0.      ,
            0.      ,   0.      ],
         [  7.756603,   7.756603,  15.513206, ...,   7.756603,
            0.      ,   0.      ],
         ...,
         [  7.756603,   7.756603,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  7.756603,   7.756603,   7.756603, ...,   7.756603,
            7.756603,   0.      ],
         [  7.756603,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]],

        ...,

        [[  7.756603,   0.      ,   0.      , ...,   7.756603,
            7.756603,   7.756603],
         [  7.756603,   0.      ,   0.      , ...,   7.756603,
            7.756603,   7.756603],
         [  0.      ,   0.      ,   0.      , ...,   7.756603,
            7.756603,   7.756603],
         ...,
         [  7.756603,   0.      ,   7.756603, ...,   7.756603,
            7.756603,   7.756603],
         [  7.756603,   0.      ,   7.756603, ...,   7.756603,
            7.756603,   7.756603],
         [  7.756603,   7.756603,   7.756603, ...,   7.756603,
            7.756603,   7.756603]],

        [[  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,  23.269808, ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]],

        [[100.83584 ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   7.756603],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]]]], dtype=float32), array([[[[  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]],

        [[  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]],

        [[  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]],

        ...,

        [[ 35.35034 ,  35.35034 ,  35.35034 , ...,  35.35034 ,
           35.35034 ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [ 35.35034 ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,  35.35034 ,
           35.35034 ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]],

        [[353.50342 , 282.80273 , 247.4524  , ..., 318.15308 ,
          282.80273 , 318.15308 ],
         [388.85376 , 212.10205 , 106.051025, ..., 247.4524  ,
          176.75171 , 247.4524  ],
         [282.80273 ,  70.70068 ,  35.35034 , ..., 212.10205 ,
          212.10205 , 247.4524  ],
         ...,
         [282.80273 , 318.15308 , 247.4524  , ..., 212.10205 ,
          212.10205 , 247.4524  ],
         [318.15308 , 282.80273 , 282.80273 , ..., 282.80273 ,
          141.40137 , 247.4524  ],
         [388.85376 , 212.10205 , 247.4524  , ..., 212.10205 ,
          282.80273 , 318.15308 ]],

        [[  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         ...,
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ],
         [  0.      ,   0.      ,   0.      , ...,   0.      ,
            0.      ,   0.      ]]]], dtype=float32)]
Iteration 2, Elapsed Time: 1.4222159385681152
minimum execution time is 1422.216 ms
Total time to run 2 iterations was 2.8449 s
(75.252855, 74.68371, 0.0, 0.0)
{'ts:run_start': 219800239275, 'ts:run_end': 221218953690, 'ddr:read_start': 0, 'ddr:read_end': 0, 'ddr:write_start': 0, 'ddr:write_end': 0, 'ts:subgraph_input.332add_output_copy_in_start': 219800538410, 'ts:subgraph_input.332add_output_copy_in_end': 219803163280, 'ts:subgraph_input.332add_output_proc_start': 219803163540, 'ts:subgraph_input.332add_output_proc_end': 219877847250, 'ts:subgraph_input.332add_output_copy_out_start': 219877848725, 'ts:subgraph_input.332add_output_copy_out_end': 221218685415}

0 Reese Grimsley 1 month ago in reply to Pragya Kapoor

TI__Genius 9316 points

Hi Pragya,

Hmm, these results are interesting. I do see in your script_with_acc.txt that the results are indeed the same each time, even when the input changes. The model_speed_test.py script also shows the same output, but that is for the same input.

Do you find that the model fails to initialize/run during the tests at this point?

What I'm noticing in the output is that every value is one of two values: 0 or 7.7566. That's strange to me. I wonder if this issue is somehow related to clipping from the quantization values. I notice that the outputs for the network from the model_speed_test.py script are quite high as well. Further, I notice that all the output values are evenly divisible by 7.7566. I imagine this is not right, since the CPU / non-acc version had many low-value floating point numbers.
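One quick way to confirm that outputs sit on a coarse quantization grid is to check whether every value is an integer multiple of the smallest nonzero one. The following is a minimal sketch using values copied from the model_speed_test.py dump earlier in this thread, where the smallest nonzero value is 35.35034. The helper name `looks_quantized` and the tolerance are illustrative assumptions, not part of edgeai-tidl-tools.

```python
def looks_quantized(values, rel_tol=1e-4):
    """Return (step, ok).

    step is the smallest nonzero magnitude in `values`; ok is True when
    every value is within rel_tol of an integer multiple of step.
    """
    vals = [float(v) for v in values]
    nonzero = [abs(v) for v in vals if v != 0.0]
    if not nonzero:
        return 0.0, True  # all zeros: trivially on a grid
    step = min(nonzero)
    ok = all(abs(v / step - round(v / step)) < rel_tol for v in vals)
    return step, ok


# Values taken from the output dump above (hypothetical variable name).
sample = [0.0, 35.35034, 70.70068, 106.051025, 141.40137, 176.75171,
          212.10205, 247.4524, 282.80273, 318.15308, 353.50342, 388.85376]
step, ok = looks_quantized(sample)
print(step, ok)
```

For a real output tensor, the flattened array (for example `outputs[1].ravel().tolist()`) can be passed in. If only a handful of distinct multiples show up, the output is likely being clipped or quantized too coarsely.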

I think we should look at the quantization of the model at this stage:

1) Try recompiling the model with "tensor_bits" set to 16 (passed as part of the delegate options). This will impact performance, but we can use a hybrid of 8 and 16-bit quantization to optimize the accuracy vs. performance tradeoff.

2) Consider the calibration images and number of iterations. What is the setting for these? If not set as part of your model config, then the default values are within common_utils.py (calibration_iterations, calibration_frames).

You should use images from the training or validation set here if you aren't already. The default images may be resulting in poor quantization parameters when applied to your use case (a quick search for patchcore suggests defect detection, so very different from scenes in automotive and classifier info).
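As a rough sketch of the two suggestions above, the compile-time settings can be collected in a plain dictionary before being handed to the compilation session. The key names with the `advanced_options:` prefix, the `artifacts_folder` key, the helper name, the path and the numeric values are assumptions for illustration. Check common_utils.py in your edgeai-tidl-tools checkout for the exact names and defaults.

```python
def build_delegate_options(artifacts_folder, tensor_bits=8,
                           calibration_frames=2, calibration_iterations=5):
    """Collect compile-time options in one dict.

    Key names are assumptions; verify them against common_utils.py.
    """
    return {
        "artifacts_folder": artifacts_folder,
        "tensor_bits": tensor_bits,
        "advanced_options:calibration_frames": calibration_frames,
        "advanced_options:calibration_iterations": calibration_iterations,
    }


# Hypothetical values: 16-bit quantization and a larger calibration set.
opts = build_delegate_options("./model-artifacts/cl-ort-patchcore",
                              tensor_bits=16,
                              calibration_frames=20,
                              calibration_iterations=10)
for key, value in opts.items():
    print(key, "=", value)
```

Keeping the options in one place makes it easier to confirm which quantization settings were actually used for a given set of artifacts.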

BR,
Reese

0 Pragya Kapoor 1 month ago in reply to Reese Grimsley

Prodigy 40 points

Hello,

1. I tried recompiling the model with tensor_bits set to 16.
2. Calibration images are set to 2 and iterations to 5 (default).
3. I used images from my training data in the compilation script.

Even after doing all this, the result is the same :(

0 Reese Grimsley 1 month ago in reply to Pragya Kapoor

TI__Genius 9316 points

Hi Pragya,

Pragya Kapoor said:
1. I tried recompiling the model with tensor_bits set to 16

Did the performance change noticeably, and did the outputs still look identical (multiples of this 7.7566 value)? I'm surprised the issue persisted after this, and I want to be sure the setting applied correctly. The SVG files within the artifacts/tempDir directory would be helpful for me to look at too.

edit: One more thing popped up in my mind as it relates to preprocessing.

You have preprocessing values of 0.485, 0.456 and 0.406 for the mean, which gets subtracted from the input. I recall these values as being the traditional means in pytorch for imagenet classification models (and by extension, most other models that start from such a feature extractor / spine). In pytorch, I believe the inputs are normalized to [0,1] first.

In our preprocessing with edgeai-tidl-tools, we read images as uint8, so the starting distribution is between 0 and 255. If I look at the mean in some of our model_configs, multiplying those pytorch values by 255 results in the 'mean' values used in those model configs. Similarly, dividing the scales from your model by 255 results in scale values very close to some of our other models.

pytorch transforms for imagenet: https://github.com/pytorch/examples/blob/cdef4d43fb1a2c6c4349daa5080e4e8731c34569/imagenet/main.py#L236
- the means make sense to me
- The std/scales in pytorch don't map so well (0.229 vs. our 0.017125). However, your scales do have some linear relation to the others in our model configs (4.366 / 255 = 0.017)
- The actual transform code will subtract the mean and divide the result by the std
  - we subtract the mean and multiply by the scale, so the std value should be inverted. That looks like what you've done

This may be coincidental, but seems suspicious enough to mention. Subtracting a mean of 0.45 and dividing by 4.4 doesn't seem like an intentional choice w.r.t. the resulting distribution. This concept is something we don't have well documented, and I realize that would be a source of confusion (noted in my docs backlog now). I suggest revisiting these parameters on your side. I would suggest multiplying your mean values by 255 and dividing your scales (1/std) by 255. Modify those and recompile -- they will have a first order effect on the quantization of the model.
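The conversion described above can be written out as a short worked example. This is a minimal sketch that assumes, as described in this post, that the preprocessing subtracts the mean and then multiplies by the scale on 0-255 input. The function name `torch_norm_to_uint8_params` is hypothetical.

```python
def torch_norm_to_uint8_params(mean, std):
    """Convert [0,1]-domain mean/std into mean/scale for 0-255 input."""
    mean_255 = [m * 255.0 for m in mean]           # mean moves to the 0-255 range
    scale_255 = [1.0 / (s * 255.0) for s in std]   # scale = (1/std) / 255
    return mean_255, scale_255


mean, scale = torch_norm_to_uint8_params([0.485, 0.456, 0.406],
                                         [0.229, 0.224, 0.225])
print([round(m, 3) for m in mean])    # [123.675, 116.28, 103.53]
print([round(s, 6) for s in scale])   # [0.017125, 0.017507, 0.017429]

# Both formulations give the same normalized value for a uint8 pixel.
pixel = 200
torch_style = (pixel / 255.0 - 0.485) / 0.229
uint8_style = (pixel - mean[0]) * scale[0]
print(abs(torch_style - uint8_style) < 1e-9)
```

The two forms are algebraically identical, since (x/255 - m)/s = (x - 255m) * 1/(255s), so only the values stored in the model config change, not the normalization the network sees.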

BR,
Reese

0 Pragya Kapoor 1 month ago in reply to Reese Grimsley

Prodigy 40 points

Hello Reese,

I tried your suggestions and used this as my model config :

'cl-ort-patchcore' : { 'model_path' : os.path.join(models_base_path,
'patchcore_model_buffered.onnx'),
'mean': [123.675, 116.28, 103.53],
'scale' : [0.017125, 0.017507, 0.017429],
'num_images' : 2,
'num_classes': 2,
'session_name' : 'onnxrt' ,
'model_type': 'classification' }

This helped a little: the model outputs and scores are now being updated. However, the score values are too high and the heatmap still isn't changing.
I feel the output values being generated are still not entirely correct.
I am assuming I am still supposed to use these values in my inference script:

mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]

I have attached the log file using these new artifacts and the link to the new artifacts : drive.google.com/.../1hNAvUqlz3-oJ4WNqRDZebvilGTG3Rf5w

8400.script_with_acc.txt

edit: I had forgotten that I had changed tensor_bits to 16, which is probably what caused the score values to be too high and the heatmap to remain incorrect. I changed it back to tensor_bits 8 and now everything is working as expected. So, in conclusion, setting the correct mean, std and tensor_bits values solved the problem. Thank you so much for all your help :)

0 Reese Grimsley 1 month ago in reply to Pragya Kapoor

TI__Genius 9316 points

Hi Pragya,

Great, I'm glad to hear that this was the solution.

I think our documentation would be aided by a note about these preprocessing parameters, how they are applied, and what types of common values are used (e.g. these are straight from pytorch, but that is not obvious). The edgeai-tidl-tools/docs/custom_model_evaluation.md page is where that information should go.

You're welcome for the help!

BR,
-Reese