SK-AM62A-LP: Unable to perform inferencing using custom tflite model on DL accelerator

Part Number: SK-AM62A-LP

Hello team,

My team and I are working on running an ML model capable of performing low-light image enhancement on the AM62A EVM. We are trying to offload some of the processing to the DL accelerator. We were able to compile the model using the edgeai_tidl_tools and generate the artifacts of the model. 

We used the edgeai_gst_apps to run the model. We wrote the following config file: 

title: "Low Light Enhancement"
log_level: 2
inputs:
    input0:
        source: /dev/video-usb-cam0
        format: jpeg
        width: 1280
        height: 720
        framerate: 30
    input1:
        source: /home/weston/image.jpg
        width: 600
        height: 400
        index: 0
        framerate: 1
        loop: False
models:
    model0:
        model_path: /opt/model_zoo/zero_dce_model
outputs:
    output0:
        sink: kmssink
        width: 1920
        height: 1080
        overlay-perf-type: graph
    output1:
        sink: /home/weston/image_enhanced.jpg  
        width: 600
        height: 400
flows:
    flow0: [input1,model0,output1]
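
As a quick sanity check on a config like the one above, here is a minimal sketch that confirms every flow refers to an input, model, and output that are actually defined. It assumes the YAML has already been loaded into a plain dictionary; the `check_flows` helper is hypothetical and not part of edgeai_gst_apps.

```python
def check_flows(config):
    """Return a list of problems found in the 'flows' section of a parsed config."""
    problems = []
    kinds = ("input", "model", "output")
    sections = (
        config.get("inputs", {}),
        config.get("models", {}),
        config.get("outputs", {}),
    )
    for flow_name, flow in config.get("flows", {}).items():
        if len(flow) < 3:
            problems.append(f"{flow_name}: expected at least [input, model, output]")
            continue
        # Only the first three entries are checked; any extra entries are ignored.
        for kind, section, name in zip(kinds, sections, flow):
            if name not in section:
                problems.append(f"{flow_name}: unknown {kind} '{name}'")
    return problems


# Dictionary mirroring the names used in the config file above.
config = {
    "inputs": {"input0": {}, "input1": {}},
    "models": {"model0": {}},
    "outputs": {"output0": {}, "output1": {}},
    "flows": {"flow0": ["input1", "model0", "output1"]},
}
print(check_flows(config))  # an empty list means every reference resolves
```

This only checks that the names are consistent; it does not verify that the files or devices behind them exist.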

The paths in the config file have been verified. However, running the app_edgeai.py script produces the following output, and the enhanced image is not generated:

 Number of subgraphs:8 , 43 nodes delegated out of 51 nodes

APP: Init ... !!!
    72.527621 s: MEM: Init ... !!!
    72.527692 s: MEM: Initialized DMA HEAP (fd=6) !!!
    72.527880 s: MEM: Init ... Done !!!
    72.527909 s: IPC: Init ... !!!
    72.544456 s: IPC: Init ... Done !!!
REMOTE_SERVICE: Init ... !!!
REMOTE_SERVICE: Init ... Done !!!
    72.552242 s: GTC Frequency = 200 MHz
APP: Init ... Done !!!
    72.555707 s:  VX_ZONE_INIT:Enabled
    72.555749 s:  VX_ZONE_ERROR:Enabled
    72.555759 s:  VX_ZONE_WARNING:Enabled
    72.556967 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target MPU-0
    72.557151 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target MPU-1
    72.557257 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target MPU-2
    72.557352 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target MPU-3
    72.557366 s:  VX_ZONE_INIT:[tivxInitLocal:136] Initialization Done !!!
    72.558931 s:  VX_ZONE_INIT:[tivxHostInitLocal:106] Initialization Done for HOST !!!
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
==========[INPUT PIPELINE(S)]==========

[PIPE-0]

multifilesrc location=/home/weston/image.jpg index=1 ! jpegdec ! videoscale qos=True ! capsfilter caps="video/x-raw, width=(int)600, height=(int)400;" ! tiovxdlcolorconvert ! capsfilter caps="video/x-raw, format=(string)NV12;" ! tiovxmultiscaler name=split_01  
split_01. ! queue ! capsfilter caps="video/x-raw, width=(int)600, height=(int)400;" ! tiovxdlcolorconvert out-pool-size=4 ! capsfilter caps="video/x-raw, format=(string)RGB;" ! appsink max-buffers=2 drop=True name=sen_0
split_01. ! queue ! capsfilter caps="video/x-raw, width=(int)600, height=(int)400;" ! videoscale qos=True ! capsfilter caps="video/x-raw, width=(int)600, height=(int)600;" ! tiovxdlpreproc out-pool-size=4 channel-order=1 ! capsfilter caps="application/x-tensor-tiovx;" ! appsink max-buffers=2 drop=True name=pre_0


==========[OUTPUT PIPELINE]==========

    73.144299 s: MEM: ERROR: Alloc failed with status = 12 !!!
    73.144351 s:  VX_ZONE_ERROR:[tivxMemBufferAlloc:90] Shared mem ptr allocation failed
    73.145215 s: MEM: ERROR: Alloc failed with status = 12 !!!
    73.145253 s:  VX_ZONE_ERROR:[tivxMemBufferAlloc:90] Shared mem ptr allocation failed
    73.151641 s: MEM: ERROR: Alloc failed with status = 12 !!!
    73.151696 s:  VX_ZONE_ERROR:[tivxMemBufferAlloc:90] Shared mem ptr allocation failed
appsrc do-timestamp=True format=3 block=True name=post_0 ! tiovxdlcolorconvert ! capsfilter caps="video/x-raw, format=(string)NV12, width=(int)600, height=(int)400;" ! v4l2jpegenc ! multifilesink sync=False location=/home/weston/image_enhanced.jpg

    73.183287 s: MEM: ERROR: Alloc failed with status = 12 !!!
    73.183343 s:  VX_ZONE_ERROR:[tivxMemBufferAlloc:90] Shared mem ptr allocation failed
    73.184087 s: MEM: ERROR: Alloc failed with status = 12 !!!
    73.184120 s:  VX_ZONE_ERROR:[tivxMemBufferAlloc:90] Shared mem ptr allocation failed
    73.184637 s: MEM: ERROR: Alloc failed with status = 12 !!!
    73.184662 s:  VX_ZONE_ERROR:[tivxMemBufferAlloc:90] Shared mem ptr allocation failed
    73.185169 s: MEM: ERROR: Alloc failed with status = 12 !!!
    73.185190 s:  VX_ZONE_ERROR:[tivxMemBufferAlloc:90] Shared mem ptr allocation failed


 +--------------------------------------------------------------------------+
 | Low Light Enhancement                                                    |
 +--------------------------------------------------------------------------+
 +--------------------------------------------------------------------------+
 | Input Src: /home/weston/image.jpg                                        |
 | Model Name: zero_dce_model                                               |
 | Model Type: custom                                                       |
 +--------------------------------------------------------------------------+
 +--------------------------------------------------------------------------+


 +--------------------------------------------------------------------------+
 | Low Light Enhancement                                                    |
 +--------------------------------------------------------------------------+
 +--------------------------------------------------------------------------+
 | Input Src: /home/weston/image.jpg                                        |
 | Model Name: zero_dce_model                                               |
 | Model Type: custom                                                       |
 +--------------------------------------------------------------------------+
 +--------------------------------------------------------------------------+


    75.493006 s:  VX_ZONE_WARNING:[vxReleaseContext:1213] Found a reference 0xffff781164a8 of type 00000815 at external count 7, internal count 0, releasing it
    75.493056 s:  VX_ZONE_WARNING:[vxReleaseContext:1215] Releasing reference (name=tensor_202) now as a part of garbage collection
    75.493091 s:  VX_ZONE_WARNING:[vxReleaseContext:1213] Found a reference 0xffff780f7a18 of type 00000813 at external count 1, internal count 0, releasing it
    75.493104 s:  VX_ZONE_WARNING:[vxReleaseContext:1215] Releasing reference (name=object_array_208) now as a part of garbage collection
    75.493524 s:  VX_ZONE_WARNING:[vxReleaseContext:1213] Found a reference 0xffff780f7bc0 of type 00000813 at external count 1, internal count 0, releasing it
    75.493539 s:  VX_ZONE_WARNING:[vxReleaseContext:1215] Releasing reference (name=object_array_210) now as a part of garbage collection
    75.493919 s:  VX_ZONE_WARNING:[vxReleaseContext:1213] Found a reference 0xffff780f7d68 of type 00000813 at external count 1, internal count 0, releasing it
    75.493934 s:  VX_ZONE_WARNING:[vxReleaseContext:1215] Releasing reference (name=object_array_214) now as a part of garbage collection
    75.494401 s:  VX_ZONE_INIT:[tivxHostDeInitLocal:120] De-Initialization Done for HOST !!!
    75.499484 s:  VX_ZONE_INIT:[tivxDeInitLocal:206] De-Initialization Done !!!
APP: Deinit ... !!!
REMOTE_SERVICE: Deinit ... !!!
REMOTE_SERVICE: Deinit ... Done !!!
    75.499976 s: IPC: Deinit ... !!!
    75.500485 s: IPC: DeInit ... Done !!!
    75.500532 s: MEM: Deinit ... !!!
    75.500549 s: DDR_SHARED_MEM: Alloc's: 109 alloc's of 200056184 bytes
    75.500561 s: DDR_SHARED_MEM: Free's : 109 free's  of 200056184 bytes
    75.500571 s: DDR_SHARED_MEM: Open's : 0 allocs  of 0 bytes
    75.500587 s: MEM: Deinit ... Done !!!
APP: Deinit ... Done !!!

The compiled model
zero_dce_model.zip

  • Hello,

    I took a look at your model and model artifacts. The model should not be taking enough memory / DDR to run out of memory, as some of the printouts suggest. More likely, something failed during compilation. 
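
For reference, a small sketch of how the "status = 12" value in those printouts can be read, under the assumption that the allocator reports a standard Linux errno value (the log itself does not confirm this). The `describe_status` helper is hypothetical.

```python
import errno
import os


def describe_status(status):
    """Map a numeric status to its errno name and message, assuming it is an errno."""
    name = errno.errorcode.get(status, "UNKNOWN")
    return name, os.strerror(status)


name, message = describe_status(12)
print(name, "-", message)
```

If that assumption holds, 12 corresponds to ENOMEM, which is why the printouts read as an out-of-memory condition even when the model itself is small.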

    Your model has many subgraphs due to the Square nodes, which we don't support (see the documentation for supported ops). I recommend modifying your model to replace each Square node with a Mul node that multiplies the tensor by itself.

    Having many subgraphs increases the likelihood of errors, since there are more corner cases in this scenario. The tools are more stable for models that compile to 1-2 subgraphs (yours has 8, and the maximum is 16).
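
A minimal sketch for pulling the subgraph and delegation counts out of a log, so they can be compared before and after a model change. It assumes the "Number of subgraphs" line keeps the format seen in this thread; the `delegation_summary` helper is hypothetical.

```python
import re

# Matches lines such as: "Number of subgraphs:8 , 43 nodes delegated out of 51 nodes"
_PATTERN = re.compile(
    r"Number of subgraphs:\s*(\d+)\s*,\s*(\d+) nodes delegated out of (\d+) nodes"
)


def delegation_summary(log_text):
    """Return subgraph and node counts from a log, or None if the line is absent."""
    match = _PATTERN.search(log_text)
    if match is None:
        return None
    subgraphs, delegated, total = map(int, match.groups())
    return {
        "subgraphs": subgraphs,
        "delegated": delegated,
        "total": total,
        "on_cpu": total - delegated,  # nodes that were not delegated
    }


print(delegation_summary(" Number of subgraphs:8 , 43 nodes delegated out of 51 nodes"))
```

A result with `on_cpu` equal to 0 and a single subgraph is the case the tools handle most reliably.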

    Please also set the batch size to 1 instead of a dynamic value like -1.

    At this stage, I would suggest running this model without a full imaging pipeline, so you can check the model in a simpler application.

    I recommend using edgeai-tidl-tools/examples/osrt_python/ort/onnxrt_ep.py for this testing. You can use this to compile your model, then transfer most of the repo to the SDK on the target (you can transfer everything except the tidl_tools directory, which is only for compilation/emulation on x86). You then run it the same way as in host emulation mode on PC (python3 onnxrt_ep.py -m MODEL_NAME).

    When you run your model in this mode, where it uses just a single image, please run with debug_level=2. You can set this in your model config as part of 'optional_options' or set it globally within common_utils.py. Please also run /opt/vx_app_arm_remote_log.out in the background, then run onnxrt_ep.py and share the logs.

    Summarized:

    - In your model, replace Square with Mul or an equivalent node.
    - Recompile and run your model on target with edgeai-tidl-tools. If this is still failing, please share compilation and runtime logs with debug_level=2 and (only for EVM) with /opt/vx_app_arm_remote_log.out running in the background.

    BR,
    Reese

  • Hello Reese,

    Thanks for the advice. We replaced all the Square nodes with Mul nodes, and after recompiling the model, all the nodes ended up in a single subgraph.

    The new compiled model

    zero_dce_model_no_sq.zip

    Regarding running the model with edgeai-tidl-tools: since our model doesn't fall under any of the three available categories (classification, detection, or segmentation), the program has no logic for writing the output images generated by our model. Instead, we ran it with a custom Python script on the EVM:

    from tflite_runtime.interpreter import Interpreter as tfl
    from tflite_runtime.interpreter import load_delegate
    import numpy as np
    import cv2
    
    # Load the TIDL delegate with the compiled artifacts; debug_level=2 enables verbose logs.
    tidl_delegate_path = '/usr/lib/libtidl_tfl_delegate.so'
    tidl_options = {"artifacts_folder": "zero_dce_model_no_sq/artifacts", "debug_level": 2}
    tidl_delegate = load_delegate(tidl_delegate_path, options=tidl_options)
    
    modelpath = 'zero_dce_model_no_sq/model/zero_dce_model_no_sq.tflite'
    
    # cv2.imread returns None on failure and loads pixels in BGR channel order.
    input_image = cv2.imread("image.png")
    if input_image is None:
        raise FileNotFoundError("could not read image.png")
    
    interpret = tfl(model_path=modelpath, experimental_delegates=[tidl_delegate])
    interpret.allocate_tensors()
    input_details = interpret.get_input_details()
    output_details = interpret.get_output_details()
    
    # Resize to width=600, height=400, scale to [0, 1], and add the batch dimension.
    input_image = cv2.resize(input_image, (600, 400))
    input_data = input_image.astype(np.float32) / 255.0
    input_data = np.expand_dims(input_data, axis=0)
    interpret.set_tensor(input_details[0]['index'], input_data)
    interpret.invoke()
    
    # Scale the output back to 8-bit and write it to disk.
    output_data = interpret.get_tensor(output_details[0]['index'])
    output_image_data = (output_data[0] * 255).clip(0, 255).astype(np.uint8)
    cv2.imwrite("image_processed_no_sq.png", output_image_data)
    print('Done')
    print('Closing')

    This script did run, but the output we got was a plain black image.
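
One way to narrow down a black output is to look at the raw value range of the output tensor before it is scaled to 8-bit. The sketch below is only a rough heuristic under the assumption that the model is expected to return values in [0, 1]; the `summarize` and `diagnose` helpers and their thresholds are hypothetical.

```python
def summarize(values):
    """Return (min, max, mean) of a flat sequence of numbers."""
    values = list(values)
    if not values:
        raise ValueError("empty tensor")
    return min(values), max(values), sum(values) / len(values)


def diagnose(values, eps=1e-6):
    """Classify the raw output range; thresholds are illustrative only."""
    lo, hi, _ = summarize(values)
    if hi - lo < eps:
        return "constant"        # every element identical
    if hi <= 1.0 / 255.0:
        return "near-zero"       # rounds to black after multiplying by 255
    if hi > 1.5:
        return "not-normalized"  # values are not in the assumed [0, 1] range
    return "ok"


print(diagnose([0.0] * 10))       # constant
print(diagnose([0.1, 0.5, 0.9]))  # ok
```

With the script above, this could be called as `diagnose(output_data.ravel().tolist())` right after `get_tensor`. Running the same check without the TIDL delegate would show whether the delegate changes the result.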

    The log generated by the custom script:

    prog_log.txt
     ****** In DelegatePrepare ****** 
    
     Number of subgraphs:1 , 51 nodes delegated out of 51 nodes 
     
    
     ****** In tidlDelegate::Init ****** 
    ************ in TIDL_subgraphRtCreate ************ 
     APP: Init ... !!!
       486.838453 s: MEM: Init ... !!!
       486.838496 s: MEM: Initialized DMA HEAP (fd=6) !!!
       486.838677 s: MEM: Init ... Done !!!
       486.838693 s: IPC: Init ... !!!
       486.855096 s: IPC: Init ... Done !!!
    REMOTE_SERVICE: Init ... !!!
    REMOTE_SERVICE: Init ... Done !!!
       486.861330 s: GTC Frequency = 200 MHz
    APP: Init ... Done !!!
       486.865418 s:  VX_ZONE_INIT:Enabled
       486.865433 s:  VX_ZONE_ERROR:Enabled
       486.865436 s:  VX_ZONE_WARNING:Enabled
       486.868129 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target MPU-0 
       486.868243 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target MPU-1 
       486.868335 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target MPU-2 
       486.868453 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target MPU-3 
       486.868458 s:  VX_ZONE_INIT:[tivxInitLocal:136] Initialization Done !!!
       486.869497 s:  VX_ZONE_INIT:[tivxHostInitLocal:106] Initialization Done for HOST !!!
    RT-Profile: TIDLRT_init_profiling 
    tidlrt_create            :       86814675 ns,
    tidl_rt_ovx_Init         :       33057695 ns,
    vxCreateContext          :        2664215 ns,
    init_tidl_tiovx          :        2990560 ns,
    create_graph_tidl_tiovx  :        4878950 ns,
    verify_graph_tidl_tiovx  :       42955500 ns,
    tivxTIDLLoadKernels      :          30150 ns,
    mapConfig                :         552880 ns,
    tivxAddKernelTIDL        :          82220 ns,
    mapNetwork               :        1782435 ns,
    setCreateParams          :         249540 ns,
    setArgs                  :         290495 ns,
    vxCreateUserDataObject   :          34530 ns,
    vxMapUserDataObject      :         908360 ns,
    memcopy_network_buffer   :         729845 ns,
    vxUnmapUserDataObject    :         107820 ns,
    ************ TIDL_subgraphRtCreate done ************ 
     
     ****** tidlDelegate::Prepare ****** 
     Outputs Tensor name and id -  StatefulPartitionedCall_1:0, 75
    
     ****** tidlDelegate::Invoke ****** 
    *******   In TIDL_subgraphRtInvoke  ******** 
     Layer,   Layer Cycles,kernelOnlyCycles, coreLoopCycles,LayerSetupCycles,dmaPipeupCycles, dmaPipeDownCycles, PrefetchCycles,copyKerCoeffCycles,LayerDeinitCycles,LastBlockCycles, paddingTrigger,    paddingWait,LayerWithoutPad,LayerHandleCopy,   BackupCycles,  RestoreCycles,Multic7xContextCopyCycles,
         1,        6664704,        6524641,        6583755,          10260,           6592,                 0,              0,                 0,              0,              0,            597,              1,              0,           5398,              0,              0,              0,
         2,         175751,          48791,          97572,           8463,          10213,                 0,              0,                 0,              0,              0,            839,              1,              0,           2503,              0,              0,              0,
         3,         213193,          57266,         138710,          10742,           6313,                 0,              0,                 0,              0,              0,            590,              1,              0,           2788,              0,              0,              0,
         4,         859286,         285318,         781582,          15859,           5083,                 0,              0,                 0,              0,              0,            607,              1,              0,           3443,              0,              0,              0,
         5,        2847123,        2368438,        2761321,          17552,          10821,                 0,              0,                 0,              0,              0,            427,              1,              0,           3312,              0,              0,              0,
         6,        2846845,        2369640,        2762339,          17966,           9921,                 0,              0,                 0,              0,              0,            546,              1,              0,           3662,              0,              0,              0,
         7,        2845439,        2370609,        2763519,          18273,           7462,                 0,              0,                 0,              0,              0,            840,              1,              0,           3641,              0,              0,              0,
         8,        2577073,         806326,        2499059,          12730,           6901,                 0,              0,                 0,              0,              0,            519,              1,              0,           3898,              0,              0,              0,
         9,        6376776,        5005757,        6294115,          16131,           8388,                 0,              0,                 0,              0,              0,            792,              1,              0,           3968,              0,              0,              0,
        10,        2573611,         807396,        2495592,          12208,           6995,                 0,              0,                 0,              0,              0,            525,              1,              0,           3145,              0,              0,              0,
        11,        6425345,        5000760,        6341592,          18177,           7951,                 0,              0,                 0,              0,              0,            586,              1,              0,           3970,              0,              0,              0,
        12,        2566567,         804956,        2488945,          12107,           8185,                 0,              0,                 0,              0,              0,            924,              1,              0,           3106,              0,              0,              0,
        13,        6447402,        5021880,        6363661,          18140,           8929,                 0,              0,                 0,              0,              0,            681,              1,              0,           4085,              0,              0,              0,
        14,       11093462,       10871772,       11019555,           8698,           7047,                 0,              0,                 0,              0,              0,            632,              1,              0,           2858,              0,              0,              0,
        15,          56497,              0,              0,              0,              0,                 0,              0,                 0,              0,              0,              0,              1,              0,              0,              0,              0,              0,
        16,         168293,          50079,          94965,           8524,           8057,                 0,              0,                 0,              0,              0,            535,              1,              0,           2702,              0,              0,              0,
        17,         214305,          56206,         140114,          10962,           6044,                 0,              0,                 0,              0,              0,            438,              1,              0,           3006,              0,              0,              0,
        18,         169068,          50453,          95853,           8953,           7200,                 0,              0,                 0,              0,              0,            645,              1,              0,           3003,              0,              0,              0,
        19,         209307,          56862,         135228,          10667,           6085,                 0,              0,                 0,              0,              0,            315,              1,              0,           2922,              0,              0,              0,
        20,          56425,              0,              0,              0,              0,                 0,              0,                 0,              0,              0,              0,              1,              0,              0,              0,              0,              0,
        21,         170512,          46739,          91740,           8738,          13147,                 0,              0,                 0,              0,              0,            787,              1,              0,           3049,              0,              0,              0,
        22,         213584,          54903,         138268,          10870,           8332,                 0,              0,                 0,              0,              0,            531,              1,              0,           3136,              0,              0,              0,
        23,         170391,          46721,          91484,           8742,          13446,                 0,              0,                 0,              0,              0,            839,              1,              0,           3033,              0,              0,              0,
        24,         209842,          56768,         135900,          10895,           6002,                 0,              0,                 0,              0,              0,            410,              1,              0,           2914,              0,              0,              0,
        25,          56676,              0,              0,              0,              0,                 0,              0,                 0,              0,              0,              0,              1,              0,              0,              0,              0,              0,
        26,         170069,          46775,          92735,           8962,          12661,                 0,              0,                 0,              0,              0,            573,              1,              0,           3003,              0,              0,              0,
        27,         213474,          55016,         138431,          10425,           8331,                 0,              0,                 0,              0,              0,            436,              1,              0,           2669,              0,              0,              0,
        28,         170677,          46728,          91650,           8676,          13518,                 0,              0,                 0,              0,              0,            888,              1,              0,           3089,              0,              0,              0,
        29,         209486,          56943,         136166,          10301,           6204,                 0,              0,                 0,              0,              0,            547,              1,              0,           2830,              0,              0,              0,
        30,          56617,              0,              0,              0,              0,                 0,              0,                 0,              0,              0,              0,              1,              0,              0,              0,              0,              0,
        31,         170540,          47331,          92827,           8453,          12469,                 0,              0,                 0,              0,              0,            542,              1,              0,           2984,              0,              0,              0,
        32,         213592,          54997,         138069,          10925,           8337,                 0,              0,                 0,              0,              0,            437,              1,              0,           2958,              0,              0,              0,
        33,         169468,          46776,          91540,           8967,          12813,                 0,              0,                 0,              0,              0,            492,              1,              0,           3083,              0,              0,              0,
        34,         210081,          56856,         135566,          10715,           6051,                 0,              0,                 0,              0,              0,            522,              1,              0,           2863,              0,              0,              0,
        35,          56759,              0,              0,              0,              0,                 0,              0,                 0,              0,              0,              0,              1,              0,              0,              0,              0,              0,
        36,         170956,          46898,          93168,           8778,          13077,                 0,              0,                 0,              0,              0,            573,              1,              0,           3027,              0,              0,              0,
        37,         213770,          54812,         138718,          10774,           8301,                 0,              0,                 0,              0,              0,            573,              1,              0,           2983,              0,              0,              0,
        38,         170492,          46681,          91779,           8887,          12925,                 0,              0,                 0,              0,              0,            582,              1,              0,           3052,              0,              0,              0,
        39,         209583,          56904,         135599,          10702,           6323,                 0,              0,                 0,              0,              0,            591,              1,              0,           2880,              0,              0,              0,
        40,          56155,              0,              0,              0,              0,                 0,              0,                 0,              0,              0,              0,              1,              0,              0,              0,              0,              0,
        41,         170971,          46743,          93042,           8989,          13069,                 0,              0,                 0,              0,              0,            424,              1,              0,           3027,              0,              0,              0,
        42,         213539,          54844,         138381,          10923,           8278,                 0,              0,                 0,              0,              0,            634,              1,              0,           3065,              0,              0,              0,
        43,         170907,          46738,          92092,           8934,          13285,                 0,              0,                 0,              0,              0,            427,              1,              0,           2984,              0,              0,              0,
        44,         210432,          56759,         135806,          10726,           6051,                 0,              0,                 0,              0,              0,            896,              1,              0,           2550,              0,              0,              0,
        45,          55868,              0,              0,              0,              0,                 0,              0,                 0,              0,              0,              0,              1,              0,              0,              0,              0,              0,
        46,         169675,          46701,          93444,           8877,          12591,                 0,              0,                 0,              0,              0,            967,              1,              0,           3027,              0,              0,              0,
        47,         214927,          54800,         138934,          11165,           8292,                 0,              0,                 0,              0,              0,            503,              1,              0,           3065,              0,              0,              0,
        48,         171364,          46727,          92340,           8495,          13523,                 0,              0,                 0,              0,              0,            589,              1,              0,           2954,              0,              0,              0,
        49,         209215,          56908,         136025,          10509,           5842,                 0,              0,                 0,              0,              0,            432,              1,              0,           2941,              0,              0,              0,
    Done
    Closing
        50,          56928,              0,              0,              0,              0,                 0,              0,                 0,              0,              0,              0,              1,              0,              0,              0,              0,              0,
        51,         170399,          46756,          92457,           8713,          12504,                 0,              0,                 0,              0,              0,            385,              1,              0,           3027,              0,              0,              0,
        52,         166019,          54845,          93367,           8406,           7997,                 0,              0,                 0,              0,              0,           1042,              1,              0,           2715,              0,              0,              0,
        53,        4116545,        3987502,        4045874,           8722,           2452,                 0,              0,                 0,              0,              0,            329,              1,              0,           3097,              0,              0,              0,
     Sum of Layer Cycles 64765985 
    Sub Graph Stats 2510.000000 79665.000000 4600.000000 
    *******  TIDL_subgraphRtInvoke done  ******** 
    ************ in ~tidlDelegate ************ 
     ************ in TIDL_subgraphRtDelete ************ 
        487.137905 s:  VX_ZONE_INIT:[tivxHostDeInitLocal:120] De-Initialization Done for HOST !!!
       487.142404 s:  VX_ZONE_INIT:[tivxDeInitLocal:206] De-Initialization Done !!!
    APP: Deinit ... !!!
    REMOTE_SERVICE: Deinit ... !!!
    REMOTE_SERVICE: Deinit ... Done !!!
       487.142817 s: IPC: Deinit ... !!!
       487.143283 s: IPC: DeInit ... Done !!!
       487.143303 s: MEM: Deinit ... !!!
       487.143310 s: DDR_SHARED_MEM: Alloc's: 7 alloc's of 10504776 bytes 
       487.143314 s: DDR_SHARED_MEM: Free's : 7 free's  of 10504776 bytes 
       487.143317 s: DDR_SHARED_MEM: Open's : 0 allocs  of 0 bytes 
       487.143326 s: MEM: Deinit ... Done !!!
    APP: Deinit ... Done !!!
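
To make the per-layer table in a log like this easier to read, here is a minimal sketch that extracts the "Layer Cycles" column and ranks the most expensive layers. It assumes each data row starts with two comma-separated integers, as in the log above. The clock frequency passed to `cycles_to_ms` is a placeholder assumption, not a value taken from this thread, and all helper names are hypothetical.

```python
def parse_layer_cycles(log_text):
    """Return (layer_index, layer_cycles) for each data row of the profiling table."""
    rows = []
    for line in log_text.splitlines():
        fields = [field.strip() for field in line.split(",")]
        # Skip headers, blank lines, and anything not starting with two integers.
        if len(fields) < 3 or not fields[0].isdigit() or not fields[1].isdigit():
            continue
        rows.append((int(fields[0]), int(fields[1])))
    return rows


def top_layers(rows, n=3):
    """Return the n rows with the highest cycle counts, highest first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


def cycles_to_ms(cycles, clock_hz):
    """Convert a cycle count to milliseconds for an assumed clock frequency."""
    return cycles / clock_hz * 1e3


# A few rows copied from the log above.
sample = """
     Layer,   Layer Cycles,kernelOnlyCycles,
         1,        6664704,        6524641,
         2,         175751,          48791,
        14,       11093462,       10871772,
"""
rows = parse_layer_cycles(sample)
print(top_layers(rows, 2))
# 1 GHz is a placeholder; substitute the actual accelerator clock.
print(cycles_to_ms(64765985, 1_000_000_000))
```

The ranking only shows where the cycles go. It says nothing about whether the output values are correct.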
    

    The log generated by vx_app_arm_remote_log.out running in the background

    remote_log.txt
    [MCU1_0]    330.414477 s: CIO: Init ... Done !!!
    [MCU1_0]    330.414534 s: APP: Init ... !!!
    [MCU1_0]    330.414554 s: MEM: Init ... !!!
    [MCU1_0]    330.414571 s: MEM: Created heap (DDR_LOCAL_MEM, id=0, flags=0x00000004) @ af000000 of size 16777216 bytes !!!
    [MCU1_0]    330.414626 s: MEM: Init ... Done !!!
    [MCU1_0]    330.414643 s: IPC: Init ... !!!
    [MCU1_0]    330.414660 s: IPC: 3 CPUs participating in IPC !!!
    [MCU1_0]    330.414962 s: IPC: Waiting for HLOS to be ready ... !!!
    [MCU1_0]    330.419288 s: Sciserver Version: v2023.11.0.0REL.MCUSDK.MM.NN.PP.bb
    [MCU1_0]    330.422061 s: ##RM_PM_HAL Version: vMM.NN.PP
    [MCU1_0]    330.425008 s: ##Starting Sciserver..... PASSED
    [MCU1_0]    344.084579 s: IPC: HLOS is ready !!!
    [MCU1_0]    344.084664 s: IPC: Init ... Done !!!
    [MCU1_0]    344.084688 s: APP: Syncing with 2 CPUs ... !!!
    [MCU1_0]    344.084711 s: APP: Syncing with 2 CPUs ... Done !!!
    [MCU1_0]    344.084730 s: REMOTE_SERVICE: Init ... !!!
    [MCU1_0]    344.084803 s: REMOTE_SERVICE: Init ... Done !!!
    [MCU1_0]    344.084826 s: FVID2: Init ... !!!
    [MCU1_0]    344.084854 s: FVID2: Init ... Done !!!
    [MCU1_0]    344.084872 s: VHWA: VPAC Init ... !!!
    [MCU1_0]    344.084887 s: SCICLIENT: Sciclient_pmSetModuleState module=219 state=2
    [MCU1_0]    344.084971 s: SCICLIENT: Sciclient_pmSetModuleState success
    [MCU1_0]    344.085103 s: VHWA: LDC Init ... !!!
    [MCU1_0]    344.085199 s: VHWA: LDC Init ... Done !!!
    [MCU1_0]    344.085223 s: VHWA: MSC Init ... !!!
    [MCU1_0]    344.085818 s: VHWA: MSC Init ... Done !!!
    [MCU1_0]    344.085843 s: VHWA: VISS Init ... !!!
    [MCU1_0]    344.086649 s: VHWA: VISS Init ... Done !!!
    [MCU1_0]    344.086678 s: VHWA: VPAC Init ... Done !!!
    [MCU1_0]    344.086746 s:  VX_ZONE_INIT:Enabled
    [MCU1_0]    344.086764 s:  VX_ZONE_ERROR:Enabled
    [MCU1_0]    344.086780 s:  VX_ZONE_WARNING:Enabled
    [MCU1_0]    344.087719 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target MCU1-0 
    [MCU1_0]    344.087801 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target VPAC_LDC1 
    [MCU1_0]    344.087872 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target VPAC_MSC1 
    [MCU1_0]    344.087941 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target VPAC_MSC2 
    [MCU1_0]    344.088015 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target VPAC_VISS1 
    [MCU1_0]    344.088047 s:  VX_ZONE_INIT:[tivxInitLocal:136] Initialization Done !!!
    [MCU1_0]    344.088069 s: APP: OpenVX Target kernel init ... !!!
    [MCU1_0]    344.092562 s: APP: OpenVX Target kernel init ... Done !!!
    [MCU1_0]    344.092589 s: VISS REMOTE SERVICE: Init ... !!!
    [MCU1_0]    344.092632 s: VISS REMOTE SERVICE: Init ... Done !!!
    [MCU1_0]    344.092651 s: APP: Init ... Done !!!
    [MCU1_0]    344.092668 s: APP: Run ... !!!
    [MCU1_0]    344.092682 s: IPC: Starting echo test ...
    [MCU1_0]    344.092796 s: APP: Run ... Done !!!
    [MCU1_0]    344.093630 s: IPC: Echo status: a530-0[.] r5f0-0[s] c75ss0[P] 
    [C7x_1 ]    336.235545 s: CIO: Init ... Done !!!
    [C7x_1 ]    336.235565 s: APP: Init ... !!!
    [C7x_1 ]    336.235577 s: SCICLIENT: Init ... !!!
    [C7x_1 ]    336.235638 s: SCICLIENT: DMSC FW version [10.0.8--v10.00.08 (Fiery Fox)]
    [C7x_1 ]    336.235657 s: SCICLIENT: DMSC FW revision 0xa  
    [C7x_1 ]    336.235673 s: SCICLIENT: DMSC FW ABI revision 4.0
    [C7x_1 ]    336.235689 s: SCICLIENT: Init ... Done !!!
    [C7x_1 ]    336.235703 s: UDMA: Init ... !!!
    [C7x_1 ]    336.235716 s: UDMA: Init ... Done !!!
    [C7x_1 ]    336.235728 s: MEM: Init ... !!!
    [C7x_1 ]    336.235742 s: MEM: Created heap (DDR_LOCAL_MEM, id=0, flags=0x00000004) @ b2000000 of size 117440512 bytes !!!
    [C7x_1 ]    336.235770 s: MEM: Init ... Done !!!
    [C7x_1 ]    336.235782 s: IPC: Init ... !!!
    [C7x_1 ]    336.235795 s: IPC: 3 CPUs participating in IPC !!!
    [C7x_1 ]    336.235996 s: IPC: Waiting for HLOS to be ready ... !!!
    [C7x_1 ]    343.716544 s: IPC: HLOS is ready !!!
    [C7x_1 ]    343.716617 s: IPC: Init ... Done !!!
    [C7x_1 ]    343.716631 s: APP: Syncing with 2 CPUs ... !!!
    [C7x_1 ]    344.084713 s: APP: Syncing with 2 CPUs ... Done !!!
    [C7x_1 ]    344.084731 s: REMOTE_SERVICE: Init ... !!!
    [C7x_1 ]    344.085571 s: REMOTE_SERVICE: Init ... Done !!!
    [C7x_1 ]    344.085600 s:  VX_ZONE_INIT:Enabled
    [C7x_1 ]    344.085618 s:  VX_ZONE_ERROR:Enabled
    [C7x_1 ]    344.085636 s:  VX_ZONE_WARNING:Enabled
    [C7x_1 ]    344.086360 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1 
    [C7x_1 ]    344.086455 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_2 
    [C7x_1 ]    344.086554 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_3 
    [C7x_1 ]    344.086648 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_4 
    [C7x_1 ]    344.086740 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_5 
    [C7x_1 ]    344.086833 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_6 
    [C7x_1 ]    344.086926 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_7 
    [C7x_1 ]    344.087020 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_8 
    [C7x_1 ]    344.087047 s:  VX_ZONE_INIT:[tivxInitLocal:136] Initialization Done !!!
    [C7x_1 ]    344.087065 s: APP: OpenVX Target kernel init ... !!!
    [C7x_1 ]    344.087271 s: APP: OpenVX Target kernel init ... Done !!!
    [C7x_1 ]    344.087288 s: APP: Init ... Done !!!
    [C7x_1 ]    344.087301 s: APP: Run ... !!!
    [C7x_1 ]    344.087313 s: IPC: Starting echo test ...
    [C7x_1 ]    344.087410 s: APP: Run ... Done !!!
    [C7x_1 ]    344.093711 s: IPC: Echo status: a530-0[.] r5f0-0[P] c75ss0[s] 
    [C7x_1 ]    486.885845 s: PREEMPTION: Requesting memory of size 1048576 for targetPriority = 0
    [C7x_1 ]    486.885873 s: 
    [C7x_1 ]    486.885894 s: --------------------------------------------
    [C7x_1 ]    486.885938 s: TIDL Memory size requiement (record wise):
    [C7x_1 ]    486.885985 s: MemRecNum   , Space               , Attribute   , Alignment   , Size(KBytes), BasePtr     
    [C7x_1 ]    486.886039 s: 0           , DDR Cacheable       , Persistent  ,  128, 19.27   , 0x00000000
    [C7x_1 ]    486.886090 s: 1           , DDR Cacheable       , Persistent  ,  128, 0.65    , 0x00000000
    [C7x_1 ]    486.886141 s: 2           , L1D                 , Scratch     ,  128, 16.00   , 0x00000000
    [C7x_1 ]    486.886189 s: 3           , L2                  , Scratch     ,  128, 224.00  , 0x00000000
    [C7x_1 ]    486.886239 s: 4           , L3/MSMC             , Scratch     ,  128, 1024.00 , 0x00000000
    [C7x_1 ]    486.886289 s: 5           , DDR Cacheable       , Persistent  ,  128, 683.41  , 0x00000000
    [C7x_1 ]    486.886338 s: 6           , DDR Cacheable       , Scratch     ,  128, 6.88    , 0x00000000
    [C7x_1 ]    486.886388 s: 7           , DDR Cacheable       , Persistent  ,  128, 52292.00, 0x00000000
    [C7x_1 ]    486.886438 s: 8           , DDR Cacheable       , Scratch     ,  128, 0.13    , 0x00000000
    [C7x_1 ]    486.886487 s: 9           , DDR Cacheable       , Scratch     ,  128, 3.13    , 0x00000000
    [C7x_1 ]    486.886536 s: 10          , DDR Cacheable       , Persistent  ,  128, 747.55  , 0x00000000
    [C7x_1 ]    486.886586 s: 11          , DDR Cacheable       , Scratch     ,  128, 512.25  , 0x00000000
    [C7x_1 ]    486.886634 s: 12          , DDR Cacheable       , Persistent  ,  128, 1024.00 , 0x00000000
    [C7x_1 ]    486.886684 s: 13          , DDR Cacheable       , Persistent  ,  128, 1421.64 , 0x00000000
    [C7x_1 ]    486.886734 s: 14          , DDR Cacheable       , Persistent  ,  128, 0.00    , 0x00000000
    [C7x_1 ]    486.886783 s: 15          , DDR Cacheable       , Persistent  ,  128, 78.50   , 0x00000000
    [C7x_1 ]    486.886818 s: --------------------------------------------
    [C7x_1 ]    486.886848 s: Total memory size requirement (space wise):
    [C7x_1 ]    486.886872 s: Mem Space , Size(KBytes)
    [C7x_1 ]    486.886895 s: L1D       , 16.00   
    [C7x_1 ]    486.886924 s: L2        , 224.00  
    [C7x_1 ]    486.886947 s: L3/MSMC   , 1024.00 
    [C7x_1 ]    486.886972 s: DDR Cacheable, 56789.40
    [C7x_1 ]    486.887000 s: --------------------------------------------
    [C7x_1 ]    486.887040 s: NOTE: Memory requirement in host emulation can be different from the same on EVM
    [C7x_1 ]    486.887084 s:       To get the actual TIDL memory requirement make sure to run on EVM with 
    [C7x_1 ]    486.887111 s:       debugTraceLevel = 2
    [C7x_1 ]    486.887123 s: 
    [C7x_1 ]    486.887146 s: --------------------------------------------
    [C7x_1 ]    486.890380 s: TIDL init call from ivision API 
    [C7x_1 ]    486.890400 s: 
    [C7x_1 ]    486.890420 s: --------------------------------------------
    [C7x_1 ]    486.890449 s: TIDL Memory size requiement (record wise):
    [C7x_1 ]    486.890493 s: MemRecNum   , Space               , Attribute   , Alignment   , Size(KBytes), BasePtr     
    [C7x_1 ]    486.890545 s: 0           , DDR Cacheable       , Persistent  ,  128, 19.27   , 0xb2026cc0
    [C7x_1 ]    486.890596 s: 1           , DDR Cacheable       , Persistent  ,  128, 0.65    , 0xb202ba80
    [C7x_1 ]    486.890645 s: 2           , L1D                 , Scratch     ,  128, 16.00   , 0x7f03c000
    [C7x_1 ]    486.890694 s: 3           , L2                  , Scratch     ,  128, 224.00  , 0x7f000000
    [C7x_1 ]    486.890743 s: 4           , L3/MSMC             , Scratch     ,  128, 1024.00 , 0x7e000000
    [C7x_1 ]    486.890791 s: 5           , DDR Cacheable       , Persistent  ,  128, 683.41  , 0xb202bdc0
    [C7x_1 ]    486.890841 s: 6           , DDR Cacheable       , Scratch     ,  128, 6.88    , 0xb9000000
    [C7x_1 ]    486.890890 s: 7           , DDR Cacheable       , Persistent  ,  128, 52292.00, 0xb20d6c00
    [C7x_1 ]    486.890948 s: 8           , DDR Cacheable       , Scratch     ,  128, 0.13    , 0xb9001c00
    [C7x_1 ]    486.890997 s: 9           , DDR Cacheable       , Scratch     ,  128, 3.13    , 0xb9002000
    [C7x_1 ]    486.891048 s: 10          , DDR Cacheable       , Persistent  ,  128, 747.55  , 0xb53e7c40
    [C7x_1 ]    486.891097 s: 11          , DDR Cacheable       , Scratch     ,  128, 512.25  , 0xb9003000
    [C7x_1 ]    486.891146 s: 12          , DDR Cacheable       , Persistent  ,  128, 1024.00 , 0xb54a2b00
    [C7x_1 ]    486.891196 s: 13          , DDR Cacheable       , Persistent  ,  128, 1421.64 , 0xb55a2b40
    [C7x_1 ]    486.891246 s: 14          , DDR Cacheable       , Persistent  ,  128, 0.00    , 0xb5706280
    [C7x_1 ]    486.891296 s: 15          , DDR Cacheable       , Persistent  ,  128, 78.50   , 0xb5706340
    [C7x_1 ]    486.891330 s: --------------------------------------------
    [C7x_1 ]    486.891360 s: Total memory size requirement (space wise):
    [C7x_1 ]    486.891384 s: Mem Space , Size(KBytes)
    [C7x_1 ]    486.891407 s: L1D       , 16.00   
    [C7x_1 ]    486.891429 s: L2        , 224.00  
    [C7x_1 ]    486.891451 s: L3/MSMC   , 1024.00 
    [C7x_1 ]    486.891475 s: DDR Cacheable, 56789.40
    [C7x_1 ]    486.891501 s: --------------------------------------------
    [C7x_1 ]    486.891541 s: NOTE: Memory requirement in host emulation can be different from the same on EVM
    [C7x_1 ]    486.891584 s:       To get the actual TIDL memory requirement make sure to run on EVM with 
    [C7x_1 ]    486.891611 s:       debugTraceLevel = 2
    [C7x_1 ]    486.891623 s: 
    [C7x_1 ]    486.891645 s: --------------------------------------------
    [C7x_1 ]    486.895690 s: Alg Init for Layer # -    1
    [C7x_1 ]    486.895803 s: Alg Init for Layer # -    2
    [C7x_1 ]    486.895912 s: Alg Init for Layer # -    3
    [C7x_1 ]    486.896019 s: Alg Init for Layer # -    4
    [C7x_1 ]    486.896408 s: Alg Init for Layer # -    5
    [C7x_1 ]    486.897735 s: Alg Init for Layer # -    6
    [C7x_1 ]    486.899068 s: Alg Init for Layer # -    7
    [C7x_1 ]    486.900396 s: Alg Init for Layer # -    8
    [C7x_1 ]    486.900521 s: Alg Init for Layer # -    9
    [C7x_1 ]    486.904487 s: Alg Init for Layer # -   10
    [C7x_1 ]    486.904609 s: Alg Init for Layer # -   11
    [C7x_1 ]    486.908582 s: Alg Init for Layer # -   12
    [C7x_1 ]    486.908708 s: Alg Init for Layer # -   13
    [C7x_1 ]    486.912675 s: Alg Init for Layer # -   14
    [C7x_1 ]    486.912783 s: Alg Init for Layer # -   15
    [C7x_1 ]    486.912850 s: Alg Init for Layer # -   16
    [C7x_1 ]    486.912965 s: Alg Init for Layer # -   17
    [C7x_1 ]    486.913075 s: Alg Init for Layer # -   18
    [C7x_1 ]    486.913176 s: Alg Init for Layer # -   19
    [C7x_1 ]    486.913286 s: Alg Init for Layer # -   20
    [C7x_1 ]    486.913349 s: Alg Init for Layer # -   21
    [C7x_1 ]    486.913445 s: Alg Init for Layer # -   22
    [C7x_1 ]    486.913550 s: Alg Init for Layer # -   23
    [C7x_1 ]    486.913648 s: Alg Init for Layer # -   24
    [C7x_1 ]    486.913754 s: Alg Init for Layer # -   25
    [C7x_1 ]    486.913817 s: Alg Init for Layer # -   26
    [C7x_1 ]    486.913923 s: Alg Init for Layer # -   27
    [C7x_1 ]    486.914032 s: Alg Init for Layer # -   28
    [C7x_1 ]    486.914135 s: Alg Init for Layer # -   29
    [C7x_1 ]    486.914243 s: Alg Init for Layer # -   30
    [C7x_1 ]    486.914306 s: Alg Init for Layer # -   31
    [C7x_1 ]    486.914401 s: Alg Init for Layer # -   32
    [C7x_1 ]    486.914509 s: Alg Init for Layer # -   33
    [C7x_1 ]    486.914610 s: Alg Init for Layer # -   34
    [C7x_1 ]    486.914718 s: Alg Init for Layer # -   35
    [C7x_1 ]    486.914782 s: Alg Init for Layer # -   36
    [C7x_1 ]    486.914883 s: Alg Init for Layer # -   37
    [C7x_1 ]    486.915000 s: Alg Init for Layer # -   38
    [C7x_1 ]    486.915101 s: Alg Init for Layer # -   39
    [C7x_1 ]    486.915209 s: Alg Init for Layer # -   40
    [C7x_1 ]    486.915273 s: Alg Init for Layer # -   41
    [C7x_1 ]    486.915373 s: Alg Init for Layer # -   42
    [C7x_1 ]    486.915478 s: Alg Init for Layer # -   43
    [C7x_1 ]    486.915583 s: Alg Init for Layer # -   44
    [C7x_1 ]    486.915694 s: Alg Init for Layer # -   45
    [C7x_1 ]    486.915758 s: Alg Init for Layer # -   46
    [C7x_1 ]    486.915860 s: Alg Init for Layer # -   47
    [C7x_1 ]    486.915983 s: Alg Init for Layer # -   48
    [C7x_1 ]    486.916085 s: Alg Init for Layer # -   49
    [C7x_1 ]    486.916195 s: Alg Init for Layer # -   50
    [C7x_1 ]    486.916258 s: Alg Init for Layer # -   51
    [C7x_1 ]    486.916359 s: Alg Init for Layer # -   52
    [C7x_1 ]    486.916460 s: Alg Init for Layer # -   53
    [C7x_1 ]    486.916602 s: PREEMPTION: Adding a new priority object for targetPriority = 0, handle = b2026cc0
    [C7x_1 ]    486.916668 s: PREEMPTION: Now total number of priority objects = 1 at priorityId = 0,    with new memRec of base = b54a2b00 and size = 1048576
    [C7x_1 ]    486.916734 s: PREEMPTION: Requesting context memory addr for handle b2026cc0, return Addr = 99de4f10
    [C7x_1 ]    486.916770 s: Print preEmption Hnadle during init stage :
    [C7x_1 ]    486.916799 s: Procsize,      ctxSize,       lyrIdx
    [C7x_1 ]    486.916831 s: 0.000,         7304,            0
    [C7x_1 ]    486.916861 s: 1.005,         7304,            1
    [C7x_1 ]    486.916891 s: 0.223,       731144,            2
    [C7x_1 ]    486.916928 s: 0.250,         7304,            3
    [C7x_1 ]    486.916960 s: 1.226,         7304,            4
    [C7x_1 ]    486.916990 s: 2.427,         7304,            5
    [C7x_1 ]    486.917019 s: 2.427,         7304,            6
    [C7x_1 ]    486.917048 s: 2.427,         7304,            7
    [C7x_1 ]    486.917077 s: 4.536,         7304,            8
    [C7x_1 ]    486.917105 s: 4.855,         7304,            9
    [C7x_1 ]    486.917134 s: 4.536,         7304,           10
    [C7x_1 ]    486.917163 s: 4.858,         7304,           11
    [C7x_1 ]    486.917192 s: 4.536,         7304,           12
    [C7x_1 ]    486.917222 s: 4.855,         7304,           13
    [C7x_1 ]    486.917251 s: 1.703,         7304,           14
    [C7x_1 ]    486.917278 s: 0.002,         7304,           15
    [C7x_1 ]    486.917308 s: 0.223,       731144,           16
    [C7x_1 ]    486.917336 s: 0.250,         7304,           17
    [C7x_1 ]    486.917365 s: 0.223,       731144,           18
    [C7x_1 ]    486.917394 s: 0.250,         7304,           19
    [C7x_1 ]    486.917423 s: 0.002,         7304,           20
    [C7x_1 ]    486.917453 s: 0.223,       731144,           21
    [C7x_1 ]    486.917482 s: 0.250,         7304,           22
    [C7x_1 ]    486.917511 s: 0.223,       731144,           23
    [C7x_1 ]    486.917541 s: 0.250,         7304,           24
    [C7x_1 ]    486.917569 s: 0.002,         7304,           25
    [C7x_1 ]    486.917598 s: 0.223,       731144,           26
    [C7x_1 ]    486.917627 s: 0.250,         7304,           27
    [C7x_1 ]    486.917656 s: 0.223,       731144,           28
    [C7x_1 ]    486.917685 s: 0.250,         7304,           29
    [C7x_1 ]    486.917714 s: 0.002,         7304,           30
    [C7x_1 ]    486.917743 s: 0.223,       731144,           31
    [C7x_1 ]    486.917772 s: 0.250,         7304,           32
    [C7x_1 ]    486.917802 s: 0.223,       731144,           33
    [C7x_1 ]    486.917830 s: 0.250,         7304,           34
    [C7x_1 ]    486.917858 s: 0.002,         7304,           35
    [C7x_1 ]    486.917888 s: 0.223,       731144,           36
    [C7x_1 ]    486.917922 s: 0.250,         7304,           37
    [C7x_1 ]    486.917952 s: 0.223,       731144,           38
    [C7x_1 ]    486.917981 s: 0.250,         7304,           39
    [C7x_1 ]    486.918009 s: 0.002,         7304,           40
    [C7x_1 ]    486.918038 s: 0.223,       731144,           41
    [C7x_1 ]    486.918067 s: 0.250,         7304,           42
    [C7x_1 ]    486.918096 s: 0.223,       731144,           43
    [C7x_1 ]    486.918125 s: 0.250,         7304,           44
    [C7x_1 ]    486.918154 s: 0.002,         7304,           45
    [C7x_1 ]    486.918182 s: 0.223,       731144,           46
    [C7x_1 ]    486.918211 s: 0.250,         7304,           47
    [C7x_1 ]    486.918240 s: 0.223,       731144,           48
    [C7x_1 ]    486.918269 s: 0.250,         7304,           49
    [C7x_1 ]    486.918297 s: 0.002,         7304,           50
    [C7x_1 ]    486.918326 s: 0.223,       731144,           51
    [C7x_1 ]    486.918356 s: 0.140,       731144,           52
    [C7x_1 ]    486.918384 s: 0.474,         7304,           53
    [C7x_1 ]    486.918414 s: 0.000,            0,           54
    [C7x_1 ]    486.918488 s: TIDL_initializeHandleForPreemption is completed 
    [C7x_1 ]    486.959109 s: TIDL_process is started with handle : b2026cc0 
    [C7x_1 ]    486.959148 s: TIDL_activate is called with handle : b2026cc0 
    [C7x_1 ]    486.959240 s: Layer_idx,procTime,ctxSize-total,preEmptLayerId,(int_ctxt_ptr|ext_ctxt_ptr|ctxtSize-part)
    [C7x_1 ]    486.959285 s:        0,0.00000,    7304,      -1
    [C7x_1 ]    486.959336 s: Core 0 Alg Process for Layer # -    1, layer type 29
    [C7x_1 ]    486.959370 s: Processing Layer # -    1
    [C7x_1 ]    486.967177 s: Core 0 End of Layer # -    1 with outPtrs[0] = b20d6c00
    [C7x_1 ]    486.967219 s: Core 0 Alg Process for Layer # -    2, layer type 5
    [C7x_1 ]    486.967251 s: Processing Layer # -    2
    [C7x_1 ]    486.967427 s: Core 0 End of Layer # -    2 with outPtrs[0] = 7e000000
    [C7x_1 ]    486.967469 s: Core 0 Alg Process for Layer # -    3, layer type 5
    [C7x_1 ]    486.967500 s: Processing Layer # -    3
    [C7x_1 ]    486.967721 s: Core 0 End of Layer # -    3 with outPtrs[0] = b2186c80
    [C7x_1 ]    486.967761 s: Core 0 Alg Process for Layer # -    4, layer type 1
    [C7x_1 ]    486.967792 s: Processing Layer # -    4
    [C7x_1 ]    486.968776 s: Core 0 End of Layer # -    4 with outPtrs[0] = b2255700
    [C7x_1 ]    486.968817 s: Core 0 Alg Process for Layer # -    5, layer type 1
    [C7x_1 ]    486.968848 s: Processing Layer # -    5
    [C7x_1 ]    486.972168 s: Core 0 End of Layer # -    5 with outPtrs[0] = b2a06300
    [C7x_1 ]    486.972209 s: Core 0 Alg Process for Layer # -    6, layer type 1
    [C7x_1 ]    486.972240 s: Processing Layer # -    6
    [C7x_1 ]    486.975560 s: Core 0 End of Layer # -    6 with outPtrs[0] = b316ff00
    [C7x_1 ]    486.975601 s: Core 0 Alg Process for Layer # -    7, layer type 1
    [C7x_1 ]    486.975631 s: Processing Layer # -    7
    [C7x_1 ]    486.978951 s: Core 0 End of Layer # -    7 with outPtrs[0] = b38d9b00
    [C7x_1 ]    486.978993 s: Core 0 Alg Process for Layer # -    8, layer type 12
    [C7x_1 ]    486.979024 s: Processing Layer # -    8
    [C7x_1 ]    486.982026 s: Core 0 End of Layer # -    8 with outPtrs[0] = b4043700
    [C7x_1 ]    486.982068 s: Core 0 Alg Process for Layer # -    9, layer type 1
    [C7x_1 ]    486.982099 s: Processing Layer # -    9
    [C7x_1 ]    486.989572 s: Core 0 End of Layer # -    9 with outPtrs[0] = b316ff00
    [C7x_1 ]    486.989612 s: Core 0 Alg Process for Layer # -   10, layer type 12
    [C7x_1 ]    486.989644 s: Processing Layer # -   10
    [C7x_1 ]    486.992643 s: Core 0 End of Layer # -   10 with outPtrs[0] = b38cd300
    [C7x_1 ]    486.992685 s: Core 0 Alg Process for Layer # -   11, layer type 1
    [C7x_1 ]    486.992715 s: Processing Layer # -   11
    [C7x_1 ]    487.000246 s: Core 0 End of Layer # -   11 with outPtrs[0] = b2a06300
    [C7x_1 ]    487.000288 s: Core 0 Alg Process for Layer # -   12, layer type 12
    [C7x_1 ]    487.000318 s: Processing Layer # -   12
    [C7x_1 ]    487.003309 s: Core 0 End of Layer # -   12 with outPtrs[0] = b3163700
    [C7x_1 ]    487.003351 s: Core 0 Alg Process for Layer # -   13, layer type 1
    [C7x_1 ]    487.003380 s: Processing Layer # -   13
    [C7x_1 ]    487.010937 s: Core 0 End of Layer # -   13 with outPtrs[0] = b2255700
    [C7x_1 ]    487.010979 s: Core 0 Alg Process for Layer # -   14, layer type 8
    [C7x_1 ]    487.011009 s: Processing Layer # -   14
    [C7x_1 ]    487.024031 s: Core 0 End of Layer # -   14 with outPtrs[0] = b27db700
    [C7x_1 ]    487.024073 s: Core 0 Alg Process for Layer # -   15, layer type 14
    [C7x_1 ]    487.024103 s: Processing Layer # -   15
    [C7x_1 ]    487.024140 s: Core 0 End of Layer # -   15 with outPtrs[0] = b27db700
    [C7x_1 ]    487.024181 s: Core 0 Alg Process for Layer # -   16, layer type 5
    [C7x_1 ]    487.024211 s: Processing Layer # -   16
    [C7x_1 ]    487.024380 s: Core 0 End of Layer # -   16 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.024421 s: Core 0 Alg Process for Layer # -   17, layer type 5
    [C7x_1 ]    487.024452 s: Processing Layer # -   17
    [C7x_1 ]    487.024676 s: Core 0 End of Layer # -   17 with outPtrs[0] = b2186c80
    [C7x_1 ]    487.024717 s: Core 0 Alg Process for Layer # -   18, layer type 5
    [C7x_1 ]    487.024748 s: Processing Layer # -   18
    [C7x_1 ]    487.024918 s: Core 0 End of Layer # -   18 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.024959 s: Core 0 Alg Process for Layer # -   19, layer type 5
    [C7x_1 ]    487.024990 s: Processing Layer # -   19
    [C7x_1 ]    487.025207 s: Core 0 End of Layer # -   19 with outPtrs[0] = b2255700
    [C7x_1 ]    487.025248 s: Core 0 Alg Process for Layer # -   20, layer type 14
    [C7x_1 ]    487.025278 s: Processing Layer # -   20
    [C7x_1 ]    487.025315 s: Core 0 End of Layer # -   20 with outPtrs[0] = b27db700
    [C7x_1 ]    487.025356 s: Core 0 Alg Process for Layer # -   21, layer type 5
    [C7x_1 ]    487.025387 s: Processing Layer # -   21
    [C7x_1 ]    487.025559 s: Core 0 End of Layer # -   21 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.025600 s: Core 0 Alg Process for Layer # -   22, layer type 5
    [C7x_1 ]    487.025630 s: Processing Layer # -   22
    [C7x_1 ]    487.025853 s: Core 0 End of Layer # -   22 with outPtrs[0] = b2255700
    [C7x_1 ]    487.025894 s: Core 0 Alg Process for Layer # -   23, layer type 5
    [C7x_1 ]    487.025925 s: Processing Layer # -   23
    [C7x_1 ]    487.026096 s: Core 0 End of Layer # -   23 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.026137 s: Core 0 Alg Process for Layer # -   24, layer type 5
    [C7x_1 ]    487.026168 s: Processing Layer # -   24
    [C7x_1 ]    487.026387 s: Core 0 End of Layer # -   24 with outPtrs[0] = b20d6c00
    [C7x_1 ]    487.026428 s: Core 0 Alg Process for Layer # -   25, layer type 14
    [C7x_1 ]    487.026459 s: Processing Layer # -   25
    [C7x_1 ]    487.026496 s: Core 0 End of Layer # -   25 with outPtrs[0] = b27db700
    [C7x_1 ]    487.026536 s: Core 0 Alg Process for Layer # -   26, layer type 5
    [C7x_1 ]    487.026566 s: Processing Layer # -   26
    [C7x_1 ]    487.026738 s: Core 0 End of Layer # -   26 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.026780 s: Core 0 Alg Process for Layer # -   27, layer type 5
    [C7x_1 ]    487.026810 s: Processing Layer # -   27
    [C7x_1 ]    487.027033 s: Core 0 End of Layer # -   27 with outPtrs[0] = b20d6c00
    [C7x_1 ]    487.027074 s: Core 0 Alg Process for Layer # -   28, layer type 5
    [C7x_1 ]    487.027105 s: Processing Layer # -   28
    [C7x_1 ]    487.027276 s: Core 0 End of Layer # -   28 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.027318 s: Core 0 Alg Process for Layer # -   29, layer type 5
    [C7x_1 ]    487.027348 s: Processing Layer # -   29
    [C7x_1 ]    487.027566 s: Core 0 End of Layer # -   29 with outPtrs[0] = b21a5680
    [C7x_1 ]    487.027608 s: Core 0 Alg Process for Layer # -   30, layer type 14
    [C7x_1 ]    487.027638 s: Processing Layer # -   30
    [C7x_1 ]    487.027676 s: Core 0 End of Layer # -   30 with outPtrs[0] = b27db700
    [C7x_1 ]    487.027716 s: Core 0 Alg Process for Layer # -   31, layer type 5
    [C7x_1 ]    487.027747 s: Processing Layer # -   31
    [C7x_1 ]    487.027918 s: Core 0 End of Layer # -   31 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.027959 s: Core 0 Alg Process for Layer # -   32, layer type 5
    [C7x_1 ]    487.027990 s: Processing Layer # -   32
    [C7x_1 ]    487.028212 s: Core 0 End of Layer # -   32 with outPtrs[0] = b21a5680
    [C7x_1 ]    487.028254 s: Core 0 Alg Process for Layer # -   33, layer type 5
    [C7x_1 ]    487.028284 s: Processing Layer # -   33
    [C7x_1 ]    487.028455 s: Core 0 End of Layer # -   33 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.028496 s: Core 0 Alg Process for Layer # -   34, layer type 5
    [C7x_1 ]    487.028527 s: Processing Layer # -   34
    [C7x_1 ]    487.028745 s: Core 0 End of Layer # -   34 with outPtrs[0] = b20d6c00
    [C7x_1 ]    487.028786 s: Core 0 Alg Process for Layer # -   35, layer type 14
    [C7x_1 ]    487.028817 s: Processing Layer # -   35
    [C7x_1 ]    487.028854 s: Core 0 End of Layer # -   35 with outPtrs[0] = b27db700
    [C7x_1 ]    487.028895 s: Core 0 Alg Process for Layer # -   36, layer type 5
    [C7x_1 ]    487.028925 s: Processing Layer # -   36
    [C7x_1 ]    487.029098 s: Core 0 End of Layer # -   36 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.029139 s: Core 0 Alg Process for Layer # -   37, layer type 5
    [C7x_1 ]    487.029169 s: Processing Layer # -   37
    [C7x_1 ]    487.029393 s: Core 0 End of Layer # -   37 with outPtrs[0] = b20d6c00
    [C7x_1 ]    487.029434 s: Core 0 Alg Process for Layer # -   38, layer type 5
    [C7x_1 ]    487.029465 s: Processing Layer # -   38
    [C7x_1 ]    487.029636 s: Core 0 End of Layer # -   38 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.029677 s: Core 0 Alg Process for Layer # -   39, layer type 5
    [C7x_1 ]    487.029708 s: Processing Layer # -   39
    [C7x_1 ]    487.029926 s: Core 0 End of Layer # -   39 with outPtrs[0] = b21a5680
    [C7x_1 ]    487.029967 s: Core 0 Alg Process for Layer # -   40, layer type 14
    [C7x_1 ]    487.029998 s: Processing Layer # -   40
    [C7x_1 ]    487.030034 s: Core 0 End of Layer # -   40 with outPtrs[0] = b27db700
    [C7x_1 ]    487.030075 s: Core 0 Alg Process for Layer # -   41, layer type 5
    [C7x_1 ]    487.030105 s: Processing Layer # -   41
    [C7x_1 ]    487.030278 s: Core 0 End of Layer # -   41 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.030319 s: Core 0 Alg Process for Layer # -   42, layer type 5
    [C7x_1 ]    487.030350 s: Processing Layer # -   42
    [C7x_1 ]    487.030573 s: Core 0 End of Layer # -   42 with outPtrs[0] = b21a5680
    [C7x_1 ]    487.030614 s: Core 0 Alg Process for Layer # -   43, layer type 5
    [C7x_1 ]    487.030644 s: Processing Layer # -   43
    [C7x_1 ]    487.030816 s: Core 0 End of Layer # -   43 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.030857 s: Core 0 Alg Process for Layer # -   44, layer type 5
    [C7x_1 ]    487.030889 s: Processing Layer # -   44
    [C7x_1 ]    487.031107 s: Core 0 End of Layer # -   44 with outPtrs[0] = b20d6c00
    [C7x_1 ]    487.031149 s: Core 0 Alg Process for Layer # -   45, layer type 14
    [C7x_1 ]    487.031178 s: Processing Layer # -   45
    [C7x_1 ]    487.031215 s: Core 0 End of Layer # -   45 with outPtrs[0] = b27db700
    [C7x_1 ]    487.031255 s: Core 0 Alg Process for Layer # -   46, layer type 5
    [C7x_1 ]    487.031285 s: Processing Layer # -   46
    [C7x_1 ]    487.031457 s: Core 0 End of Layer # -   46 with outPtrs[0] = 7e000000
    [C7x_1 ]    487.031499 s: Core 0 Alg Process for Layer # -   47, layer type 5
    [C7x_1 ]    487.031530 s: Processing Layer # -   47
    [C7x_1 ]    487.031754 s: Core 0 End of Layer 

    One thing I noticed is that the final print statements of the custom Python script were executed much earlier (as seen in line 101 of prog_log.txt). I believe the CPU writes to the output file before the C7x accelerator has finished processing the image. If that is the case, is there a way to keep the two processes in sync?

  • Hi Joel,

    These logs are great, thank you for supplying them.

    Your logs look ordinary to me -- I don't see anything concerning in them.

    One thing I noticed is that the final print statements of the custom Python script were executed much earlier (as seen in line 101 of prog_log.txt).

    This is ok. Multiple processes/threads are created in the background when running the TIDL inference call. These (plus the Python interpreter) may be pushing data to standard output (stdout) in parallel, so some of the lines end up interleaved. This does not indicate any issue; it is just an (unfortunate) side effect of parallel processing -- after all, there are 4 A53 cores on the AM62A EVM that jointly run the Linux OS.

    The prints you are seeing about layer cycles and such come from inside the TIDL software stack, and they do not start printing until inference has completed on the AI accelerator (C7x DSP core). Otherwise, printing would slow down the model inference itself and affect performance and any internal measurements.

    Regarding your accuracy:

    My first estimation is that your network is not responding well to quantization. TIDL by default uses 8-bit quantization to compute the neural network originally trained and composed of floating point values. Some networks respond worse to this, and a low-light enhancement model could be one of those. 

    These documents have some debugging suggestions and guidance:

    I typically recommend quantizing your model with tensor_bits set to 16 (the default is 8). Compile your model this way and check whether the accuracy is acceptable. It will slow down performance, but it is a useful check.
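
    A minimal sketch of the 16-bit check suggested above, assuming the compile script follows the edgeai-tidl-tools Python examples, where the delegate options are a plain dictionary. The helper name make_compile_options and both paths are hypothetical placeholders, and only the option keys mentioned in this thread or needed to locate the tools are set.

```python
import os


def make_compile_options(tidl_tools_path, artifacts_folder, tensor_bits=16):
    """Build a delegate option dictionary (hypothetical helper for illustration)."""
    # Only the two bit depths discussed in this thread are accepted here.
    if tensor_bits not in (8, 16):
        raise ValueError("tensor_bits must be 8 or 16")
    return {
        "tidl_tools_path": tidl_tools_path,
        "artifacts_folder": artifacts_folder,
        "tensor_bits": tensor_bits,
    }


options = make_compile_options(
    # Placeholder paths: replace with the locations used in your setup.
    tidl_tools_path=os.environ.get("TIDL_TOOLS_PATH", "/path/to/tidl_tools"),
    artifacts_folder="/path/to/artifacts_16bit",
    tensor_bits=16,
)
print(options)

# Assumption: as in the edgeai-tidl-tools TFLite examples, this dictionary is
# then passed to the import delegate when compiling on the host, e.g.
#   tflite.load_delegate(
#       os.path.join(tidl_tools_path, "tidl_model_import_tflite.so"), options)
```

    Writing the 16-bit artifacts to a separate folder keeps the existing 8-bit artifacts available for a side-by-side comparison.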

    Calibration is an important part of quantization as well. Basically, we run images through your network in floating point and collect statistics about the intermediate tensors. These statistics help us choose the quantization parameters, i.e. which part of the fp32 range is mapped onto the integer range (8 or 16 bit). The images used here matter because they influence the resulting statistics. Your model is built for low-light input, so calibrating with a well-lit image is probably not hitting the right activation ranges.
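
    To illustrate why the calibration images matter, here is a rough sketch using a simple symmetric max-based scale. This is not TIDL's actual calibration algorithm, and the uniform random arrays are synthetic stand-ins for activations, not real data. It only shows that a range taken from bright inputs leaves very few integer levels for dark inputs.

```python
import numpy as np


def symmetric_scale(calibration_values, bits=8):
    """Scale mapping the largest observed magnitude to the top of the signed range."""
    max_abs = float(np.max(np.abs(calibration_values)))
    qmax = 2 ** (bits - 1) - 1
    return max_abs / qmax


def quantize_dequantize(x, scale, bits=8):
    """Round to the integer grid defined by scale, then map back to float."""
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale


rng = np.random.default_rng(0)
well_lit = rng.uniform(0.0, 1.0, size=10_000)    # synthetic "bright image" values
low_light = rng.uniform(0.0, 0.05, size=10_000)  # synthetic "dark image" values

scale_bright = symmetric_scale(well_lit)
scale_dark = symmetric_scale(low_light)

# Mean error on dark data when the range came from bright vs. dark calibration.
err_mismatched = np.abs(quantize_dequantize(low_light, scale_bright) - low_light).mean()
err_matched = np.abs(quantize_dequantize(low_light, scale_dark) - low_light).mean()

# Same mismatched calibration, but with 16-bit tensors.
scale_bright_16 = symmetric_scale(well_lit, bits=16)
err_mismatched_16 = np.abs(
    quantize_dequantize(low_light, scale_bright_16, bits=16) - low_light
).mean()

print(f"8-bit, bright calibration:  {err_mismatched:.6f}")
print(f"8-bit, dark calibration:    {err_matched:.6f}")
print(f"16-bit, bright calibration: {err_mismatched_16:.6f}")
```

    In this toy setup both matching the calibration data to the real input and moving to 16 bits reduce the error, which is the reasoning behind the two suggestions above.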

    BR,
    Reese

  • Hello Reese,

    Thanks for the suggestions. We changed the images used for calibration while compiling the model. 

    We were finally able to get the output from the model using the same custom script. Two things we noticed:

    1. The output from the model was actually an array of doubly normalized values (pixel_value/255, then divided by 255 again). Because of this, the line

      output_image_data = (output_data[0] * 255).clip(0, 255).astype(np.uint8)

      wrote only zeros to the output when converting the data to uint8. We had to make the following change to get an output image, and we are not sure why it is needed:

      output_image_data = (output_data[0] * 255 * 255).clip(0, 255).astype(np.uint8)
    2. This is the original image:

      This is the processed image:

      There is a weird bright strip in the output. Any idea why this is happening?
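
    Regarding point 1, the all-zero output can be reproduced with a small NumPy sketch. The pixel values below are made up for illustration, and to_uint8 is a hypothetical helper that mirrors the clip and cast in our script.

```python
import numpy as np

def to_uint8(a):
    """Same clip-and-cast used in the post-processing line of our script."""
    return a.clip(0, 255).astype(np.uint8)

pixels = np.array([0, 64, 128, 255], dtype=np.float32)  # made-up pixel values
once = pixels / 255.0     # normalized once: values in [0, 1]
twice = once / 255.0      # normalized twice: values in [0, 1/255]

single_scale = to_uint8(twice * 255)        # values stay below about 1.0, so the cast drops them to 0
double_scale = to_uint8(twice * 255 * 255)  # the extra factor restores the 0..255 range

print(single_scale)
print(double_scale)
```

    The float-to-uint8 cast truncates, so anything below 1.0 becomes 0, which matches the black image we saw before adding the second factor of 255.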
  • Hello,

    This is progress!

    I agree it is strange you need to multiply by 255 again... That seems wrong and may be why some parts of the image are saturated. Sounds like the output of the network included normalization already and that is the reason why you must do this. I can see how this might be detrimental to accuracy, especially since the data has to go through dequantization already. Any floating point error may be made worse in this scenario.

    Some questions to help push you in the right direction:

    If you run on CPU (don't target TIDL delegate), what does the output result look like?

    • Does that similarly require this additional normalization?
    • Does that output look good, or is it similar to what you see when offloading to TIDL?

    Are you running this network in 16-bit mode now or still in 8-bit mode?

    Do other input images similarly have this row-wise effect?
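
    One way to answer the CPU vs. TIDL questions numerically is sketched below. It assumes you have already saved both output tensors as arrays; the function name compare_outputs and the synthetic data are hypothetical and only for illustration. It estimates the single gain that best maps the TIDL output onto the CPU output.

```python
import numpy as np

def compare_outputs(cpu_out, tidl_out, eps=1e-12):
    """Summarize how a TIDL-delegated output differs from a CPU reference."""
    cpu = np.asarray(cpu_out, dtype=np.float64).ravel()
    tidl = np.asarray(tidl_out, dtype=np.float64).ravel()
    # Least-squares gain g that minimizes ||cpu - g * tidl||
    gain = float(cpu @ tidl / max(tidl @ tidl, eps))
    residual = float(np.abs(cpu - gain * tidl).max())
    return {
        "cpu_range": (float(cpu.min()), float(cpu.max())),
        "tidl_range": (float(tidl.min()), float(tidl.max())),
        "gain": gain,
        "max_residual_after_gain": residual,
    }

# Synthetic stand-ins: the "TIDL" output is the CPU output divided by 255
rng = np.random.default_rng(0)
cpu_ref = rng.uniform(0.0, 1.0, size=(400, 600, 3))
tidl_like = cpu_ref / 255.0

report = compare_outputs(cpu_ref, tidl_like)
print(report)
```

    A gain close to 255 with a small residual would point to an extra normalization somewhere; a large residual would suggest the difference is more than a simple scale factor.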

    BR,
    Reese

  • Hello,

    The model was running in 8-bit mode. I did try to compile it for 16-bit, but the program just hangs. This is the log generated by vx_app_arm_remote_log.out running in the background:

    4477.remote_log.txt
    [MCU1_0]    244.058687 s: CIO: Init ... Done !!!
    [MCU1_0]    244.058744 s: APP: Init ... !!!
    [MCU1_0]    244.058763 s: MEM: Init ... !!!
    [MCU1_0]    244.058781 s: MEM: Created heap (DDR_LOCAL_MEM, id=0, flags=0x00000004) @ af000000 of size 16777216 bytes !!!
    [MCU1_0]    244.058835 s: MEM: Init ... Done !!!
    [MCU1_0]    244.058851 s: IPC: Init ... !!!
    [MCU1_0]    244.058867 s: IPC: 3 CPUs participating in IPC !!!
    [MCU1_0]    244.059172 s: IPC: Waiting for HLOS to be ready ... !!!
    [MCU1_0]    244.063498 s: Sciserver Version: v2023.11.0.0REL.MCUSDK.MM.NN.PP.bb
    [MCU1_0]    244.066271 s: ##RM_PM_HAL Version: vMM.NN.PP
    [MCU1_0]    244.069218 s: ##Starting Sciserver..... PASSED
    [MCU1_0]    257.824796 s: IPC: HLOS is ready !!!
    [MCU1_0]    257.824876 s: IPC: Init ... Done !!!
    [MCU1_0]    257.824901 s: APP: Syncing with 2 CPUs ... !!!
    [MCU1_0]    257.824925 s: APP: Syncing with 2 CPUs ... Done !!!
    [MCU1_0]    257.824944 s: REMOTE_SERVICE: Init ... !!!
    [MCU1_0]    257.825015 s: REMOTE_SERVICE: Init ... Done !!!
    [MCU1_0]    257.825041 s: FVID2: Init ... !!!
    [MCU1_0]    257.825069 s: FVID2: Init ... Done !!!
    [MCU1_0]    257.825086 s: VHWA: VPAC Init ... !!!
    [MCU1_0]    257.825102 s: SCICLIENT: Sciclient_pmSetModuleState module=219 state=2
    [MCU1_0]    257.825185 s: SCICLIENT: Sciclient_pmSetModuleState success
    [MCU1_0]    257.825355 s: VHWA: LDC Init ... !!!
    [MCU1_0]    257.825456 s: VHWA: LDC Init ... Done !!!
    [MCU1_0]    257.825479 s: VHWA: MSC Init ... !!!
    [MCU1_0]    257.826038 s: VHWA: MSC Init ... Done !!!
    [MCU1_0]    257.826100 s: VHWA: VISS Init ... !!!
    [MCU1_0]    257.827028 s: VHWA: VISS Init ... Done !!!
    [MCU1_0]    257.827059 s: VHWA: VPAC Init ... Done !!!
    [MCU1_0]    257.827089 s:  VX_ZONE_INIT:Enabled
    [MCU1_0]    257.827109 s:  VX_ZONE_ERROR:Enabled
    [MCU1_0]    257.827126 s:  VX_ZONE_WARNING:Enabled
    [MCU1_0]    257.827940 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target MCU1-0 
    [MCU1_0]    257.828019 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target VPAC_LDC1 
    [MCU1_0]    257.828095 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target VPAC_MSC1 
    [MCU1_0]    257.828162 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target VPAC_MSC2 
    [MCU1_0]    257.828227 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target VPAC_VISS1 
    [MCU1_0]    257.828259 s:  VX_ZONE_INIT:[tivxInitLocal:136] Initialization Done !!!
    [MCU1_0]    257.828280 s: APP: OpenVX Target kernel init ... !!!
    [MCU1_0]    257.832789 s: APP: OpenVX Target kernel init ... Done !!!
    [MCU1_0]    257.832817 s: VISS REMOTE SERVICE: Init ... !!!
    [MCU1_0]    257.832862 s: VISS REMOTE SERVICE: Init ... Done !!!
    [MCU1_0]    257.832882 s: APP: Init ... Done !!!
    [MCU1_0]    257.832899 s: APP: Run ... !!!
    [MCU1_0]    257.832913 s: IPC: Starting echo test ...
    [MCU1_0]    257.833026 s: APP: Run ... Done !!!
    [MCU1_0]    257.833428 s: IPC: Echo status: a530-0[.] r5f0-0[s] c75ss0[P] 
    [C7x_1 ]    249.862325 s: CIO: Init ... Done !!!
    [C7x_1 ]    249.862344 s: APP: Init ... !!!
    [C7x_1 ]    249.862358 s: SCICLIENT: Init ... !!!
    [C7x_1 ]    249.862419 s: SCICLIENT: DMSC FW version [10.0.8--v10.00.08 (Fiery Fox)]
    [C7x_1 ]    249.862438 s: SCICLIENT: DMSC FW revision 0xa  
    [C7x_1 ]    249.862453 s: SCICLIENT: DMSC FW ABI revision 4.0
    [C7x_1 ]    249.862469 s: SCICLIENT: Init ... Done !!!
    [C7x_1 ]    249.862483 s: UDMA: Init ... !!!
    [C7x_1 ]    249.862496 s: UDMA: Init ... Done !!!
    [C7x_1 ]    249.862508 s: MEM: Init ... !!!
    [C7x_1 ]    249.862522 s: MEM: Created heap (DDR_LOCAL_MEM, id=0, flags=0x00000004) @ b2000000 of size 117440512 bytes !!!
    [C7x_1 ]    249.862550 s: MEM: Init ... Done !!!
    [C7x_1 ]    249.862562 s: IPC: Init ... !!!
    [C7x_1 ]    249.862576 s: IPC: 3 CPUs participating in IPC !!!
    [C7x_1 ]    249.862776 s: IPC: Waiting for HLOS to be ready ... !!!
    [C7x_1 ]    257.328321 s: IPC: HLOS is ready !!!
    [C7x_1 ]    257.328395 s: IPC: Init ... Done !!!
    [C7x_1 ]    257.328409 s: APP: Syncing with 2 CPUs ... !!!
    [C7x_1 ]    257.824927 s: APP: Syncing with 2 CPUs ... Done !!!
    [C7x_1 ]    257.824944 s: REMOTE_SERVICE: Init ... !!!
    [C7x_1 ]    257.825348 s: REMOTE_SERVICE: Init ... Done !!!
    [C7x_1 ]    257.825371 s:  VX_ZONE_INIT:Enabled
    [C7x_1 ]    257.825388 s:  VX_ZONE_ERROR:Enabled
    [C7x_1 ]    257.825404 s:  VX_ZONE_WARNING:Enabled
    [C7x_1 ]    257.826127 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1 
    [C7x_1 ]    257.826238 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_2 
    [C7x_1 ]    257.826352 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_3 
    [C7x_1 ]    257.826464 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_4 
    [C7x_1 ]    257.826558 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_5 
    [C7x_1 ]    257.826653 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_6 
    [C7x_1 ]    257.826745 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_7 
    [C7x_1 ]    257.826838 s:  VX_ZONE_INIT:[tivxPlatformCreateTargetId:124] Added target DSP_C7-1_PRI_8 
    [C7x_1 ]    257.826865 s:  VX_ZONE_INIT:[tivxInitLocal:136] Initialization Done !!!
    [C7x_1 ]    257.826882 s: APP: OpenVX Target kernel init ... !!!
    [C7x_1 ]    257.827088 s: APP: OpenVX Target kernel init ... Done !!!
    [C7x_1 ]    257.827104 s: APP: Init ... Done !!!
    [C7x_1 ]    257.827117 s: APP: Run ... !!!
    [C7x_1 ]    257.827129 s: IPC: Starting echo test ...
    [C7x_1 ]    257.827224 s: APP: Run ... Done !!!
    [C7x_1 ]    257.833509 s: IPC: Echo status: a530-0[.] r5f0-0[P] c75ss0[s] 
    [C7x_1 ]    499.108224 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.129281 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.150335 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.171744 s: PREEMPTION: Requesting memory of size 1048576 for targetPriority = 0
    [C7x_1 ]    499.171771 s: 
    [C7x_1 ]    499.171793 s: --------------------------------------------
    [C7x_1 ]    499.171821 s: TIDL Memory size requiement (record wise):
    [C7x_1 ]    499.171868 s: MemRecNum   , Space               , Attribute   , Alignment   , Size(KBytes), BasePtr     
    [C7x_1 ]    499.171921 s: 0           , DDR Cacheable       , Persistent  ,  128, 19.27   , 0x00000000
    [C7x_1 ]    499.171971 s: 1           , DDR Cacheable       , Persistent  ,  128, 0.65    , 0x00000000
    [C7x_1 ]    499.172020 s: 2           , L1D                 , Scratch     ,  128, 16.00   , 0x00000000
    [C7x_1 ]    499.172070 s: 3           , L2                  , Scratch     ,  128, 224.00  , 0x00000000
    [C7x_1 ]    499.172119 s: 4           , L3/MSMC             , Scratch     ,  128, 1024.00 , 0x00000000
    [C7x_1 ]    499.172168 s: 5           , DDR Cacheable       , Persistent  ,  128, 1640.28 , 0x00000000
    [C7x_1 ]    499.172218 s: 6           , DDR Cacheable       , Scratch     ,  128, 6.88    , 0x00000000
    [C7x_1 ]    499.172267 s: 7           , DDR Cacheable       , Persistent  ,  128, 98800.50, 0x00000000
    [C7x_1 ]    499.172316 s: 8           , DDR Cacheable       , Scratch     ,  128, 0.13    , 0x00000000
    [C7x_1 ]    499.172366 s: 9           , DDR Cacheable       , Scratch     ,  128, 3.13    , 0x00000000
    [C7x_1 ]    499.172415 s: 10          , DDR Cacheable       , Persistent  ,  128, 747.55  , 0x00000000
    [C7x_1 ]    499.172464 s: 11          , DDR Cacheable       , Scratch     ,  128, 512.25  , 0x00000000
    [C7x_1 ]    499.172514 s: 12          , DDR Cacheable       , Persistent  ,  128, 1024.00 , 0x00000000
    [C7x_1 ]    499.172563 s: 13          , DDR Cacheable       , Persistent  ,  128, 1429.80 , 0x00000000
    [C7x_1 ]    499.172624 s: 14          , DDR Cacheable       , Persistent  ,  128, 0.00    , 0x00000000
    [C7x_1 ]    499.172674 s: 15          , DDR Cacheable       , Persistent  ,  128, 156.75  , 0x00000000
    [C7x_1 ]    499.172709 s: --------------------------------------------
    [C7x_1 ]    499.172738 s: Total memory size requirement (space wise):
    [C7x_1 ]    499.172762 s: Mem Space , Size(KBytes)
    [C7x_1 ]    499.172785 s: L1D       , 16.00   
    [C7x_1 ]    499.172808 s: L2        , 224.00  
    [C7x_1 ]    499.172830 s: L3/MSMC   , 1024.00 
    [C7x_1 ]    499.172855 s: DDR Cacheable, 104341.18
    [C7x_1 ]    499.172882 s: --------------------------------------------
    [C7x_1 ]    499.172922 s: NOTE: Memory requirement in host emulation can be different from the same on EVM
    [C7x_1 ]    499.172965 s:       To get the actual TIDL memory requirement make sure to run on EVM with 
    [C7x_1 ]    499.172992 s:       debugTraceLevel = 2
    [C7x_1 ]    499.173005 s: 
    [C7x_1 ]    499.173028 s: --------------------------------------------
    [C7x_1 ]    499.178949 s: TIDL init call from ivision API 
    [C7x_1 ]    499.178969 s: 
    [C7x_1 ]    499.178989 s: --------------------------------------------
    [C7x_1 ]    499.179018 s: TIDL Memory size requiement (record wise):
    [C7x_1 ]    499.179062 s: MemRecNum   , Space               , Attribute   , Alignment   , Size(KBytes), BasePtr     
    [C7x_1 ]    499.179115 s: 0           , DDR Cacheable       , Persistent  ,  128, 19.27   , 0xb2026cc0
    [C7x_1 ]    499.179165 s: 1           , DDR Cacheable       , Persistent  ,  128, 0.65    , 0xb202ba80
    [C7x_1 ]    499.179214 s: 2           , L1D                 , Scratch     ,  128, 16.00   , 0x7f03c000
    [C7x_1 ]    499.179263 s: 3           , L2                  , Scratch     ,  128, 224.00  , 0x7f000000
    [C7x_1 ]    499.179311 s: 4           , L3/MSMC             , Scratch     ,  128, 1024.00 , 0x7e000000
    [C7x_1 ]    499.179360 s: 5           , DDR Cacheable       , Persistent  ,  128, 1640.28 , 0xb202bdc0
    [C7x_1 ]    499.179409 s: 6           , DDR Cacheable       , Scratch     ,  128, 6.88    , 0xb9000000
    [C7x_1 ]    499.179458 s: 7           , DDR Cacheable       , Persistent  ,  128, 98800.50, 0xb21c5f80
    [C7x_1 ]    499.179506 s: 8           , DDR Cacheable       , Scratch     ,  128, 0.13    , 0xb9001c00
    [C7x_1 ]    499.179555 s: 9           , DDR Cacheable       , Scratch     ,  128, 3.13    , 0xb9002000
    [C7x_1 ]    499.179613 s: 10          , DDR Cacheable       , Persistent  ,  128, 747.55  , 0xb82421c0
    [C7x_1 ]    499.179664 s: 11          , DDR Cacheable       , Scratch     ,  128, 512.25  , 0xb9003000
    [C7x_1 ]    499.179714 s: 12          , DDR Cacheable       , Persistent  ,  128, 1024.00 , 0xb82fd080
    [C7x_1 ]    499.179764 s: 13          , DDR Cacheable       , Persistent  ,  128, 1429.80 , 0xb83fd0c0
    [C7x_1 ]    499.179812 s: 14          , DDR Cacheable       , Persistent  ,  128, 0.00    , 0xb8562880
    [C7x_1 ]    499.179862 s: 15          , DDR Cacheable       , Persistent  ,  128, 156.75  , 0xb8562940
    [C7x_1 ]    499.179896 s: --------------------------------------------
    [C7x_1 ]    499.179925 s: Total memory size requirement (space wise):
    [C7x_1 ]    499.179949 s: Mem Space , Size(KBytes)
    [C7x_1 ]    499.179972 s: L1D       , 16.00   
    [C7x_1 ]    499.179995 s: L2        , 224.00  
    [C7x_1 ]    499.180018 s: L3/MSMC   , 1024.00 
    [C7x_1 ]    499.180042 s: DDR Cacheable, 104341.18
    [C7x_1 ]    499.180069 s: --------------------------------------------
    [C7x_1 ]    499.180109 s: NOTE: Memory requirement in host emulation can be different from the same on EVM
    [C7x_1 ]    499.180152 s:       To get the actual TIDL memory requirement make sure to run on EVM with 
    [C7x_1 ]    499.180180 s:       debugTraceLevel = 2
    [C7x_1 ]    499.180192 s: 
    [C7x_1 ]    499.180214 s: --------------------------------------------
    [C7x_1 ]    499.184285 s: Alg Init for Layer # -    1
    [C7x_1 ]    499.184397 s: Alg Init for Layer # -    2
    [C7x_1 ]    499.184508 s: Alg Init for Layer # -    3
    [C7x_1 ]    499.184627 s: Alg Init for Layer # -    4
    [C7x_1 ]    499.185262 s: Alg Init for Layer # -    5
    [C7x_1 ]    499.189437 s: Alg Init for Layer # -    6
    [C7x_1 ]    499.193600 s: Alg Init for Layer # -    7
    [C7x_1 ]    499.197754 s: Alg Init for Layer # -    8
    [C7x_1 ]    499.197883 s: Alg Init for Layer # -    9
    [C7x_1 ]    499.197957 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.219007 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.239980 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.261052 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.282161 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.303603 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.326572 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.347615 s: Alg Init for Layer # -   10
    [C7x_1 ]    499.347744 s: Alg Init for Layer # -   11
    [C7x_1 ]    499.347819 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.368869 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.389840 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.410906 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.432013 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.453438 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.476336 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.497363 s: Alg Init for Layer # -   12
    [C7x_1 ]    499.497493 s: Alg Init for Layer # -   13
    [C7x_1 ]    499.497569 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.518623 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.539592 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.560666 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.581772 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.603173 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.626089 s: WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!
    [C7x_1 ]    499.647115 s: Alg Init for Layer # -   14
    [C7x_1 ]    499.647229 s: Alg Init for Layer # -   15
    [C7x_1 ]    499.647295 s: Alg Init for Layer # -   16
    [C7x_1 ]    499.647405 s: Alg Init for Layer # -   17
    [C7x_1 ]    499.647511 s: Alg Init for Layer # -   18
    [C7x_1 ]    499.647624 s: Alg Init for Layer # -   19
    [C7x_1 ]    499.647732 s: Alg Init for Layer # -   20
    [C7x_1 ]    499.647796 s: Alg Init for Layer # -   21
    [C7x_1 ]    499.647902 s: Alg Init for Layer # -   22
    [C7x_1 ]    499.648012 s: Alg Init for Layer # -   23
    [C7x_1 ]    499.648122 s: Alg Init for Layer # -   24
    [C7x_1 ]    499.648231 s: Alg Init for Layer # -   25
    [C7x_1 ]    499.648295 s: Alg Init for Layer # -   26
    [C7x_1 ]    499.648401 s: Alg Init for Layer # -   27
    [C7x_1 ]    499.648510 s: Alg Init for Layer # -   28
    [C7x_1 ]    499.648625 s: Alg Init for Layer # -   29
    [C7x_1 ]    499.648737 s: Alg Init for Layer # -   30
    [C7x_1 ]    499.648803 s: Alg Init for Layer # -   31
    [C7x_1 ]    499.648911 s: Alg Init for Layer # -   32
    [C7x_1 ]    499.649025 s: Alg Init for Layer # -   33
    [C7x_1 ]    499.649134 s: Alg Init for Layer # -   34
    [C7x_1 ]    499.649246 s: Alg Init for Layer # -   35
    [C7x_1 ]    499.649310 s: Alg Init for Layer # -   36
    [C7x_1 ]    499.649416 s: Alg Init for Layer # -   37
    [C7x_1 ]    499.649524 s: Alg Init for Layer # -   38
    [C7x_1 ]    499.649646 s: Alg Init for Layer # -   39
    [C7x_1 ]    499.649760 s: Alg Init for Layer # -   40
    [C7x_1 ]    499.649824 s: Alg Init for Layer # -   41
    [C7x_1 ]    499.649935 s: Alg Init for Layer # -   42
    [C7x_1 ]    499.650047 s: Alg Init for Layer # -   43
    [C7x_1 ]    499.650158 s: Alg Init for Layer # -   44
    [C7x_1 ]    499.650272 s: Alg Init for Layer # -   45
    [C7x_1 ]    499.650334 s: Alg Init for Layer # -   46
    [C7x_1 ]    499.650444 s: Alg Init for Layer # -   47
    [C7x_1 ]    499.650552 s: Alg Init for Layer # -   48
    [C7x_1 ]    499.650675 s: Alg Init for Layer # -   49
    [C7x_1 ]    499.650789 s: Alg Init for Layer # -   50
    [C7x_1 ]    499.650853 s: Alg Init for Layer # -   51
    [C7x_1 ]    499.650963 s: Alg Init for Layer # -   52
    [C7x_1 ]    499.651078 s: Alg Init for Layer # -   53
    [C7x_1 ]    499.651220 s: PREEMPTION: Adding a new priority object for targetPriority = 0, handle = b2026cc0
    [C7x_1 ]    499.651287 s: PREEMPTION: Now total number of priority objects = 1 at priorityId = 0,    with new memRec of base = b82fd080 and size = 1048576
    [C7x_1 ]    499.651353 s: PREEMPTION: Requesting context memory addr for handle b2026cc0, return Addr = 99de4f10
    [C7x_1 ]    499.651389 s: Print preEmption Hnadle during init stage :
    [C7x_1 ]    499.651419 s: Procsize,      ctxSize,       lyrIdx
    [C7x_1 ]    499.651451 s: 0.000,         7304,            0
    [C7x_1 ]    499.651482 s: 1.995,         7304,            1
    [C7x_1 ]    499.651511 s: 0.648,         7304,            2
    [C7x_1 ]    499.651540 s: 0.648,         7304,            3
    [C7x_1 ]    499.651570 s: 2.478,         7304,            4
    [C7x_1 ]    499.651600 s: 9.599,         7304,            5
    [C7x_1 ]    499.651636 s: 9.599,         7304,            6
    [C7x_1 ]    499.651667 s: 9.599,         7304,            7
    [C7x_1 ]    499.651697 s: 9.060,         7304,            8
    [C7x_1 ]    499.651728 s: 33.840,         7304,            9
    [C7x_1 ]    499.651758 s: 9.060,         7304,           10
    [C7x_1 ]    499.651787 s: 33.840,         7304,           11
    [C7x_1 ]    499.651817 s: 9.060,         7304,           12
    [C7x_1 ]    499.651846 s: 31.644,         7304,           13
    [C7x_1 ]    499.651875 s: 3.399,         7304,           14
    [C7x_1 ]    499.651904 s: 0.002,         7304,           15
    [C7x_1 ]    499.651934 s: 0.648,         7304,           16
    [C7x_1 ]    499.651963 s: 0.648,         7304,           17
    [C7x_1 ]    499.651992 s: 0.648,         7304,           18
    [C7x_1 ]    499.652021 s: 0.648,         7304,           19
    [C7x_1 ]    499.652050 s: 0.002,         7304,           20
    [C7x_1 ]    499.652079 s: 0.648,         7304,           21
    [C7x_1 ]    499.652108 s: 0.648,         7304,           22
    [C7x_1 ]    499.652137 s: 0.648,         7304,           23
    [C7x_1 ]    499.652165 s: 0.648,         7304,           24
    [C7x_1 ]    499.652194 s: 0.002,         7304,           25
    [C7x_1 ]    499.652223 s: 0.648,         7304,           26
    [C7x_1 ]    499.652251 s: 0.648,         7304,           27
    [C7x_1 ]    499.652280 s: 0.648,         7304,           28
    [C7x_1 ]    499.652309 s: 0.648,         7304,           29
    [C7x_1 ]    499.652338 s: 0.002,         7304,           30
    [C7x_1 ]    499.652366 s: 0.648,         7304,           31
    [C7x_1 ]    499.652395 s: 0.648,         7304,           32
    [C7x_1 ]    499.652424 s: 0.648,         7304,           33
    [C7x_1 ]    499.652452 s: 0.648,         7304,           34
    [C7x_1 ]    499.652480 s: 0.002,         7304,           35
    [C7x_1 ]    499.652509 s: 0.648,         7304,           36
    [C7x_1 ]    499.652538 s: 0.648,         7304,           37
    [C7x_1 ]    499.652567 s: 0.648,         7304,           38
    [C7x_1 ]    499.652596 s: 0.648,         7304,           39
    [C7x_1 ]    499.652630 s: 0.002,         7304,           40
    [C7x_1 ]    499.652660 s: 0.648,         7304,           41
    [C7x_1 ]    499.652690 s: 0.648,         7304,           42
    [C7x_1 ]    499.652718 s: 0.648,         7304,           43
    [C7x_1 ]    499.652748 s: 0.648,         7304,           44
    [C7x_1 ]    499.652776 s: 0.002,         7304,           45
    [C7x_1 ]    499.652805 s: 0.648,         7304,           46
    [C7x_1 ]    499.652834 s: 0.648,         7304,           47
    [C7x_1 ]    499.652864 s: 0.648,         7304,           48
    [C7x_1 ]    499.652893 s: 0.648,         7304,           49
    [C7x_1 ]    499.652922 s

    When running the model on the CPU, the output data is neat and is normalized only once.

    Best Regards,
    Joel Jojo

  • Hi Joel,

    Thanks for the info and logs here. The CPU version looks good, so probably related to quantization.

    If the original version did not require an additional *255, then it is effectively a bandaid over the real issue to do this on the final output.

    Could you provide me your current set of artifacts? I'm curious what these layers 49/50 are -- it looks like it hangs during one of those layers. I see lots of warnings coming from convolution layers, but you don't experience a hang on those. Looks like this is happening during initialization of the model, and it has not reached the inference stage yet. 

    Are you running this on the actual AM62A target? We often recommend running in a PC emulation mode when debugging accuracy but this is not required. Many find it more convenient to do this debugging on PC.

    BR,
    Reese

  • Hello Reese,

    The model artifacts generated with tensor bits set to 16:

    zero_dce_model_no_sq_16.zip

    The model is being used on the actual AM62A target. Is there a way to emulate the C7x accelerator on PC? As far as I have seen, running the code on PC only emulates running it on the ARM core.

  • Hi Joel,

    Can you also provide me the model_config entry for your model, or at least how you are compiling the model? I am not able to reproduce what you are running into. On my side I see a seg fault during the calibration step of compilation, which may result from not having the right preprocessing parameters (namely the mean and scale values).

    I tried using the config below but did not get reasonable output:

        'zero_dce_model_no_sq':
        {
           'model_path' : os.path.join(models_base_path, 'zero_dce_model_no_sq.tflite'),
           'source' : {'model_url': 'dummy', 'opt': True,  'infer_shape' : True},
           'mean': [0,0,0],
           'scale' : [1,1,1],
           'num_images' : numImages,
           'num_classes': 1000,
           'session_name' : 'tflitert' ,
           'model_type': 'low-light',
           'optional_options': { 
             'debug_level': 1,
             'tensor_bits': 8
           }
        },

    I also added some code to tflrt_delegate.py to handle this 'model_type' set to 'low-light' by adding your image and postprocessing code (multiply by 255 and clip to uint8). Running on CPU with this config does not produce your output, and running with TIDL gives poorer results. Compilation fails on my side (TIDL_TOOLS 10_00_06_00) for tensor_bits 8 and 16, and the failure appears to happen during calibration for fixed-point execution.

    "Is there a way to emulate C7x accelerator on PC, because as far as I have seen, running the code on PC emulates running the code on the ARM core only."

    Yes, the same code on PC (with the TIDL_TOOLS_PATH environment variable set appropriately, which is necessary for compilation too) should run on the C7x emulator for PC. If your model is running only on CPU on PC, then the model artifacts were probably not generated correctly.

  • Hello Reese,

    We modified one of the model configurations in the model_configs.py file in the TIDL tool and then used the tflrt_delegate.py script to compile it.

    "zero_dce_model_no_sq": create_model_config(
        source=AttrDict(
            model_url="",
        ),
        preprocess=AttrDict(
            resize=(600,400),
            crop=(600,400),
            data_layout="NHWC",
            resize_with_pad=False,
            reverse_channels=False,
        ),
        session=AttrDict(
            session_name="tflitert",
            model_path=os.path.join(models_base_path, "zero_dce_model_no_sq.tflite"),
            input_mean=[0, 0, 0],
            input_scale=[1 / 255, 1 / 255, 1 / 255],
            input_optimization=True,
        ),
        task_type="custom",
        extra_info=AttrDict(num_images=3, num_classes=0),
    ),

    We changed the tensor bits by directly modifying the variable in common_utils.py.

    Regards,
    Joel

  • Hi Joel,

    Thanks for passing this configuration. I am able to reproduce your issue with more confidence now.

    I haven't found a workable solution yet. I see good accuracy from CPU-based execution on PC with your config. Once I include TIDL, I see black images (all output pixels=0) for 8, 16, and 32-bit artifacts when emulating them on PC to check accuracy. This is not typical, and I have some suspicion that there is a bug. The output from the model itself is low-valued floating point numbers, which you've seen.

    I had a suspicion that the TanH activation or the strided slices may be causing something odd. This was not true, but the accuracy degradation seems to be happening in the layers between StridedSlice and the output.

    • When I force the TanH layer and everything thereafter to CPU (with preceding layers emulating the accelerator), the output looks reasonable (visual features all present, but not as bright as your original image). The same is true if I let the network accelerate Convs and TanH, but force layers after StridedSlice to run on CPU.

    This tells me it's the multitude of elementwise (add, mul, sub, etc.) layers that are probably causing the accuracy issue here. If you run the model with debug_level=3, it will produce traces under the /tmp directory. These can be inspected by loading the binary files with a tool like numpy:

    data_last_og = np.fromfile('/tmp/tidl_trace_75_0049_0001_0001_00001_00003_00600x00400.y', dtype=np.int8)  

    The exact filename depends on how the graph was compiled. You will get one .y output file for each layer that TIDL runs. When I look at these, I see that the values are not zeroed; they still contain information, and a cursory read->reshape->transpose->write to file showed visual features that make sense (albeit dark). The degree of quantization (8 or 16 bit) is not impacting this substantially.
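
    The read->reshape->transpose steps can be sketched roughly as follows. This is a minimal example under assumptions: it treats the trace as a planar channels x height x width buffer with no padding, and the helper name load_trace, the dimensions, and the min/max stretch (added only to make dark data visible) are mine rather than anything defined by the tools. Check the element count against your own trace files.

```python
import os
import tempfile
import numpy as np

def load_trace(path, channels, height, width, dtype=np.int8):
    """Read a raw layer trace and return an HxWxC uint8 image for viewing.

    Assumes a planar CxHxW layout with no padding (an assumption, verify it).
    """
    data = np.fromfile(path, dtype=dtype)
    expected = channels * height * width
    if data.size != expected:
        raise ValueError(f"expected {expected} values, got {data.size}")
    chw = data.reshape(channels, height, width).astype(np.float32)
    hwc = chw.transpose(1, 2, 0)                 # CHW -> HWC for image tools
    lo, hi = hwc.min(), hwc.max()
    span = hi - lo if hi > lo else 1.0           # avoid dividing by zero on flat data
    return ((hwc - lo) / span * 255).astype(np.uint8)

# Self-check on a small synthetic file instead of a real trace
fake = np.arange(3 * 4 * 5, dtype=np.int8).reshape(3, 4, 5)
with tempfile.TemporaryDirectory() as tmp:
    fake_path = os.path.join(tmp, "fake_trace.y")
    fake.tofile(fake_path)
    img = load_trace(fake_path, channels=3, height=4, width=5)

print(img.shape)
```

    The returned array can then be written out with whatever image library you already use. If the size check fails, the layout or padding assumption does not hold for that layer.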

    The outputs in float seem far too small, but I don't see a better approach at this time than simply duplicating your *255 scaling. My apologies for pushing you in a different direction in a recent comment. I think this constitutes a bug somewhere in our stack, probably with some scaling within the 32-bit flow that is used to collect ranges of tensor values during compilation. 

    Therefore, my suggestion:

    • Continue as you are with the scaling of outputs by 255*255
    • I will collect some info here and log it as a bug -- I think it's related to elementwise layers and some of the scaling being used

    As an aside, please note that elementwise layers are not particularly efficient (both in terms of compute utilization and memory), especially given the size and count of these elementwise layers operating on 3x600x400 inputs. These layers are critical to your model so I won't give direct suggestions on replacements. Convolutions and matrix multiplications will be far more efficient.

  • Hello Reese,

    We replaced the strided slice operations in our source code with convolution and reshape operations. We were able to get some better results with this change.

    Here is the source code of the model: https://colab.research.google.com/drive/17GckQm8LXURnkJDEqVYcAtLJqlShDYYb?usp=sharing

    Any suggestions regarding the source code from your end?

    Best Regards,
    Joel Jojo

  • Hello Joel,

    No direct feedback/suggestions on the model source code at this moment.

    I'm curious about the fact that the output has these very obvious horizontal strips of different lighting 'profiles', so to say. What is the vertical size of that strip? I believe you have 600x400 images, so 6 equal strips would take up about 66 rows each, but the image looks like there are a few odd rows left over. More likely it's 64 rows that have this behavior, and we're experiencing some bug in the elementwise layers.
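
    To make the strip arithmetic concrete and to measure the band height from an actual output image, something like the sketch below could help. The 400-row height is taken from the image size above; the band_edges helper, the threshold, and the synthetic image are assumptions for illustration only.

```python
import numpy as np

rows = 400  # image height, assuming the 600x400 size discussed above

for strip_height in (66, 64):
    full_strips, leftover = divmod(rows, strip_height)
    print(f"{strip_height}-row strips: {full_strips} full, {leftover} rows left over")

def band_edges(image, threshold):
    """Return row indices where the mean row brightness jumps by more than threshold."""
    row_mean = image.reshape(image.shape[0], -1).mean(axis=1)
    jumps = np.abs(np.diff(row_mean))
    return np.flatnonzero(jumps > threshold) + 1

# Synthetic image with one brighter 64-row band, standing in for the real output
demo = np.full((rows, 600), 100.0)
demo[128:192] += 50.0
edges = band_edges(demo, threshold=10.0)
print(edges, np.diff(edges))
```

    Running band_edges on the real output (converted to a NumPy array) should show whether the spacing between edges is 64 rows or something else.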

    • If we want to see if this comes from specific layers or part of the network, you can use the 'deny_list' setting to prevent some layers from running with TIDL acceleration
      • You can pair this with the max_num_subgraphs setting to prevent layers after the denied layers from running with TIDL as well.
      • My recommended strategy here is to deny a layer and set max_num_subgraphs=1 so that the rest of the network runs on Arm. From what layer onward do you see these odd horizontal bands?
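
    As a sketch of that strategy, the two settings could sit alongside the other optional options in the model config. The placeholder value for deny_list is an assumption: the expected format (layer type vs. layer name, and how it is written) depends on the model format and tools version, so please check the documentation for your release.

```python
# Hypothetical 'optional_options' for a debug compile; values are placeholders.
optional_options = {
    "debug_level": 1,
    "tensor_bits": 8,
    # Assumption: replace the placeholder with the layer to keep off TIDL,
    # written in whatever format your tools version expects.
    "deny_list": "<layer to deny>",
    # Only one subgraph is offloaded, so layers after the denied one run on Arm.
    "max_num_subgraphs": 1,
}

print(optional_options)
```

    Moving the denied layer earlier or later in the network between compiles should narrow down where the bands first appear.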

    Do you get the same result when compiling with tensor_bits=32? Technically this doesn't use any quantization, and instead uses the reference flow based on natural C code (rather than optimized code with accelerator intrinsics).

    I'm also curious if you see the same behavior for other input images. I assume you will.

    BR,
    Reese