SK-AM62A-LP: How to run YOLOv8/v11 on this board

Part Number: SK-AM62A-LP
Other Parts Discussed in Thread: TDA4VM

Tool/software:

Hello, I followed edgeai-tidl-tools/docs/custom_model_evaluation.md and edgeai-tidl-tools/examples/osrt_python/README.md (both at master in TexasInstruments/edgeai-tidl-tools on GitHub).

I am compiling on an x86 PC.

I tried to add a new entry to edgeai-tidl-tools/examples/osrt_python/model_configs.py:

models_configs = {
    ############ onnx models ##########
    'yolov8n-ori' : {
        'model_path' : os.path.join(models_base_path, 'yolov8n.onnx'),
        'mean': [0, 0, 0],
        'scale' : [0, 0, 0],
        'num_images' : numImages,
        'num_classes': 80,
        'model_type': 'detection',

        #'od_type' : 'YoloV',
        'session_name' : 'onnxrt',
        'framework' : ''
    },

}

and, in edgeai-tidl-tools/examples/osrt_python/ort/onnxrt_ep.py:

models = [
#"cl-ort-resnet18-v1",
# "od-ort-ssd-lite_mobilenetv2_fpn"
'yolov8n-ori'
]

Here's the error:

====================================================================================================

Command : python3 tflrt_delegate.py in Dir : examples/osrt_python/tfl Started
Running 0 Models - []

Command : python3 onnxrt_ep.py in Dir : examples/osrt_python/ort Started
Available execution providers : ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']

Running 1 Models - ['yolov8n-ori']

Process Process-1:
Traceback (most recent call last):
File "/home/zxb/.pyenv/versions/3.10.15/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/home/zxb/.pyenv/versions/3.10.15/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/zxb/Desktop/ti/edgeai-tidl-tools/examples/osrt_python/ort/onnxrt_ep.py", line 221, in run_model
download_model(models_configs, model)
File "/home/zxb/Desktop/ti/edgeai-tidl-tools/examples/osrt_python/common_utils.py", line 240, in download_model
model_path = models_configs[model_name]["session"]["model_path"]
KeyError: 'session'

Running_Model : yolov8n-ori

============================================================================================================================

Could TI please provide some examples for YOLOv8? Thanks.

  • Hello,

    It looks like the current error occurs because the script cannot find your model file "yolov8n.onnx", so it tries to download the model and does not find the expected model_config information.

    Which commit / tag are you using for edgeai-tidl-tools? The format of model_configs.py changed from release tag 10_00_06_00 to 10_00_08_00. I see you are using a model config in the 10_00_06 style.
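
The traceback above reads models_configs[model_name]["session"]["model_path"], which suggests that the newer format keeps the model path under a nested "session" key. Below is a minimal sketch of moving a flat, older-style entry into that shape. Only the "session" and "model_path" keys come from the traceback. The helper name to_nested_config, the value of models_base_path, the placement of session_name, and the extra_info bucket are assumptions for illustration, so compare against the entries shipped in your own copy of model_configs.py.

```python
import os

# Hypothetical base path; in model_configs.py this is defined by the repo.
models_base_path = os.path.join("..", "..", "..", "models", "public")

# Flat entry in the older style, reduced from the one posted above.
flat_entry = {
    "model_path": os.path.join(models_base_path, "yolov8n.onnx"),
    "session_name": "onnxrt",
    "model_type": "detection",
    "num_classes": 80,
}

def to_nested_config(flat):
    """Group session-related keys under 'session' (layout is an assumption)."""
    session_keys = ("model_path", "session_name")
    session = {key: flat[key] for key in session_keys if key in flat}
    # Everything else is parked in a hypothetical 'extra_info' bucket.
    extra = {key: value for key, value in flat.items() if key not in session_keys}
    return {"session": session, "extra_info": extra}

nested_entry = to_nested_config(flat_entry)
# This is the lookup that raised KeyError: 'session' in the traceback.
print(nested_entry["session"]["model_path"])
```

The point is only that the lookup in common_utils.py expects a nested dictionary, so an entry written in the flat style fails before the model path is even checked.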

    Where did you source your YOLOv8 from? This model needs special handling because the original implementation has a restrictive license, so we cannot post any version of it. We also require some modifications to the original architecture for it to accelerate well on our AI accelerator (C7xMMA). We offer some guidance on how to get a TI-lite / C7xMMA-friendly version. Please see here: https://github.com/TexasInstruments/edgeai-tensorlab/issues/7.

    • This does mean that I don't have any examples that can be directly shared, due to licensing restrictions on YOLOv8.
  • Hello,

    I followed your advice to use edgeai-modeloptimization. When using the PTC example, FX Graph Mode Quantization is in maintenance mode.

    Since this is our first time with this type of model deployment, which uses artifacts, can you give some advice or suggest the steps?

    Our previous projects all used YOLOv8. Should I optimize our trained v8 model (PTQ / QAT / PTC), or use edgeai-yolox / edgeai-mmdetection to retrain a new v8-ti-lite model offline? Our company's dataset cannot be uploaded anywhere.

    I ran modelmaker with ./run_modelmaker.sh AM62A config_detection.yaml, but it failed. Here are the logs. I would appreciate your advice and detailed step-by-step instructions.

    modelmaker-log.txt
    modeloptimization-PTC-log.txt

  • Also, are there any other ways to run the model, for example on the MPU or DSP?

  • Hello,

    I understand -- I know this process can be confusing on the first attempt. I'll help however I can! I will respond to a few parts of your last comment. Please note that this is not an ordered list of steps

    A) general suggestion

    My first suggestion is to follow one of the examples in edgeai-tidl-tools and the custom-model workflow. Take one of our existing models like yolox (pretrained), and see how the examples/osrt_python/ort/onnxrt_ep.py is used to compile and run the model on the host. You can follow the steps in the README to transfer portions of edgeai-tidl-tools (examples, test_data, models, model-artifacts) to the EVM and run the model on static input. 

    • I recommend this not because it solves your problem. I recommend this so you can learn the development flow on a known working model, since your yolov8 will add more complexity.
      • I'll address your model in part D of the response.

    B) PTC logs

    Your Jupyter notebooks look like they are using an outdated repository. We migrated several months ago from multiple edgeai- repos for model training and development, and wrapped those into the edgeai-tensorlab repo to make repo-versioning more consistent and controlled. You can continue using these repos if you are intentionally targeting an older SDK version like 9.1, but otherwise I suggest cloning edgeai-tensorlab and working from that repository instead.

    I see your modelmaker log mentions you are targeting the 9.1 SDK. The modeloptimization tooling was fairly new at that point, and I'm less confident about this version than about more recent releases (r9.2 and r10.0, corresponding to SDK versions).

    • I also see setup errors for scipy due to a BLAS/LAPACK dependency. This probably requires a system-level package that cannot be installed by pip3. I'd recommend searching with google for guidance here. 

    C) Modelmaker logs

    Looks like modelmaker trained the model but failed to compile. This means that pytorch trained yolox and produced an ONNX file. The next stage is compilation, which uses edgeai-benchmark -- this is an alternative to edgeai-tidl-tools.

    • edgeai-benchmark is for testing/compiling with larger datasets. It can compile the model and then test the accuracy of that model on your dataset
      • typically used by advanced users or as part of modelmaker workflow
    • edgeai-tidl-tools is for baseline evaluation of inference speed and accuracy on a smaller set of data. 
        • typically used for initial testing on new models. Often, this is sufficient for compiling model artifacts and testing 

      Your model failed during edgeai-benchmark compilation because it could not find a dependency: 


      ImportError: cannot import name 'onnx_model_opt' from 'osrt_model_tools.onnx_tools' (/home/zxb/.pyenv/versions/py310/lib/python3.10/site-packages/osrt_model_tools/onnx_tools/__init__.py)

      This dependency is set up as part of edgeai-tidl-tools, and the Python source is under the /scripts directory (osrt_model_tools/onnx_tools). The imported function is meant to add some preprocessing layers to the model (YOLOX) that allow uint8 input instead of float. This is why a later error complained about the input data type.

      • What is the commit tag / branch of edgeai-tidl-tools? It may have set up a more recent version. The appropriate release would be 09_01_07_00. We should be able to update with the proper tools and rerun this.
        • Alternatively, take the trained ONNX model and import it using edgeai-tidl-tools standalone. This is related to suggestion A.
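
Before rerunning modelmaker, it can help to confirm from the same Python environment whether the name from the ImportError is importable at all. The following is a minimal sketch using only the standard library. The helper name can_import is made up for illustration, and the module and attribute names are copied from the error message above.

```python
import importlib

def can_import(module_name, attr_name):
    """Return True if `from module_name import attr_name` would succeed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    if hasattr(module, attr_name):
        return True
    # `from pkg import name` also succeeds when `name` is a submodule.
    try:
        importlib.import_module(f"{module_name}.{attr_name}")
        return True
    except ImportError:
        return False

# Names taken from the ImportError reported by modelmaker.
ok = can_import("osrt_model_tools.onnx_tools", "onnx_model_opt")
print("onnx_model_opt importable:", ok)
```

If this prints False in the environment that runs modelmaker, the installed osrt_model_tools package most likely does not match what this edgeai-benchmark version expects.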

      I followed your advice to use edgeai-modeloptimization. When using the PTC example, FX Graph Mode Quantization is in maintenance mode.

      Could you explain more, please? I assume you mean that you followed the steps for modifying the yolov8 model based on the linked issue in edgeai-tensorlab repo. I'm not sure what you mean by "FX Graph Mode Quantization is in maintenance mode"

      D) YOLOv8 Guidance

      The steps detailed in that issue-7 explain how to train a modified yolov8 that will work well with TI's accelerator (C7xMMA). This training flow uses the upstream mmyolo repo, which is patched with files on that tensorlab issue-7 that will modify model structure to be better for C7xMMA. This does require some retraining.

      Since you have a pretrained model, you can use those weights as the starting point for retraining this modified structure. It should not take more than 100 epochs to retrain this way, but depending on your yolov8 source, it could take effort to get the pretrained weights aligned with the most appropriate yolov8 config. The PTH for your yolov8 would need to have tensors/weights named similarly to what the mmyolo repo's yolov8 versions expect.
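
One way to see how well the weight names line up is to compare the tensor names in the converted checkpoint with the names the target model expects. The sketch below does this on plain lists so it runs without PyTorch. The helper name and all tensor names are hypothetical; with PyTorch installed, the real names would come from the keys of the state dict stored in each .pth file.

```python
def compare_weight_names(source_keys, target_keys):
    """Report which tensor names match and which are missing on either side."""
    source, target = set(source_keys), set(target_keys)
    return {
        "matched": sorted(source & target),
        "missing_in_source": sorted(target - source),
        "unexpected_in_source": sorted(source - target),
    }

# Hypothetical names for illustration only.
converted = ["backbone.stem.conv.weight", "neck.reduce.weight", "head.cls.weight"]
expected = ["backbone.stem.conv.weight", "neck.reduce.weight", "bbox_head.cls.weight"]

report = compare_weight_names(converted, expected)
for category, names in report.items():
    print(f"{category}: {names}")
```

A long "missing_in_source" list usually means the checkpoint needs a renaming step before it can seed the retraining.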

      Our previous projects all used YOLOv8. Should I optimize our trained v8 model (PTQ / QAT / PTC), or use edgeai-yolox / edgeai-mmdetection to retrain a new v8-ti-lite model offline? Our company's dataset cannot be uploaded anywhere.

      I would encourage the first option here --

      1. Use your trained model with a patched version of mmyolo to apply our model surgery.
      2. Retrain for 25-100 epochs at a low learning rate.
        1. You may also run QAT at this time if you wish, but I would suggest not doing so on the first attempt. You can come back to it later if accuracy from PTQ is not sufficient.
      3. Export the model to ONNX format with an accompanying prototxt file that describes the detection head.
      4. Use this ONNX and PROTOTXT to import/compile the model with edgeai-tidl-tools/examples/osrt_python/ort/onnxrt_ep.py.
        1. Follow the custom-model guidelines in that repo.
        2. This will produce artifacts.

      BR,
      Reese

    • Hello,

      Yesterday I found yolov8_to_mmyolo.py in mmyolo/tools/model_converters, so this morning I tried it and obtained an mmyolo-format .pth from yolov8n.pt. Then, following the patched version of mmyolo, I ran:

      python projects/easydeploy/tools/export_onnx.py .\configs\yolov8\yolov8_n_syncbn_fast_8xb16-500e_coco.py .\myolov8n.pth --work-dir .\tests\ --img-size 640 640 --simplify --iou-threshold 0.65 --score-threshold 0.25 --opset 12 --backend ONNXRUNTIME

      This produces the ONNX model, but it does not generate a prototxt file. When I got a ModuleNotFoundError saying there was no module named 'edgeai_torchmodelopt', I copied edgeai-tensorlab-main\edgeai-modeloptimization\torchmodelopt to mmyolo\projects\easydeploy\tools\export_onnx.py and then encountered an ImportError: the name "_parse_stack_trace" could not be imported (C:\Users\Admin\.conda\envs\mmyolo\lib\site-packages\torch\fx\graph.py).

      This seems to be the result of a conflict between torch 2.0.1+cu117 and mmcv 2.0.1. The patched MMYOLO now depends on edgeai_torchmodelopt, which introduces a version conflict. MMYOLO only supports mim install "mmcv>=2.0.0rc4,<2.1.0", but mmcv below 2.1.0 only supports CUDA 11.0~11.7 with torch 2.0. With those versions, however, torch.fx.graph does not have _parse_stack_trace (torch 1.13.x~2.0.x). I tried installing torch versions from 2.0.0 to 2.4.0 and found that _parse_stack_trace is only available in torch 2.2.x~2.4.x. Is this a bug? The last comment on github.com/.../7 seems to describe the same problem as mine.

       

      This leads to the new problem: how do I get the ONNX and PROTOTXT files? I tried both mmyolo and mmdetection, and neither works because of the same problem as in github.com/.../7. If I change torch to 2.4.1 with mmdetection, I run into a problem with mmengine.registry's OPTIMIZERS. yolov8_s_syncbn_fast_8xb16-500e.py can be used in MMYOLO to generate an ONNX model, but the mmyolo configs cannot be used in mmdetection:

      KeyError: 'YOLODetector is not in the model registry. Please check whether the value of `YOLODetector` is correct or it was registered as expected. More details can be found at mmengine.readthedocs.io/.../config.html

      Is there any other way to get the ONNX and prototxt files?

    • Hello,

      Okay, it sounds like you have a trained version such that .PTH weights are available, and you just need ONNX+PROTOTXT to enable import with TIDL tools. Are your trained PTH weights from before or after the edgeai_torchmodelopt calls?

      We will need to look deeper into this -- I must ask for your patience here. The version incompatibility does indeed look frustrating.

      • Can you provide a list of your packages and versions?
        • If you tried multiple different dependency configurations, please list those with corresponding versioning.
      • What is the commit ID / release tag for the edgeai-tensorlab repo?
        • The issue-7 was from July, so closest tag is probably r9.2
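
A full pip list works, but a short report of only the relevant packages is easier to compare between environments. The sketch below uses the standard-library importlib.metadata. The helper name collect_versions and the choice of packages are only a suggestion.

```python
from importlib import metadata

def collect_versions(package_names):
    """Map each package name to its installed version, or 'not installed'."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions

# Packages that matter for the mmyolo / edgeai_torchmodelopt conflict.
packages = ["torch", "torchvision", "mmcv", "mmengine", "mmdet", "mmyolo", "onnx"]
for name, version in collect_versions(packages).items():
    print(f"{name}: {version}")
```

Running this once per environment gives one short block per dependency configuration that can be pasted into the thread.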

      BR,
      Reese

    • Hello,

      The .PTH is from before the edgeai_torchmodelopt calls. mmyolo only supports mmcv>=2.0.0rc4,<2.1.0, and mmcv 2.0.1 needs CUDA<=11.7 and torch<=2.0.x, whereas edgeai-modeloptimization needs a torch.fx.graph feature that is only available in torch 2.2.x~2.4.x. I tried many version combinations of these packages, but all failed.
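
The conflict described here can be written down as two ranges that do not overlap. The sketch below encodes the constraints exactly as reported in this thread (they are not independently verified) and checks a few candidate torch versions against both. The helper names and the candidate list are made up for illustration.

```python
def major_minor(version):
    """Turn '2.0.1+cu117' into (2, 0); the local suffix is ignored."""
    parts = version.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])

def mmcv_ok(torch_version):
    # As reported above: mmcv < 2.1.0 needs torch <= 2.0.x.
    return major_minor(torch_version) <= (2, 0)

def modelopt_ok(torch_version):
    # As reported above: the needed torch.fx.graph feature exists in 2.2.x~2.4.x.
    return (2, 2) <= major_minor(torch_version) <= (2, 4)

candidates = ["1.13.0+cu117", "2.0.1", "2.1.0", "2.2.0", "2.4.1"]
usable = [v for v in candidates if mmcv_ok(v) and modelopt_ok(v)]
print("torch versions satisfying both constraints:", usable)
```

No candidate passes both checks, which matches the observation that every tried combination failed. One of the two constraints has to be relaxed, for example by using a different release of one of the packages.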

      Here is the  list of my packages and versions.

      WARNING: Ignoring invalid distribution -orch (c:\users\admin\.conda\envs\mmyolo\lib\site-packages)
      Package Version
      ---------------------------- ------------
      absl-py 2.1.0
      addict 2.4.0
      aiofiles 23.2.1
      aiohappyeyeballs 2.4.0
      aiohttp 3.10.5
      aiosignal 1.3.1
      albucore 0.0.17
      albumentations 1.4.18
      aliyun-python-sdk-core 2.15.2
      aliyun-python-sdk-kms 2.16.4
      altgraph 0.17.4
      annotated-types 0.7.0
      antlr4-python3-runtime 4.9.3
      anyio 4.4.0
      appdirs 1.4.4
      astunparse 1.6.3
      async-timeout 4.0.3
      attrs 24.2.0
      audioread 3.0.1
      beautifulsoup4 4.12.3
      blinker 1.8.2
      branca 0.7.2
      Brotli 1.1.0
      cachetools 5.5.0
      certifi 2024.7.4
      cffi 1.17.0
      charset-normalizer 3.3.2
      click 8.1.7
      cnocr 2.3.0.3
      cnstd 1.2.4.2
      colorama 0.4.6
      coloredlogs 15.0.1
      contourpy 1.1.1
      crcmod 1.7
      cryptography 43.0.0
      cycler 0.12.1
      Cython 3.0.11
      datasets 2.21.0
      DeepFilterLib 0.5.2
      deepfilternet 0.5.2
      dill 0.3.8
      docker-pycreds 0.4.0
      editdistance 0.8.1
      einops 0.8.0
      eiq 2.2.0
      et-xmlfile 1.1.0
      eval_type_backport 0.2.0
      exceptiongroup 1.2.2
      fastapi 0.112.2
      ffmpy 0.4.0
      filelock 3.14.0
      Flask 3.0.3
      flatbuffers 24.3.25
      folium 0.17.0
      fonttools 4.53.1
      frozenlist 1.4.1
      fsspec 2024.6.1
      funasr 1.0.27
      gast 0.4.0
      gdown 5.2.0
      geographiclib 2.0
      geopy 2.4.1
      gitdb 4.0.11
      GitPython 3.1.43
      google-auth 2.34.0
      google-auth-oauthlib 1.0.0
      google-pasta 0.2.0
      gradio 4.42.0
      gradio_client 1.3.0
      greenlet 3.1.1
      grpcio 1.66.1
      h11 0.14.0
      h2 4.1.0
      h5py 3.11.0
      hpack 4.0.0
      httpcore 1.0.5
      httpx 0.27.2
      huggingface-hub 0.24.6
      humanfriendly 10.0
      hydra-core 1.3.2
      hyperframe 6.0.1
      idna 3.8
      ImageHash 4.3.1
      imageio 2.35.1
      imgviz 1.7.5
      importlib_metadata 8.5.0
      importlib_resources 6.4.4
      itsdangerous 2.2.0
      jaconv 0.4.0
      jamo 0.4.1
      jax 0.4.13
      jieba 0.42.1
      Jinja2 3.1.4
      jmespath 0.10.0
      joblib 1.4.2
      kaldiio 2.18.0
      keras 2.13.1
      kiwisolver 1.4.5
      labelImg 1.8.6
      labelme 5.5.0
      lazy_loader 0.4
      libclang 18.1.1
      librosa 0.10.2
      lightning-utilities 0.11.8
      llvmlite 0.41.1
      loguru 0.7.2
      lxml 5.3.0
      Markdown 3.7
      markdown-it-py 3.0.0
      MarkupSafe 2.1.5
      matplotlib 3.7.5
      mdurl 0.1.2
      mediapipe 0.10.11
      ml-dtypes 0.2.0
      mmcv 2.0.1
      mmdet 3.3.0
      mmengine 0.10.5
      mmyolo 0.6.0
      model-index 0.1.11
      modelscope 1.17.1
      more-itertools 10.4.0
      mpmath 1.3.0
      msgpack 1.0.8
      multidict 6.0.5
      multiprocess 0.70.16
      mysql 0.0.3
      mysql-connector-python 9.0.0
      mysqlclient 2.2.4
      natsort 8.4.0
      networkx 3.1
      numba 0.58.1
      numpy 1.23.5
      oauthlib 3.2.2
      omegaconf 2.3.0
      onnx 1.17.0
      onnxruntime 1.19.0
      onnxsim 0.4.36
      openai-whisper 20231117
      opencv-contrib-python 4.10.0.84
      opencv-python 4.10.0.84
      opencv-python-headless 4.10.0.84
      opendatalab 0.0.10
      openmim 0.3.9
      openpyxl 3.1.5
      openxlab 0.1.2
      opt-einsum 3.3.0
      ordered-set 4.1.0
      orjson 3.10.7
      oss2 2.17.0
      outcome 1.3.0.post0
      packaging 24.2
      pandas 2.0.3
      pefile 2023.2.7
      pillow 10.4.0
      pip 24.3.1
      platformdirs 4.3.6
      Polygon3 3.0.9.1
      pooch 1.8.2
      prettytable 3.11.0
      protobuf 3.20.3
      psutil 6.1.0
      py-cpuinfo 9.0.0
      pyarrow 17.0.0
      pyasn1 0.6.0
      pyasn1_modules 0.4.0
      PyAudio 0.2.14
      pyclipper 1.3.0.post6
      pycocotools 2.0.7
      pycocotools-windows 2.0.0.2
      pycparser 2.22
      pycryptodome 3.20.0
      pydantic 2.8.2
      pydantic_core 2.20.1
      pydub 0.25.1
      Pygments 2.18.0
      pyinstaller 6.11.1
      pyinstaller-hooks-contrib 2024.10
      PyMySQL 1.1.1
      pynndescent 0.5.13
      pyparsing 3.1.4
      PyQt5 5.15.11
      PyQt5-Qt5 5.15.2
      PyQt5_sip 12.15.0
      pyreadline3 3.4.1
      PySocks 1.7.1
      PySoundFile 0.9.0.post1
      pytesseract 0.3.13
      python-dateutil 2.9.0.post0
      python-multipart 0.0.9
      pytorch-wpe 0.0.1
      pytz 2023.4
      PyWavelets 1.4.1
      pywin32 308
      pywin32-ctypes 0.2.3
      PyYAML 6.0.2
      QtPy 2.4.1
      regex 2024.7.24
      requests 2.28.2
      requests-oauthlib 2.0.0
      rich 13.4.2
      rotary-embedding-torch 0.6.5
      rsa 4.9
      ruff 0.6.3
      safetensors 0.4.4
      scikit-image 0.21.0
      scikit-learn 1.3.2
      scipy 1.10.1
      seaborn 0.13.2
      selenium 4.26.1
      semantic-version 2.10.0
      sentencepiece 0.2.0
      sentry-sdk 2.18.0
      setproctitle 1.3.3
      setuptools 60.2.0
      shapely 2.0.6
      shellingham 1.5.4
      simplejson 3.19.3
      six 1.16.0
      smmap 5.0.1
      sniffio 1.3.1
      sortedcontainers 2.4.0
      sounddevice 0.5.0
      soundfile 0.12.1
      soupsieve 2.6
      soxr 0.3.7
      SQLAlchemy 2.0.35
      starlette 0.38.4
      sympy 1.13.2
      tabulate 0.9.0
      tensorboard 2.13.0
      tensorboard-data-server 0.7.2
      tensorboardX 2.6.2.2
      tensorflow 2.13.0
      tensorflow-estimator 2.13.0
      tensorflow-intel 2.13.0
      tensorflow-io-gcs-filesystem 0.31.0
      termcolor 2.4.0
      terminaltables 3.1.10
      Theano 1.0.5
      threadpoolctl 3.5.0
      tifffile 2023.7.10
      tiktoken 0.7.0
      timm 1.0.9
      tomli 2.0.1
      tomlkit 0.12.0
      torch 1.13.0+cu117
      torchaudio 0.13.0+cu117
      torchinfo 1.8.0
      torchmetrics 1.5.2
      torchvision 0.14.0+cu117
      tqdm 4.65.2
      trio 0.27.0
      trio-websocket 0.11.1
      ttach 0.0.3
      typer 0.12.5
      typing_extensions 4.12.2
      tzdata 2024.1
      ultralytics 8.2.85
      ultralytics-thop 2.0.11
      umap-learn 0.5.6
      Unidecode 1.3.8
      urllib3 1.26.20
      uvicorn 0.30.6
      wandb 0.18.6
      wcwidth 0.2.13
      websocket-client 1.8.0
      websockets 12.0
      Werkzeug 3.0.4
      wheel 0.45.1
      win_inet_pton 1.1.0
      win32-setctime 1.1.0
      wrapt 1.16.0
      wsproto 1.2.0
      xlrd 2.0.1
      xlutils 2.0.0
      xlwt 1.3.0
      xxhash 3.5.0
      xyzservices 2024.6.0
      yapf 0.40.2
      yarl 1.9.4
      zipp 3.20.2
      zstandard 0.23.0
      (mmyolo) PS D:\zxb\accc\mmyolo> 

      Then I tried edgeai-mmdetection. It can use edgeai-modeloptimization, but it has problems with the configs. I tried copying configs from mmyolo/config/yolov8/, but some modules are not supported and would need to be registered as custom modules with mmengine.

      I also tried edgeai-modelmaker with ./run_modelmaker.sh AM62A ./config_detection.yaml. After I resolved other problems, it ended with FileNotFoundError: [Errno 2] No such file or directory: '/home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/compilation/AM62A/pkg/artifacts.yaml'. In the same folder the model has been optimized, but the artifacts folder is empty.

      modelmaker-trainlog.txt
      2024-12-03 14:28:39,693 - mmdet - INFO - Exp name: yolox_nano_lite.py
      2024-12-03 14:28:39,693 - mmdet - INFO - Epoch(val) [1][107]	bbox_mAP: 0.1130, bbox_mAP_50: 0.2410, bbox_mAP_75: 0.0960, bbox_mAP_s: 0.0020, bbox_mAP_m: 0.0940, bbox_mAP_l: 0.4320, bbox_mAP_copypaste: 0.113 0.241 0.096 0.002 0.094 0.432
      /home/zxb/.pyenv/versions/py310/lib/python3.10/site-packages/mmcv/onnx/info.py:20: UserWarning: DeprecationWarning: This function will be deprecated in future. Welcome to use the unified model deployment toolbox MMDeploy: https://github.com/open-mmlab/mmdeploy
        warnings.warn(msg)
      /home/zxb/.pyenv/versions/py310/lib/python3.10/site-packages/mmcv/tensorrt/init_plugins.py:51: UserWarning: DeprecationWarning: This function will be deprecated in future. Welcome to use the unified model deployment toolbox MMDeploy: https://github.com/open-mmlab/mmdeploy
        warnings.warn(msg)
      /home/zxb/.pyenv/versions/py310/lib/python3.10/site-packages/torch/onnx/symbolic_opset9.py:1248: UserWarning: This model contains a squeeze operation on dimension 1. If the model is intended to be used with dynamic input shapes, please use opset version 11 to export the model.
        warnings.warn(
      ============== Diagnostic Run torch.onnx.export version 2.0.1+cpu ==============
      verbose: False, log level: Level.ERROR
      ======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================
      
      Successfully exported ONNX model: /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/training/model.onnx
      Trained model is at: /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/training
      
      SUCCESS: ModelMaker - Training completed.
      
      INFO:20241203-142844: model import is in progress - please see the log file for status.
      configs to run: ['od-8200']
      number of configs: 1
      
      INFO:20241203-142844: parallel_run - parallel_processes:1 parallel_devices=[0]
      TASKS                                                       |          |     0% 0/1| [< ]
      INFO:20241203-142844: starting process on parallel_device - 0   0%|          || 0/1 [00:00<?, ?it/s]
      
      INFO:20241203-142844: starting - od-8200
      INFO:20241203-142844: model_path - /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/training/model.onnx
      INFO:20241203-142844: model_file - /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/compilation/AM62A/work/od-8200/model/model.onnx
      INFO:20241203-142844: quant_file - /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/compilation/AM62A/work/od-8200/model/model_qparams.prototxt
      Downloading 1/1: /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/training/model.onnx
      Download done for /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/training/model.onnx
      Downloading 1/1: /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/training/model.onnx
      Download done for /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/training/model.onnx
      Converted model is valid!
      
      INFO:20241203-142844: running - od-8200
      INFO:20241203-142844: pipeline_config - {'task_type': 'detection', 'dataset_category': 'coco', 'calibration_dataset': <edgeai_benchmark.datasets.modelmaker_datasets.ModelMakerDetectionDataset object at 0x713fe6162800>, 'input_dataset': <edgeai_benchmark.datasets.modelmaker_datasets.ModelMakerDetectionDataset object at 0x713fe7d34a90>, 'preprocess': <edgeai_benchmark.preprocess.PreProcessTransforms object at 0x713fc90e0820>, 'session': <edgeai_benchmark.sessions.onnxrt_session.ONNXRTSession object at 0x713fc90e0880>, 'postprocess': <edgeai_benchmark.postprocess.PostProcessTransforms object at 0x713fc90e0b50>, 'metric': {'label_offset_pred': 1}, 'model_info': {'metric_reference': {'accuracy_ap[.5:.95]%': None}, 'model_shortlist': 10}}
      TASKS                                                       | 100%|██████████|| 1/1 [00:00<00:00,  1.37it/s]
      
      
      
      INFO:20241203-142844: model inference is in progress - please see the log file for status.
      configs to run: ['od-8200']
      number of configs: 1
      
      INFO:20241203-142844: parallel_run - parallel_processes:1 parallel_devices=[0]
      TASKS                                                       |          |     0% 0/1| [< ]
      INFO:20241203-142844: starting process on parallel_device - 0   0%|          || 0/1 [00:00<?, ?it/s]
      
      INFO:20241203-142844: starting - od-8200
      INFO:20241203-142844: model_path - /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/training/model.onnx
      INFO:20241203-142844: model_file - /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/compilation/AM62A/work/od-8200/model/model.onnx
      INFO:20241203-142844: quant_file - /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/compilation/AM62A/work/od-8200/model/model_qparams.prototxt
      
      INFO:20241203-142844: running - od-8200
      INFO:20241203-142844: pipeline_config - {'task_type': 'detection', 'dataset_category': 'coco', 'calibration_dataset': <edgeai_benchmark.datasets.modelmaker_datasets.ModelMakerDetectionDataset object at 0x713fe6162800>, 'input_dataset': <edgeai_benchmark.datasets.modelmaker_datasets.ModelMakerDetectionDataset object at 0x713fe7d34a90>, 'preprocess': <edgeai_benchmark.preprocess.PreProcessTransforms object at 0x713fc90e0820>, 'session': <edgeai_benchmark.sessions.onnxrt_session.ONNXRTSession object at 0x713fc90e0880>, 'postprocess': <edgeai_benchmark.postprocess.PostProcessTransforms object at 0x713fc90e0b50>, 'metric': {'label_offset_pred': 1}, 'model_info': {'metric_reference': {'accuracy_ap[.5:.95]%': None}, 'model_shortlist': 10}}
      TASKS                                                       | 100%|██████████|| 1/1 [00:00<00:00,  1.86it/s]
      
      
      packaging artifacts to /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/compilation/AM62A/pkg please wait...
      WARNING:20241203-142845: could not package - /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/compilation/AM62A/work/od-8200
      Traceback (most recent call last):
        File "/home/zxb/Desktop/ti/edgeai-modelmaker/./scripts/run_modelmaker.py", line 141, in <module>
          main(config)
        File "/home/zxb/Desktop/ti/edgeai-modelmaker/./scripts/run_modelmaker.py", line 80, in main
          model_runner.run()
        File "/home/zxb/Desktop/ti/edgeai-modelmaker/edgeai_modelmaker/ai_modules/vision/runner.py", line 187, in run
          self.model_compilation.run()
        File "/home/zxb/Desktop/ti/edgeai-modelmaker/edgeai_modelmaker/ai_modules/vision/compilation/edgeai_benchmark.py", line 269, in run
          edgeai_benchmark.interfaces.package_artifacts(self.settings, self.work_dir, out_dir=self.package_dir, custom_model=True)
        File "/home/zxb/Desktop/ti/edgeai-tidl-tools/examples/edgeai-benchmark/edgeai_benchmark/interfaces/run_package.py", line 271, in package_artifacts
          with open(os.path.join(out_dir,'artifacts.yaml'), 'w') as fp:
      FileNotFoundError: [Errno 2] No such file or directory: '/home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/compilation/AM62A/pkg/artifacts.yaml'
      

    • Thank you for supplying this. What is the commit ID for mmyolo repo? It is supposed to be 8c4d9dc503dc8e327bec8147e8dc97124052f693 to match issue-7.  I agree that the dependencies as you've shown them are incompatible. There may be other 

      Using the mmdetection from our edgeai-modelmaker is a different topic, and I would recommend looking at this in a separate thread. It is difficult to track multiple topics that are so similar in the same thread, and I am concerned it will cause additional confusion. Can you make a separate E2E thread for training/compiling the yolox model?

      • looks like training finished and compilation failed to start
      • In the new thread, please show the configuration file passed to model maker and show the directory structure/tree for that  /home/zxb/Desktop/ti/edgeai-modelmaker/data/projects/tiscapes2017_driving/run/20241203-142720/yolox_nano_lite/
        • Note that the date string will change if you rerun this. I am interested in what is under the training and compilation directories.
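
If the tree command is not available, a few lines of Python can produce the requested listing. This is a minimal sketch: format_tree is a made-up helper, and the temporary directory only stands in for the real run directory, which you would pass in instead.

```python
import os
import tempfile

def format_tree(root, max_depth=3, _depth=0):
    """Return an indented listing of `root`, limited to `max_depth` levels."""
    lines = []
    if _depth == 0:
        lines.append(os.path.basename(os.path.normpath(root)) + "/")
    if _depth >= max_depth:
        return lines
    indent = "  " * (_depth + 1)
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isdir(path):
            lines.append(f"{indent}{name}/")
            lines.extend(format_tree(path, max_depth, _depth + 1))
        else:
            lines.append(f"{indent}{name}")
    return lines

# Stand-in for the real run directory, built only for this demonstration.
with tempfile.TemporaryDirectory() as run_dir:
    os.makedirs(os.path.join(run_dir, "training"))
    os.makedirs(os.path.join(run_dir, "compilation", "AM62A", "work"))
    open(os.path.join(run_dir, "training", "model.onnx"), "w").close()
    tree_lines = format_tree(run_dir)
    print("\n".join(tree_lines))
```

Raising max_depth shows deeper levels, for example the contents of the work/od-8200 directory.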

      BR,
      Reese

    • Hello,

      These are different pyenv environments and different projects with different problems. We are still in the evaluation stage and are trying to deploy DMS/OMS with our v8 models on the AM62A, which will be used in cars. So we made two plans.

      Plan A is to take our v8 model and convert it to an mmyolo v8-lite model, but because of the environment problems above we failed to get the ONNX and prototxt files needed for compilation.
      mmyolo:commit 8c4d9dc503dc8e327bec8147e8dc97124052f693 (HEAD, tag: v0.6.0, origin/main, origin/HEAD, main)

      mmdetection:commit 7146beba638862b5df251bc248524cfcd8338c9b (HEAD -> r9.1, origin/r9.1, origin/main, origin/HEAD)

      edgeai-modelmaker: commit b4e9b7fda9b9d3ee0f8190bdb49792e001d60e8f (HEAD -> main, origin/r9.1, origin/main, origin/HEAD)

      edgeai-tidl-tools: commit efae61031b31aa2ba5491c03bb808216d3baef14 (HEAD -> master, tag: 10_00_08_00, origin/rel_10_00, origin/master, origin/HEAD)

      Plan B is to use modelmaker with the yolox example, but it failed with the errors in the log above. In this case we only changed training_epochs to 1 in the YAML and ran ./run_modelmaker.sh AM62A ./config_detection.yaml

      config_detection.yaml :

      ====================================================================================
      common:
          target_module: 'vision'
          task_type: 'detection'
          target_device: 'TDA4VM'
          # run_name can be any string, but there are some special cases:
          # {date-time} will be replaced with datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
          # {model_name} will be replaced with the name of the model
          run_name: '{date-time}/{model_name}'

      dataset:
          # enable/disable dataset loading
          enable: True #False
          # max_num_files: [750, 250] #None

          # Object Detection Dataset Examples:
          # -------------------------------------
          # Example 1, (known datasets): 'widerface_detection', 'pascal_voc0712', 'coco_detection', 'udacity_selfdriving', 'tomato_detection', 'tiscapes2017_driving'
          # dataset_name: widerface_detection
          # -------------------------------------
          # Example 2, give a dataset name and input_data_path.
          # input_data_path could be a path to zip file, tar file, folder OR http, https link to zip or tar files
          # for input_data_path these are provided with this repository as examples:
          # 'software-dl.ti.com/.../tiscapes2017_driving.zip'
          # 'software-dl.ti.com/.../animal_detection.zip'
          # -------------------------------------
          # Example 3, give image folders with annotation files (require list with values for both train and val splits)
          # dataset_name: coco_detection
          # input_data_path: ["./data/projects/coco_detection/dataset/train2017",
          # "./data/projects/coco_detection/dataset/val2017"]
          # input_annotation_path: ["./data/projects/coco_detection/dataset/annotations/instances_train2017.json",
          # "./data/projects/coco_detection/dataset/annotations/instances_val2017.json"]
          # -------------------------------------
          dataset_name: tiscapes2017_driving
          input_data_path: 'software-dl.ti.com/.../tiscapes2017_driving.zip'

      training:
          # enable/disable training
          enable: True #False

          # Object Detection model chosen can be changed here if needed
          # options are: 'yolox_s_lite', 'yolox_tiny_lite', 'yolox_nano_lite', 'yolox_pico_lite', 'yolox_femto_lite'
          model_name: 'yolox_nano_lite'

          training_epochs: 1 #30
          # batch_size: 8 #32
          # learning_rate: 0.005
          # num_gpus: 0 #1 #4

      compilation:
          # enable/disable compilation
          enable: True #False
          # tensor_bits: 8 #16 #32

      =====================================================================================

      2818.modelmaker-trainlog.txt

    • Hello,

      Understood, option A would be preferred, I presume. 

      mmyolo:commit 8c4d9dc503dc8e327bec8147e8dc97124052f693 (HEAD, tag: v0.6.0, origin/main, origin/HEAD, main)

      mmdetection:commit 7146beba638862b5df251bc248524cfcd8338c9b (HEAD -> r9.1, origin/r9.1, origin/main, origin/HEAD)

      edgeai-modelmaker: commit b4e9b7fda9b9d3ee0f8190bdb49792e001d60e8f (HEAD -> main, origin/r9.1, origin/main, origin/HEAD)

      edgeai-tidl-tools: commit efae61031b31aa2ba5491c03bb808216d3baef14 (HEAD -> master, tag: 10_00_08_00, origin/rel_10_00, origin/master, origin/HEAD)

      Thank you for including this; it is helpful.

      The edgeai-modelmaker repository has been moved into the edgeai-tensorlab repo. Please use the edgeai-tensorlab repository instead of edgeai-modelmaker, since the latter has been archived. The same is true of edgeai-mmdetection. Please set up the edgeai-tensorlab repo to replace these; I would recommend using a virtual environment to keep dependencies separate. I believe you have the archived versions of the modelmaker and mmdetection repos installed. Edgeai-tensorlab includes versions of those two archived repos that have been version-controlled against each other.

      Your mmyolo version should be okay. When you set up edgeai-tensorlab, I would recommend using the r9.2 branch/tag. From git blame, I see that r10.0 introduces this versioning issue within edgeai-tensorlab/edgeai-modeloptimization, whereas r9.2 does not. The following dependencies are expected to work for this r9.2 branch + upstream mmyolo:

      torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
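
To see how far an existing environment is from these pins, the pin line can be parsed and compared with what is installed. The sketch below is self-contained: the helper names are made up, and the installed versions are copied from the pip list posted earlier in this thread. Local suffixes such as +cu117 are ignored in the comparison, which is a simplifying assumption.

```python
def parse_pins(pin_line):
    """Parse whitespace-separated 'name==version' tokens into a dict."""
    return dict(token.split("==", 1) for token in pin_line.split())

def base_version(version):
    """Drop a local suffix such as '+cu117'; None stays None."""
    return version.split("+")[0] if version else version

def find_mismatches(expected, installed):
    """Return {name: (expected, installed)} for every pin that is not met."""
    return {
        name: (version, installed.get(name))
        for name, version in expected.items()
        if base_version(installed.get(name)) != version
    }

expected = parse_pins("torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2")
# Versions copied from the pip list earlier in this thread.
installed = {
    "torch": "1.13.0+cu117",
    "torchvision": "0.14.0+cu117",
    "torchaudio": "0.13.0+cu117",
}
mismatches = find_mismatches(expected, installed)
for name, (want, have) in mismatches.items():
    print(f"{name}: expected {want}, found {have}")
```

An empty result would mean the environment already matches the suggested pins.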

      Regarding option B

      In your screenshot, I see that a model.onnx and model.prototxt are present. Do these files look normal, e.g. can you load .ONNX in Netron and does prototxt look similar to the one here? 

      Compilation may have failed, but training completed and should have produced the trained model. The error above about failing to open artifacts.yaml is most likely because the directory in that path was never created (MODEL_DIR/compilation/AM62A/pkg). Is the 'artifacts' directory under compilation/AM62A/work/od-8200/artifacts empty?
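
Both questions can be answered with a short script. The sketch below assumes the layout seen in the log (compilation/AM62A/pkg and compilation/AM62A/work/od-8200) plus an artifacts subdirectory. The helper name is made up, and the temporary directory stands in for the real compilation/AM62A directory.

```python
import os
import tempfile

def inspect_compilation_dir(compilation_dir, model_id="od-8200"):
    """Report whether pkg/ exists and what the artifacts directory contains."""
    pkg_dir = os.path.join(compilation_dir, "pkg")
    artifacts_dir = os.path.join(compilation_dir, "work", model_id, "artifacts")
    has_artifacts_dir = os.path.isdir(artifacts_dir)
    return {
        "pkg_dir_exists": os.path.isdir(pkg_dir),
        "artifacts_dir_exists": has_artifacts_dir,
        "artifact_files": sorted(os.listdir(artifacts_dir)) if has_artifacts_dir else [],
    }

# Stand-in for .../yolox_nano_lite/compilation/AM62A, built for demonstration.
with tempfile.TemporaryDirectory() as demo_dir:
    os.makedirs(os.path.join(demo_dir, "work", "od-8200", "artifacts"))
    status = inspect_compilation_dir(demo_dir)
    print(status)
```

An existing but empty artifacts directory together with a missing pkg directory would point at compilation never producing output, rather than at the packaging step itself.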