
TDA4VM: TDA4 - TIDL segmentation training error

Part Number: TDA4VM

https://git.ti.com/cgit/jacinto-ai/pytorch-jacinto-ai-devkit/about/docs/Semantic_Segmentation.md

Hi Champs:

I'm studying the link above.

I'm running this command:

python3 ./scripts/train_segmentation_main.py --model_name deeplabv3lite_mobilenetv2_tv --dataset_name cityscapes_segmentation --data_path ./data/datasets/cityscapes/data --img_resize 384 768 --output_size 1024 2048 --pretrained download.pytorch.org/.../mobilenet_v2-b0353104.pth --gpus 0 1

My Ubuntu 18.04 machine gives me the error below. How can I fix it?

=> will save everything to ./data/checkpoints/cityscapes_segmentation/2021-04-07_21-26-32_cityscapes_segmentation_deeplabv3lite_mobilenetv2_tv_resize768x384_traincrop768x384/training
=> fetching images in './data/datasets/cityscapes/data'
Traceback (most recent call last):
  File "./scripts/train_segmentation_main.py", line 163, in <module>
    train_pixel2pixel.main(args)
  File "/opt/TDA4_Jacinto_AI_devkit_stuffs/pytorch-jacinto-ai-devkit/modules/pytorch_jacinto_ai/engine/train_pixel2pixel.py", line 279, in main
    train_dataset, val_dataset = xvision.datasets.pixel2pixel.__dict__[args.dataset_name](args.dataset_config, args.data_path, split=split_arg, transforms=transforms)
  File "/opt/TDA4_Jacinto_AI_devkit_stuffs/pytorch-jacinto-ai-devkit/modules/pytorch_jacinto_ai/xvision/datasets/pixel2pixel/cityscapes_plus.py", line 431, in cityscapes_segmentation
    *args, **kwargs)
  File "/opt/TDA4_Jacinto_AI_devkit_stuffs/pytorch-jacinto-ai-devkit/modules/pytorch_jacinto_ai/xvision/datasets/pixel2pixel/cityscapes_plus.py", line 234, in __init__
    raise Exception("> No files for split=[%s] found in %s" % (split, self.segmentation_base))
Exception: > No files for split=[train] found in ./data/datasets/cityscapes/data/gtFine/train

PS: I'm using Python 3 to run this command.

BR Rio

  • Hi Rio,

    The Cityscapes dataset cannot be downloaded automatically. You have to download it manually and make it available at the location mentioned in the error message; a quick sanity check of the expected layout is sketched below.

    Best regards,

    Manu
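
    For reference: the Cityscapes download page provides gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip, and after extracting both under the folder passed to --data_path you should end up with gtFine/train, gtFine/val, leftImg8bit/train and leftImg8bit/val containing per-city PNGs. The snippet below is only a sketch (not devkit code) that assumes this standard layout and the path used in this thread; it just counts the files the training script will look for:

    import glob, os

    # Same value passed to --data_path in the training command above.
    data_path = "./data/datasets/cityscapes/data"
    for split in ("train", "val"):
        # Label maps and RGB images live in per-city subfolders under each split.
        labels = glob.glob(os.path.join(data_path, "gtFine", split, "*", "*_gtFine_labelIds.png"))
        images = glob.glob(os.path.join(data_path, "leftImg8bit", split, "*", "*_leftImg8bit.png"))
        print(split, "labels:", len(labels), "images:", len(images))
    # Zero counts mean the packages were not extracted where the error message expects them.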

  • Hi Manu:

    Thanks for your help.

    I have downloaded the Cityscapes dataset manually.

    Now I'm running this command and getting the error log below:

    python3 ./scripts/train_segmentation_main.py --dataset_name cityscapes_segmentation --model_name fpnlite_pixel2pixel_aspp_regnetx800mf --data_path ./data/datasets/cityscapes/data --img_resize 384 768 --output_size 1024 2048 --gpus 0 1 --pretrained https://dl.fbaipublicfiles.com/pycls/dds_baselines/160906036/RegNetX-800MF_dds_8gpu.pyth

    Could you guide me on how to solve it?

    Thanks.

    BR Rio

    /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:2973: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
    "See the documentation of nn.Upsample for details.".format(mode))
    => Size = [384, 768], GFLOPs = 14.572910592, GMACs = 7.286455296
    /usr/local/lib/python3.7/dist-packages/torch/onnx/symbolic_helper.py:243: UserWarning: You are trying to export the model with onnx:Upsample for ONNX opset version 9. This operator might cause results to not match the expected results by PyTorch.
    ONNX's Upsample/Resize operator did not match Pytorch's Interpolation until opset 11. Attributes to determine how to transform the input were added in onnx:Resize in opset 11 to support Pytorch's behavior (like coordinate_transformation_mode and nearest_mode).
    We recommend using opset 11 and above for models using this operator.
    "" + str(_export_onnx_opset_version) + ". "
    Traceback (most recent call last):
      File "./scripts/train_segmentation_main.py", line 163, in <module>
        train_pixel2pixel.main(args)
      File "/opt/TDA4_Jacinto_AI_devkit_stuffs/pytorch-jacinto-ai-devkit/modules/pytorch_jacinto_ai/engine/train_pixel2pixel.py", line 420, in main
        model = model.cuda()
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 307, in cuda
        return self._apply(lambda t: t.cuda(device))
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 203, in _apply
        module._apply(fn)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 203, in _apply
        module._apply(fn)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 203, in _apply
        module._apply(fn)
      [Previous line repeated 2 more times]
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 225, in _apply
        param_applied = fn(param)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 307, in <lambda>
        return self._apply(lambda t: t.cuda(device))
      File "/usr/local/lib/python3.7/dist-packages/torch/cuda/__init__.py", line 149, in _lazy_init
        _check_driver()
      File "/usr/local/lib/python3.7/dist-packages/torch/cuda/__init__.py", line 47, in _check_driver
        raise AssertionError("Torch not compiled with CUDA enabled")
    AssertionError: Torch not compiled with CUDA enabled

  • That's because you have installed the CPU-only version of PyTorch. Please install the GPU (CUDA) build of PyTorch and run on a machine with GPUs. A quick way to verify which build is installed is sketched below.
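
    As a quick check (a sketch, not devkit code), the snippet below prints whether the installed PyTorch build was compiled with CUDA and how many GPUs it can see; on a CPU-only build torch.cuda.is_available() returns False and torch.version.cuda is None:

    import torch

    # A "+cpu" suffix in the version string or "CUDA available: False"
    # indicates a CPU-only install of PyTorch.
    print("PyTorch version:", torch.__version__)
    print("CUDA build     :", torch.version.cuda)        # None for CPU-only builds
    print("CUDA available :", torch.cuda.is_available())
    if torch.cuda.is_available():
        # These are the devices addressed by --gpus 0 1 in the training command.
        print("GPU count      :", torch.cuda.device_count())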