
TDA4VM: TDA4 - TIDL segmentation training error

Part Number: TDA4VM

https://git.ti.com/cgit/jacinto-ai/pytorch-jacinto-ai-devkit/about/docs/Semantic_Segmentation.md

Hi Champs:

I'm studying the link above.

I'm running this command:

python3 ./scripts/train_segmentation_main.py --model_name deeplabv3lite_mobilenetv2_tv --dataset_name cityscapes_segmentation --data_path ./data/datasets/cityscapes/data --img_resize 384 768 --output_size 1024 2048 --pretrained download.pytorch.org/.../mobilenet_v2-b0353104.pth --gpus 0 1

My Ubuntu 18.04 machine gives me the error below. How can I fix it?

=> will save everything to ./data/checkpoints/cityscapes_segmentation/2021-04-07_21-26-32_cityscapes_segmentation_deeplabv3lite_mobilenetv2_tv_resize768x384_traincrop768x384/training
=> fetching images in './data/datasets/cityscapes/data'
Traceback (most recent call last):
  File "./scripts/train_segmentation_main.py", line 163, in <module>
    train_pixel2pixel.main(args)
  File "/opt/TDA4_Jacinto_AI_devkit_stuffs/pytorch-jacinto-ai-devkit/modules/pytorch_jacinto_ai/engine/train_pixel2pixel.py", line 279, in main
    train_dataset, val_dataset = xvision.datasets.pixel2pixel.__dict__[args.dataset_name](args.dataset_config, args.data_path, split=split_arg, transforms=transforms)
  File "/opt/TDA4_Jacinto_AI_devkit_stuffs/pytorch-jacinto-ai-devkit/modules/pytorch_jacinto_ai/xvision/datasets/pixel2pixel/cityscapes_plus.py", line 431, in cityscapes_segmentation
    *args, **kwargs)
  File "/opt/TDA4_Jacinto_AI_devkit_stuffs/pytorch-jacinto-ai-devkit/modules/pytorch_jacinto_ai/xvision/datasets/pixel2pixel/cityscapes_plus.py", line 234, in __init__
    raise Exception("> No files for split=[%s] found in %s" % (split, self.segmentation_base))
Exception: > No files for split=[train] found in ./data/datasets/cityscapes/data/gtFine/train

PS: I'm using Python 3 to run this command.

BR Rio

  • Hi Rio,

    The Cityscapes dataset cannot be downloaded automatically. You have to download it manually and make it available at the location mentioned in the error message; a quick sanity check of the expected layout is sketched below.

    Best regards,

    Manu
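
    For reference: the Cityscapes download page provides gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip, and after extracting both under the folder passed to --data_path you should end up with gtFine/train, gtFine/val, leftImg8bit/train and leftImg8bit/val containing per-city PNGs. The snippet below is only a sketch (not devkit code) that assumes this standard layout and the path used in this thread; it just counts the files the training script will look for:

    import glob, os

    # Same value passed to --data_path in the training command above.
    data_path = "./data/datasets/cityscapes/data"
    for split in ("train", "val"):
        # Label maps and RGB images live in per-city subfolders under each split.
        labels = glob.glob(os.path.join(data_path, "gtFine", split, "*", "*_gtFine_labelIds.png"))
        images = glob.glob(os.path.join(data_path, "leftImg8bit", split, "*", "*_leftImg8bit.png"))
        print(split, "labels:", len(labels), "images:", len(images))
    # Zero counts mean the packages were not extracted where the error message expects them.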

  • Hi Manu:

    Thanks for your help.

    I have downloaded the Cityscapes dataset manually.

    Now I'm running this command and getting the error log below:

    python3 ./scripts/train_segmentation_main.py --dataset_name cityscapes_segmentation --model_name fpnlite_pixel2pixel_aspp_regnetx800mf --data_path ./data/datasets/cityscapes/data --img_resize 384 768 --output_size 1024 2048 --gpus 0 1 --pretrained https://dl.fbaipublicfiles.com/pycls/dds_baselines/160906036/RegNetX-800MF_dds_8gpu.pyth

    Could you guide me on how to solve it?

    Thanks.

    BR Rio

    /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:2973: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
    "See the documentation of nn.Upsample for details.".format(mode))
    => Size = [384, 768], GFLOPs = 14.572910592, GMACs = 7.286455296
    /usr/local/lib/python3.7/dist-packages/torch/onnx/symbolic_helper.py:243: UserWarning: You are trying to export the model with onnx:Upsample for ONNX opset version 9. This operator might cause results to not match the expected results by PyTorch.
    ONNX's Upsample/Resize operator did not match Pytorch's Interpolation until opset 11. Attributes to determine how to transform the input were added in onnx:Resize in opset 11 to support Pytorch's behavior (like coordinate_transformation_mode and nearest_mode).
    We recommend using opset 11 and above for models using this operator.
    "" + str(_export_onnx_opset_version) + ". "
    Traceback (most recent call last):
      File "./scripts/train_segmentation_main.py", line 163, in <module>
        train_pixel2pixel.main(args)
      File "/opt/TDA4_Jacinto_AI_devkit_stuffs/pytorch-jacinto-ai-devkit/modules/pytorch_jacinto_ai/engine/train_pixel2pixel.py", line 420, in main
        model = model.cuda()
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 307, in cuda
        return self._apply(lambda t: t.cuda(device))
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 203, in _apply
        module._apply(fn)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 203, in _apply
        module._apply(fn)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 203, in _apply
        module._apply(fn)
      [Previous line repeated 2 more times]
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 225, in _apply
        param_applied = fn(param)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 307, in <lambda>
        return self._apply(lambda t: t.cuda(device))
      File "/usr/local/lib/python3.7/dist-packages/torch/cuda/__init__.py", line 149, in _lazy_init
        _check_driver()
      File "/usr/local/lib/python3.7/dist-packages/torch/cuda/__init__.py", line 47, in _check_driver
        raise AssertionError("Torch not compiled with CUDA enabled")
    AssertionError: Torch not compiled with CUDA enabled

  • That's because you have installed the CPU-only version of PyTorch. Please install the GPU (CUDA) build of PyTorch and run on a machine with GPUs. A quick way to verify which build is installed is sketched below.
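
    As a quick check (a sketch, not devkit code), the snippet below prints whether the installed PyTorch build was compiled with CUDA and how many GPUs it can see; on a CPU-only build torch.cuda.is_available() returns False and torch.version.cuda is None:

    import torch

    # A "+cpu" suffix in the version string or "CUDA available: False"
    # indicates a CPU-only install of PyTorch.
    print("PyTorch version:", torch.__version__)
    print("CUDA build     :", torch.version.cuda)        # None for CPU-only builds
    print("CUDA available :", torch.cuda.is_available())
    if torch.cuda.is_available():
        # These are the devices addressed by --gpus 0 1 in the training command.
        print("GPU count      :", torch.cuda.device_count())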