
CCS/TDA4VM: The model gets a poor result after QAT

Part Number: TDA4VM

Tool/software: Code Composer Studio

Dear Sir,

I am using Quantization Aware Training (QAT), incorporating it into an existing PyTorch training script. I followed the example provided on the official website step by step:

#********************************program start*****************************

import os
import torch

from pytorch_jacinto_ai import xnn

# create your model here:
model = build_detector()

# create a dummy input - this is required to analyze the model - fill in the input image size expected by your model.
dummy_input = torch.rand((1,3,384,768))

# wrap your model in xnn.quantize.QuantTrainModule. 
# once it is wrapped, the actual model is in model.module
model = xnn.quantize.QuantTrainModule(model, dummy_input=dummy_input)

# load your pretrained weights here into model.module
pretrained_data = torch.load(pretrained_path)
model.module.load_state_dict(pretrained_data)

# your training loop here with loss, backward, optimizer and scheduler.
# this is the usual training loop - but use a lower learning rate such as 5e-5
....
....

# save the model - the trained module is in model.module
torch.save(model.module.state_dict(), os.path.join(save_path,'model.pth'))
torch.onnx.export(model.module, dummy_input, os.path.join(save_path,'model.onnx'), export_params=True, verbose=False)

#********************************program end*****************************
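For reference, the elided training loop can be sketched as below. This is a minimal, hypothetical stand-in: the stand-in model, random inputs, and MSE loss are placeholders for the real detector, dataset, and detection loss; only the recommended settings (low learning rate of 5e-5, a scheduler, 25 epochs) follow the TI guidance.

```python
import torch

# stand-in model purely for illustration - the real code trains model.module
# returned by xnn.quantize.QuantTrainModule
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3, padding=1),
                            torch.nn.ReLU(),
                            torch.nn.AdaptiveAvgPool2d(1),
                            torch.nn.Flatten(),
                            torch.nn.Linear(8, 4))

# low learning rate as recommended for QAT fine-tuning
optimizer = torch.optim.SGD(model.parameters(), lr=5e-5, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=25)

for epoch in range(2):                    # 25 epochs in the actual run
    x = torch.rand(2, 3, 96, 192)         # stand-in for real training images
    target = torch.rand(2, 4)             # stand-in for real targets
    loss = torch.nn.functional.mse_loss(model(x), target)  # stand-in loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```

The input size is reduced here only to keep the sketch cheap to run; the actual model expects (1, 3, 384, 768) as in the dummy_input above.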

The learning rate is 5e-5 and the number of epochs is 25. During QAT, the loss is very large and does not converge. After completing the above process, I obtained a new model.

However, when I use the new model from QAT to do inference, the results are very poor.

Before QAT, I get a detection result as follows:

After QAT, the detection result is as follows:

The results are consistent on both the TDA4VM and the PC.

No other code has been modified. Why has the accuracy of the QAT-trained model declined so much?