Hi
I have been performing PTQ (mixed-precision mode) on a trained model using the TiDL framework, but the calibration time and effort required to reach the desired accuracy are too high.
So I am now looking at QAT in edgeai-torchvision. I want to use mixed precision to minimize the loss of information.
For example, I want most layers quantized to 8-bit, and only the last layers or a few specific layers quantized to 16-bit (the depth-estimation, anchor-free detection, or segmentation heads).
I found the bitwidth settings in QuantTrainModule. Can I set mixed precision (8-bit and 16-bit) in QAT as below?
I am working with a custom multi-task model:
backbone, neck: 8-bit quantization (weights, activations)
last convolution layers in the detection head: 16-bit quantization (weights, activations)
last convolution layers in the segmentation/depth heads: 16-bit quantization (weights, activations)
if 'training' in args.phase:
    model = xnn.quantize.QuantTrainModule(
        model,
        per_channel_q=args.per_channel_q,
        histogram_range=args.histogram_range,
        bitwidth_weights=args.bitwidth_weights,
        bitwidth_activations=args.bitwidth_activations,
        constrain_bias=args.constrain_bias,
        dummy_input=dummy_input)
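To make the intended mapping concrete, here is a small standalone sketch of the per-layer bitwidth plan I have in mind. This is plain Python for illustration only, not an edgeai-torchvision API; the layer names and the override table are my own assumptions:

```python
# Default everything to 8-bit; override the last convolutions of each
# task head to 16-bit. Layer names below are hypothetical examples.
DEFAULT_BITWIDTH = 8
OVERRIDES = {
    'detection_head.final_conv': 16,     # assumed layer name
    'segmentation_head.final_conv': 16,  # assumed layer name
    'depth_head.final_conv': 16,         # assumed layer name
}

def bitwidth_for(layer_name):
    """Desired weight/activation bitwidth for a given layer."""
    return OVERRIDES.get(layer_name, DEFAULT_BITWIDTH)

# Example: build the plan over some (assumed) module names.
layers = ['backbone.conv1', 'neck.conv',
          'detection_head.final_conv', 'segmentation_head.final_conv']
plan = {name: bitwidth_for(name) for name in layers}
```

Is there a supported way to express something like this per-layer table through QuantTrainModule, or are bitwidth_weights / bitwidth_activations always applied globally to the whole model?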
Thanks.