Hello,
I have a pre-trained model that I want to use on NPU.
The issue is that when quantizing it with TINPUTinyMLQATFxModule there is a big drop in performance (the loss goes from 0.25 to 0.34), whereas this does not happen when I quantize it with torchao (the loss goes from 0.25 to 0.26).
The attached zip contains the quantization scripts as well as the quantization logs: run_qat_ti.zip
I am using a very simple model:
self.cnn_backbone = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=8, kernel_size=(3, 1), padding=(1, 0), stride=(2, 1)),
    nn.BatchNorm2d(num_features=8),
    nn.ReLU(),
    nn.Conv2d(in_channels=8, out_channels=16, kernel_size=(3, 1), padding=(1, 0), stride=(2, 1)),
    nn.BatchNorm2d(num_features=16),
    nn.ReLU(),
    nn.Conv2d(in_channels=16, out_channels=32, kernel_size=(3, 1), padding=(1, 0), stride=(1, 1)),
    nn.BatchNorm2d(num_features=32),
    nn.ReLU()
)
self.flatten = nn.Flatten()
dummy_input = torch.randn(1, 1, input_size, 1)
conv_output_size = self.cnn_backbone(dummy_input).view(1, -1).size(1)

# MLP Head
self.mlp_head = nn.Sequential(
    nn.Linear(conv_output_size, 64),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(64, output_size)
)
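For context, here is a self-contained sketch of the same model put through PyTorch's stock FX-graph QAT flow (prepare_qat_fx / convert_fx), which is the kind of pipeline the TI FX wrapper builds on, as far as I understand. The input/output sizes, the "qnnpack" backend choice, and the skipped training loop are placeholders, not my real setup:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qat_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_qat_fx, convert_fx

input_size, output_size = 64, 4  # placeholder sizes, not my real config


class SmallCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn_backbone = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=(3, 1), padding=(1, 0), stride=(2, 1)),
            nn.BatchNorm2d(8), nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=(3, 1), padding=(1, 0), stride=(2, 1)),
            nn.BatchNorm2d(16), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=(3, 1), padding=(1, 0), stride=(1, 1)),
            nn.BatchNorm2d(32), nn.ReLU(),
        )
        self.flatten = nn.Flatten()
        # Infer the flattened feature size with a dummy forward pass
        dummy_input = torch.randn(1, 1, input_size, 1)
        conv_output_size = self.cnn_backbone(dummy_input).view(1, -1).size(1)
        self.mlp_head = nn.Sequential(
            nn.Linear(conv_output_size, 64), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(64, output_size),
        )

    def forward(self, x):
        return self.mlp_head(self.flatten(self.cnn_backbone(x)))


# "qnnpack" is the ARM/mobile-style backend; it is also built into x86 wheels
torch.backends.quantized.engine = "qnnpack"

model = SmallCNN().train()
example_inputs = (torch.randn(1, 1, input_size, 1),)
qconfig_mapping = get_default_qat_qconfig_mapping("qnnpack")

# Insert fake-quant observers and fuse conv+bn for QAT
prepared = prepare_qat_fx(model, qconfig_mapping, example_inputs)
# ... fine-tune `prepared` here with the usual training loop ...
prepared.eval()

# Convert to a real int8 model
quantized = convert_fx(prepared)
out = quantized(*example_inputs)
print(out.shape)  # torch.Size([1, 4])
```

With torchao I get essentially the same accuracy as float, so I suspect the difference is in how the TI wrapper configures the observers/quantizers rather than in the model itself.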
Some models that started from a lower loss showed an even bigger performance drop when quantized with TI's library. Do you have any idea why this happens?
Best,
Gal Pascual