
TMS320F28P550SG: Quantization destroys model performance

Part Number: TMS320F28P550SG


Hello,

I have a pre-trained model that I want to use on NPU.

The issue is that when I quantize it with TINPUTinyMLQATFxModule there is a big drop in performance (the loss goes from 0.25 to 0.34), whereas this does not happen when I quantize it with torchao (the loss goes from 0.25 to 0.26).

You can find the quantization scripts as well as the quantization logs in the attached zip: run_qat_ti.zip
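
In short, the TI-side script follows the usual wrap, fine-tune, convert QAT flow, roughly like this (simplified sketch; the import path and keyword arguments are written from memory and may not match the attached scripts exactly):

from tinyml_torchmodelopt.quantization import TINPUTinyMLQATFxModule  # assumed import path

model = SimpleCNN(input_size, output_size)  # pre-trained float model (class shown below)

# Wrap the float model so fake-quantization ops are inserted via torch.fx
qat_model = TINPUTinyMLQATFxModule(model, total_epochs=num_epochs)  # total_epochs kwarg is an assumption

# ... fine-tune qat_model with the normal training loop for a few epochs ...

# Freeze the observers and convert to the final quantized representation before export
qat_model = qat_model.convert()  # convert() step as in the torchmodelopt QAT examples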

I am using a very simple model:

import torch
import torch.nn as nn


class SimpleCNN(nn.Module):
    def __init__(self, input_size, output_size):
        super().__init__()

        self.cnn_backbone = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=8, kernel_size=(3, 1), padding=(1, 0), stride=(2, 1)),
            nn.BatchNorm2d(num_features=8),
            nn.ReLU(),

            nn.Conv2d(in_channels=8, out_channels=16, kernel_size=(3, 1), padding=(1, 0), stride=(2, 1)),
            nn.BatchNorm2d(num_features=16),
            nn.ReLU(),

            nn.Conv2d(in_channels=16, out_channels=32, kernel_size=(3, 1), padding=(1, 0), stride=(1, 1)),
            nn.BatchNorm2d(num_features=32),
            nn.ReLU()
        )

        self.flatten = nn.Flatten()

        # Infer the flattened feature size with a dummy forward pass
        dummy_input = torch.randn(1, 1, input_size, 1)
        conv_output_size = self.cnn_backbone(dummy_input).view(1, -1).size(1)

        # MLP head
        self.mlp_head = nn.Sequential(
            nn.Linear(conv_output_size, 64),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(64, output_size)
        )

    def forward(self, x):
        x = self.cnn_backbone(x)
        x = self.flatten(x)
        return self.mlp_head(x)

Some models that had a lower float loss showed an even bigger performance drop when quantized with TI's library. Do you have an idea why this happens?
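
For reference, the before/after loss numbers I quote come from a plain evaluation loop like the one below (simplified sketch; the dataloader, criterion and model variable names are placeholders, not the exact ones from the attached scripts):

import torch

@torch.no_grad()
def eval_loss(model, dataloader, criterion):
    # Average loss over the validation set, weighted by batch size
    model.eval()
    total, n = 0.0, 0
    for x, y in dataloader:
        out = model(x)
        total += criterion(out, y).item() * x.size(0)
        n += x.size(0)
    return total / n

float_loss = eval_loss(float_model, val_loader, criterion)      # ~0.25 before quantization
ti_loss = eval_loss(ti_qat_model, val_loader, criterion)        # ~0.34 with TINPUTinyMLQATFxModule
torchao_loss = eval_loss(torchao_model, val_loader, criterion)  # ~0.26 with torchao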

Best,

Gal Pascual