PROCESSOR-SDK-AM68A: Issues when using QATv2 in PSDK9.2

Youngheon Ro

Hi,

While working with QATv2, we ran into two issues:

(1) The init() function within QATv2 needs add_methods=True, but it is implemented with a value of False, which causes problems. (first image)

(2) After manually changing the value to True in the TI code, a second error occurred. This part involves the detailed implementation of layer QAT, which is difficult for us to debug, so we have not been able to proceed. (second image)

First image:

Second image:

Thank you,

Youngheon

over 1 year ago

0 Fabiana Jaimes over 1 year ago

TI__Mastermind 19670 points

Hi Youngheon,

Due to a US holiday, there will be a delay in my response. Thanks for understanding.

-Fabiana

0 Parakh Agarwal over 1 year ago

TI__Prodigy 40 points

We need some information from you to address this issue:

Which git branch of the edgeai-modeloptimization tool are you using?
Which repository and model are you quantizing?
What training technique / script reproduces the issue?

0 hyuntak lim over 1 year ago

Prodigy 30 points

Hi. I'm going to continue the question on behalf of Youngheon.

1. The branch is the latest. I used this command to install the package:

pip3 install "git+https://github.com/TexasInstruments/edgeai-tensorlab.git#subdirectory=edgeai-modeloptimization/torchmodelopt"

2. The model is a simple convolutional network, attached below.

import torch.nn as nn

class testnet(nn.Module):
    def __init__(self):
        super().__init__()
        # Backbone: five stride-2 conv + BN + ReLU stages
        self.conv0 = nn.Conv2d(3, 32, 3, padding=1, stride=2)
        self.bn0 = nn.BatchNorm2d(32)
        self.relu0 = nn.ReLU()

        self.conv1 = nn.Conv2d(32, 32, 3, padding=1, stride=2)
        self.bn1 = nn.BatchNorm2d(32)
        self.relu1 = nn.ReLU()

        self.conv2 = nn.Conv2d(32, 32, 3, padding=1, stride=2)
        self.bn2 = nn.BatchNorm2d(32)
        self.relu2 = nn.ReLU()

        self.conv3 = nn.Conv2d(32, 32, 3, padding=1, stride=2)
        self.bn3 = nn.BatchNorm2d(32)
        self.relu3 = nn.ReLU()

        self.conv4 = nn.Conv2d(32, 32, 3, padding=1, stride=2)
        self.bn4 = nn.BatchNorm2d(32)
        self.relu4 = nn.ReLU()

        # Three heads (2, 3 and 6 channels), each ending in global average pooling
        self.conv5 = nn.Conv2d(32, 2, 3, padding=1, stride=2)
        self.bn5 = nn.BatchNorm2d(2)
        self.relu5 = nn.ReLU()
        self.avg_pool5 = nn.AdaptiveAvgPool2d(1)

        self.conv6 = nn.Conv2d(32, 3, 3, padding=1, stride=2)
        self.bn6 = nn.BatchNorm2d(3)
        self.relu6 = nn.ReLU()
        self.avg_pool6 = nn.AdaptiveAvgPool2d(1)

        self.conv7 = nn.Conv2d(32, 6, 3, padding=1, stride=2)
        self.bn7 = nn.BatchNorm2d(6)
        self.relu7 = nn.ReLU()
        self.avg_pool7 = nn.AdaptiveAvgPool2d(1)

    # forward() must be a method of the class, not nested inside __init__()
    def forward(self, x):
        x = self.conv0(x)
        x = self.bn0(x)
        x = self.relu0(x)

        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu1(x)

        x = self.conv2(x)
        x = self.bn2(x)
        x = self.relu2(x)

        x = self.conv3(x)
        x = self.bn3(x)
        x = self.relu3(x)

        x = self.conv4(x)
        x = self.bn4(x)
        x1 = self.relu4(x)

        # As posted, output1 is the bn4 output (before relu4); the heads use x1
        output1 = x

        x = self.conv5(x1)
        x = self.bn5(x)
        x = self.relu5(x)
        x = self.avg_pool5(x)

        output2 = x

        x = self.conv6(x1)
        x = self.bn6(x)
        x = self.relu6(x)
        x = self.avg_pool6(x)

        output3 = x

        x = self.conv7(x1)
        x = self.bn7(x)
        x = self.relu7(x)
        x = self.avg_pool7(x)

        output4 = x

        return output1, output2, output3, output4
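
For reference, the spatial sizes of the four outputs can be worked out with the standard Conv2d output-size formula (dilation 1). The following is only a small sketch in plain Python: the 224x224 input size is an assumed example, and `conv_out` and `backbone_size` are hypothetical helper names, not part of PyTorch or the TI tools.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of one spatial dimension after a Conv2d layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def backbone_size(size, num_layers=5):
    """Apply conv0..conv4 (all 3x3, stride 2, padding 1) to one spatial dimension."""
    for _ in range(num_layers):
        size = conv_out(size)
    return size


feat = backbone_size(224)   # spatial size of output1
head = conv_out(feat)       # spatial size inside each head, before pooling
print(feat, head)
```

Under these assumptions output1 has shape (N, 32, feat, feat), while the three heads are reduced to (N, 2, 1, 1), (N, 3, 1, 1) and (N, 6, 1, 1) by AdaptiveAvgPool2d(1).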

3. I converted the model with v2.QATFxModule (as described in https://github.com/TexasInstruments/edgeai-tensorlab/blob/main/edgeai-modeloptimization/torchmodelopt/edgeai_torchmodelopt/xmodelopt/quantization/v2/docs/qat.md) and, after wrapping the model, tried to test it.

model = testnet()
model = edgeai_torchmodelopt.xmodelopt.quantization.v2.QATFxModule(model, total_epochs=100)
model.eval()

test_model(model, test_dataset)

0 Parakh Agarwal over 1 year ago

TI__Prodigy 40 points

Hi,

I am sorry for the late response. The add_methods issue will be fixed in the next push; thanks for letting us know about it. As for the second part of the question:

The quantization parameters for QAT need to be updated, and they are updated during training. Without that step, you will not be able to use those parameters. As an example for this case, the script below should work:

import torch
import torch.nn as nn
import edgeai_torchmodelopt

model = testnet()
model = edgeai_torchmodelopt.xmodelopt.quantization.v2.QATFxModule(model, total_epochs=100)
model.train()

loss_fn = nn.CrossEntropyLoss()  # placeholder loss; replace with your own loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Run a few training iterations on dummy data so the quantization parameters get updated
for i in range(100):
    inp, label = torch.rand((1, 3, 224, 224)), torch.ones(1, 6, 1, 1)
    optimizer.zero_grad()
    output = model(inp)
    loss = loss_fn(output[3], label)
    loss.backward()
    optimizer.step()

model.eval()

print(model(torch.rand((1, 3, 224, 224))))
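
To illustrate why the model has to see data in train() mode before eval() is called, here is a toy sketch in plain Python of a range observer that only learns its quantization range while training. It shows the general idea only and is not the edgeai-torchmodelopt implementation: the class `MovingAverageRange`, its momentum value, and the error it raises are all assumptions made for this example.

```python
class MovingAverageRange:
    """Toy stand-in for a QAT observer: the range is updated only in training mode."""

    def __init__(self, momentum=0.1):
        self.momentum = momentum
        self.min_val = None   # unset until the first training batch is seen
        self.max_val = None
        self.training = True

    def observe(self, values):
        """Update the tracked range with a moving average (training mode only)."""
        if not self.training:
            return
        lo, hi = min(values), max(values)
        if self.min_val is None:
            self.min_val, self.max_val = lo, hi
        else:
            self.min_val += self.momentum * (lo - self.min_val)
            self.max_val += self.momentum * (hi - self.max_val)

    def fake_quantize(self, values, num_bits=8):
        """Quantize and dequantize values using the currently tracked range."""
        self.observe(values)
        if self.min_val is None:
            raise RuntimeError("range not calibrated: run a training step first")
        qmax = 2 ** num_bits - 1
        scale = max(self.max_val - self.min_val, 1e-8) / qmax
        out = []
        for v in values:
            q = min(max(round((v - self.min_val) / scale), 0), qmax)  # clamp to the grid
            out.append(self.min_val + q * scale)
        return out


obs = MovingAverageRange()
obs.training = False                      # like calling eval() right after wrapping
try:
    obs.fake_quantize([0.5])
except RuntimeError as err:
    print("eval before training:", err)

obs.training = True                       # like model.train() plus one batch
obs.fake_quantize([0.0, 1.0, 2.0])
obs.training = False                      # like model.eval(): the range is now frozen
print(obs.min_val, obs.max_val)
```

In this toy version, calling the quantizer in eval mode before any training step fails because no range has been recorded yet, which mirrors the train-first, eval-afterwards ordering of the script above.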
