PROCESSOR-SDK-AM68A: Issues when using QATv2 in PSDK9.2

Part Number: PROCESSOR-SDK-AM68A


Hi,

While working with QATv2, we ran into two issues:
(1) The init() function in QATv2 needs add_methods=True, but it is implemented with a value of False, which causes problems (first image).
(2) After we forcibly changed the TI code to True, a second error occurred. It involves the detailed implementation of layer-level QAT, which is difficult for us to debug, so we have not been able to proceed (second image).
First image:
Second image:
Thank you,
Youngheon
  • Hi Youngheon,

    Due to a US holiday, there will be a delay in my response. Thanks for understanding.

    -Fabiana

  • We need some information from you to address this issue:

    1. Which git branch of the edgeai-modeloptimization tool are you using?
    2. Which repository and model are you quantizing?
    3. Which training technique or script reproduces the issue?
  • Hi. I'm going to continue the question on behalf of Youngheon.

    1. The branch is the latest. I used this command to install the package:

    pip3 install "git+https://github.com/TexasInstruments/edgeai-tensorlab.git#subdirectory=edgeai-modeloptimization/torchmodelopt"

    2. The model is a simple convolutional network. The network is attached below.

     

    import torch
    import torch.nn as nn

    class testnet(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv0 = nn.Conv2d(3, 32, 3, padding=1, stride=2)
            self.bn0 = nn.BatchNorm2d(32)
            self.relu0 = nn.ReLU()
            
            self.conv1 = nn.Conv2d(32, 32, 3, padding=1, stride=2)
            self.bn1 = nn.BatchNorm2d(32)
            self.relu1 = nn.ReLU()
            
            self.conv2 = nn.Conv2d(32, 32, 3, padding=1, stride=2)
            self.bn2 = nn.BatchNorm2d(32)
            self.relu2 = nn.ReLU()
            
            self.conv3 = nn.Conv2d(32, 32, 3, padding=1, stride=2)
            self.bn3 = nn.BatchNorm2d(32)
            self.relu3 = nn.ReLU()
            
            self.conv4 = nn.Conv2d(32, 32, 3, padding=1, stride=2)
            self.bn4 = nn.BatchNorm2d(32)
            self.relu4 = nn.ReLU()
            
            self.conv5 = nn.Conv2d(32, 2, 3, padding=1, stride=2)
            self.bn5 = nn.BatchNorm2d(2)
            self.relu5 = nn.ReLU()
            self.avg_pool5 = nn.AdaptiveAvgPool2d(1)
            
            self.conv6 = nn.Conv2d(32, 3, 3, padding=1, stride=2)
            self.bn6 = nn.BatchNorm2d(3)
            self.relu6 = nn.ReLU()
            self.avg_pool6 = nn.AdaptiveAvgPool2d(1)
            
            self.conv7 = nn.Conv2d(32, 6, 3, padding=1, stride=2)
            self.bn7 = nn.BatchNorm2d(6)
            self.relu7 = nn.ReLU()
            self.avg_pool7 = nn.AdaptiveAvgPool2d(1)
            
        def forward(self, x):
            # Shared trunk: five stride-2 conv/bn/relu stages
            x = self.conv0(x)
            x = self.bn0(x)
            x = self.relu0(x)

            x = self.conv1(x)
            x = self.bn1(x)
            x = self.relu1(x)

            x = self.conv2(x)
            x = self.bn2(x)
            x = self.relu2(x)

            x = self.conv3(x)
            x = self.bn3(x)
            x = self.relu3(x)

            x = self.conv4(x)
            x = self.bn4(x)
            x1 = self.relu4(x)

            # output1 is taken after bn4, before relu4
            output1 = x

            # Three heads, each fed from the trunk output x1
            x = self.conv5(x1)
            x = self.bn5(x)
            x = self.relu5(x)
            x = self.avg_pool5(x)

            output2 = x

            x = self.conv6(x1)
            x = self.bn6(x)
            x = self.relu6(x)
            x = self.avg_pool6(x)

            output3 = x

            x = self.conv7(x1)
            x = self.bn7(x)
            x = self.relu7(x)
            x = self.avg_pool7(x)

            output4 = x

            return output1, output2, output3, output4

    3. I converted the model with v2.QATFxModule (as described in https://github.com/TexasInstruments/edgeai-tensorlab/blob/main/edgeai-modeloptimization/torchmodelopt/edgeai_torchmodelopt/xmodelopt/quantization/v2/docs/qat.md) and, after wrapping the model, I tried to test it.

    model = testnet()
    model = edgeai_torchmodelopt.xmodelopt.quantization.v2.QATFxModule(model, total_epochs=100)
    model.eval()
    
    test_model(model, test_dataset)
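
For reference, the output shapes of the testnet posted above can be worked out by hand from the standard Conv2d output-size formula, without installing anything. The sketch below assumes a 224x224 input (the size used later in this thread); the helper names `conv2d_out` and `testnet_shapes` are made up for illustration and are not part of any TI package.

```python
def conv2d_out(size, kernel=3, stride=2, padding=1, dilation=1):
    """Spatial output size of a Conv2d layer along one dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def testnet_shapes(height=224, width=224):
    """Per-sample (channels, height, width) of the four testnet outputs."""
    h, w = height, width
    # conv0..conv4: five 3x3, stride-2, padding-1 convolutions with 32 channels
    for _ in range(5):
        h, w = conv2d_out(h), conv2d_out(w)
    shapes = {"output1": (32, h, w)}
    # conv5..conv7 heads: each ends in AdaptiveAvgPool2d(1), so spatial size is 1x1
    for name, channels in (("output2", 2), ("output3", 3), ("output4", 6)):
        shapes[name] = (channels, 1, 1)
    return shapes


if __name__ == "__main__":
    for name, shape in testnet_shapes().items():
        print(name, shape)
```

With a 224x224 input the trunk halves the resolution five times (224, 112, 56, 28, 14, 7), so output1 is 32x7x7 per sample and output4 is 6x1x1.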

  • Hi,

    I am sorry for the late response. The add_methods issue will be fixed in the next push; thanks for letting us know about it. As for the second part of the question:

    The quantization parameters for QAT must be updated during training. Without that step, the parameters cannot be used. As an example for this case, the script below should work:

    model = testnet()
    model = edgeai_torchmodelopt.xmodelopt.quantization.v2.QATFxModule(model, total_epochs=100)
    model.train()
    
    loss_fn = nn.CrossEntropyLoss() # not sure about your loss function, defined a small one
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
    
    for i in range(100):
        inp, label = torch.rand((1,3,224,224)), torch.ones(1,6,1,1)
        optimizer.zero_grad()
        output = model(inp)
        loss = loss_fn(output[3], label) 
        loss.backward()
        optimizer.step()
        
    model.eval()
    
    print(model(torch.rand((1,3,224,224))))
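
To illustrate why the training step matters, here is a minimal, framework-free sketch of a range observer whose scale and zero point only exist after it has seen data. This is a generic illustration and not the edgeai-modeloptimization implementation: the class `MovingAverageMinMaxObserver`, the function `fake_quantize`, the momentum value, and the choice to raise an error before calibration are all assumptions made for the example.

```python
class MovingAverageMinMaxObserver:
    """Tracks a running min/max of observed values (hypothetical sketch)."""

    def __init__(self, momentum=0.1, quant_min=0, quant_max=255):
        self.momentum = momentum
        self.quant_min = quant_min
        self.quant_max = quant_max
        self.min_val = None  # unknown until data has been observed
        self.max_val = None

    def observe(self, values):
        """Update the running range from one batch of floats."""
        lo, hi = min(values), max(values)
        if self.min_val is None:
            self.min_val, self.max_val = lo, hi
        else:
            self.min_val += self.momentum * (lo - self.min_val)
            self.max_val += self.momentum * (hi - self.max_val)

    def qparams(self):
        """Return (scale, zero_point) for affine quantization."""
        if self.min_val is None:
            # Illustrative choice: a real framework may warn or use defaults.
            raise RuntimeError("observer has not seen any data yet")
        lo = min(self.min_val, 0.0)  # keep 0.0 exactly representable
        hi = max(self.max_val, 0.0)
        scale = (hi - lo) / (self.quant_max - self.quant_min) or 1.0
        zero_point = self.quant_min - round(lo / scale)
        zero_point = max(self.quant_min, min(self.quant_max, zero_point))
        return scale, zero_point


def fake_quantize(values, scale, zero_point, quant_min=0, quant_max=255):
    """Quantize to integers, clamp, then map back to floats."""
    out = []
    for v in values:
        q = round(v / scale) + zero_point
        q = max(quant_min, min(quant_max, q))
        out.append((q - zero_point) * scale)
    return out


if __name__ == "__main__":
    obs = MovingAverageMinMaxObserver()
    obs.observe([-1.0, 0.5, 3.0])  # plays the role of a training forward pass
    scale, zero_point = obs.qparams()
    print(fake_quantize([-1.0, 0.0, 2.9], scale, zero_point))
```

In this sketch, calling `qparams()` before any `observe()` call fails, which mirrors the point made above: the wrapped model needs forward passes in train mode before its quantization parameters are meaningful in eval mode.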