Tool/software:
Dear TI Team,
I am attempting to run inference of an AI model that includes convolutional layers on a TI processor. The activation function used in those convolutional layers is GELU, and the quantization error in this part of the network becomes very large.
GELU (tanh approximation) is given by the following equation:
f(x) = 0.5 · x · (1 + tanh(√(2/π) · (x + 0.044715 · x³)))
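For reference, here is a small NumPy sketch of that approximation (illustrative only, not taken from my model code):

```python
import numpy as np

def gelu_tanh(x: np.ndarray) -> np.ndarray:
    """GELU via the tanh approximation: 0.5*x*(1 + tanh(sqrt(2/pi)*(x + 0.044715*x**3)))."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))
```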
Because of the cubic term in x, the quantization scale for this layer becomes very large, so many small elements of the tensor are quantized to zero (see the sketch below). I believe that either GELU support in tidl-tools or floating-point execution in the accelerator is necessary. As a makeshift workaround, manually setting the quantization scale for this particular layer might also help.
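To make the issue concrete, here is an illustrative sketch. The per-tensor symmetric int8 rule below (scale = max|t| / 127) is my assumption for illustration, not necessarily what tidl-tools uses internally:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4096).astype(np.float32)
x[0] = 8.0                                   # one outlier activation

inner = x + 0.044715 * x**3                  # intermediate tensor inside tanh(...)

def frac_zero_after_int8(t: np.ndarray) -> float:
    """Fraction of elements that quantize to zero with a max/127 symmetric int8 scale."""
    scale = np.abs(t).max() / 127.0
    q = np.clip(np.round(t / scale), -127, 127)
    return float(np.mean(q == 0))

print("zeros when quantizing x:    ", frac_zero_after_int8(x))
print("zeros when quantizing inner:", frac_zero_after_int8(inner))
```

The cubic term inflates the dynamic range of the intermediate tensor, so the derived scale grows and noticeably more small activations collapse to zero than when quantizing x alone.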
Are these approaches feasible? Are there any other possible solutions?
I have attached the ONNX file containing the convolutional layers and a sample .npy file for the model input.
Regards,
Koki