Hi,
I'm trying to use different quantization bit depths when importing models, and I found this document:
http://software-dl.ti.com/jacinto7/esd/processor-sdk-rtos-jacinto7/06_01_01_12/exports/docs/tidl_j7_01_00_01_00/ti_dl/docs/user_guide_html/md_tidl_model_import.html
It describes these two parameters:
- numParamBits (8-bit by default): bit depth for model parameters such as kernels and biases. Max supported is 16; any value > 8 is experimental in the current release.
- numFeatureBits (8-bit by default): bit depth for layer activations. Max supported is 16; any value > 8 is experimental in the current release.
I just want to verify that numParamBits controls quantization of model parameters, while numFeatureBits controls quantization of layer feature-map (activation) values.
Also, these two parameters are set in import.txt, for example:
numParamBits = 16
numFeatureBits = 16
Other pages mention that "16-bit and 32-bit will come with significant performance penalty".
Should we use 8-bit quantization rather than 16-bit in the usual case?
Thank you,
Kevin.
Yes, we recommend using 8-bit quantization for the best runtime performance.
Please follow the document below if you run into accuracy issues when using 8 bits.
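As a rough illustration of what these bit depths trade off, here is a generic symmetric uniform quantization sketch (this is not TIDL's exact quantization scheme, just a way to see how parameter/activation error shrinks as bit depth grows):

```python
import random

def quantize(values, num_bits):
    # Symmetric uniform quantization: map floats onto a signed integer
    # grid of the given bit depth, then back to floats (a generic
    # sketch, not TIDL's exact scheme).
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8-bit, 32767 for 16-bit
    scale = max(abs(v) for v in values) / qmax  # one scale for the whole tensor
    return [round(v / scale) * scale for v in values]

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1000)]

err8  = max(abs(q - w) for q, w in zip(quantize(weights, 8),  weights))
err16 = max(abs(q - w) for q, w in zip(quantize(weights, 16), weights))
# err16 is far smaller than err8, which is why >8-bit depths help accuracy;
# the cost is the runtime penalty mentioned above.
```

So 16-bit roughly shrinks the worst-case rounding error by a factor of ~256, but on-device that extra precision is what incurs the performance penalty, which is why 8-bit is the recommended default.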