Hello TI team,
I read TI's paper "Sparse, Quantized, Full Frame CNN for Low Power Embedded Devices" to better understand how quantization is done in TIDL. I would be thankful if you could answer my questions about the paper.
1) After quantization, operations (multiplications and accumulations) are done in a fixed-point format (say 8 bits). The results of these operations, however, need a much larger bit width. For instance, multiplying two 8-bit values may require up to 16 bits, and accumulating many such products requires even more. How does TIDL handle this? The paper does not explain this point. For example, during inference, is there switching between floating- and fixed-point formats, and if so, how is that done?
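To make question 1 concrete, here is a small sketch of my understanding of the issue (my own illustration, not TIDL's actual implementation): products of 8-bit operands overflow 8 bits, so a common approach in other frameworks is to accumulate in a wide integer register and then rescale the result back down. The activation/weight values and the shift amount below are hypothetical.

```python
import numpy as np

# Two 8-bit operands whose product does not fit in 8 bits.
a = np.int8(-128)
b = np.int8(-128)
product = np.int32(a) * np.int32(b)  # 16384, needs 16 bits

# Assumed (not confirmed for TIDL): accumulate dot products in a
# 32-bit accumulator, then requantize back to 8 bits with a shift.
acts = np.full(256, -128, dtype=np.int8)   # hypothetical activations
wts = np.full(256, 127, dtype=np.int8)     # hypothetical weights
acc = np.sum(acts.astype(np.int32) * wts.astype(np.int32))  # 32-bit sum
shift = 16                                  # hypothetical requantization shift
out = np.int8(np.clip(acc >> shift, -128, 127))
```

Is this (wide accumulation plus a right-shift requantization) roughly what TIDL does, or does it convert back to floating point between layers?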
2) In TIDL, no training is done after sparsification and quantization of the network. Is that sufficient to preserve accuracy? In the relevant literature, including your paper, re-training is needed.
Thanks a lot.
Best,
Safwan