Hi, I am using the TMS320C6745.
I see that most functions in the official C674x DSPLIB are quite "general purpose", in the sense that they are compiled under only a few assumptions. In particular, the batch size N is often required only to be a multiple of two or four, so I expect that inside such functions there is something like a FOR loop whose condition is evaluated more and more times as N increases, once every two or four samples.
In my application I always handle data in larger batches (16 samples). The question is: if I rewrite some of the DSPLIB functions (especially blk_move, w_vec, dotprod, vecmul) specifically for N = 16, will my functions execute "always the same way", avoiding all the FOR loop condition checks? Can I expect a significant performance increase from rewriting DSPLIB specifically for N = 16?
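To make the question concrete, here is a minimal sketch of what I mean, using a dot product as the example. The names dotprod_generic and dotprod_16 are hypothetical and are not the actual DSPLIB functions or signatures. This is plain C written only to illustrate the idea, not how the library is really implemented.

```c
#include <stddef.h>

#define BATCH 16  /* assumed fixed batch size in my application */

/* Generic version (hypothetical): the trip count n is known only at run
 * time, so the loop condition has to be checked as the loop runs. */
float dotprod_generic(const float *x, const float *y, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}

/* Specialized version (hypothetical): the trip count is the compile-time
 * constant BATCH, so the compiler is free to unroll the loop completely.
 * Whether it actually does so depends on the compiler and its options. */
float dotprod_16(const float *restrict x, const float *restrict y)
{
    float sum = 0.0f;
    for (int i = 0; i < BATCH; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Both functions should return the same result for N = 16. What I would like to know is whether the second form is likely to be measurably faster on the C6745, or whether the generic library routines already lose very little at this size.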