Hello,
On a TMS320 C66x, we can reach up to eight 32-bit MACs per cycle, but the core cannot load more than 128 bits per cycle. It would be more efficient if we could load 256 bits per cycle. Is this normal, i.e. is it the case in all VLIW processors that the load bandwidth is lower than what the compute units can consume?
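To make the mismatch concrete, here is a rough back-of-the-envelope sketch. The MAC rate and load width are the figures quoted above; the number of operands each MAC has to fetch from memory is my own assumption (2 is the worst case with no register reuse, 1 assumes one input stays in a register), and the function names are just illustrative.

```python
# Rough operand-bandwidth check for the figures quoted in the question.
# ASSUMPTION: operands_per_mac is a modelling choice, not a device parameter.

MACS_PER_CYCLE = 8         # 32-bit MACs per cycle (from the question)
OPERAND_BITS = 32          # width of each operand
LOAD_BITS_PER_CYCLE = 128  # load bandwidth per cycle (from the question)


def required_bits_per_cycle(macs, operand_bits, operands_per_mac):
    """Bits that must be loaded each cycle to keep every MAC busy."""
    return macs * operand_bits * operands_per_mac


def sustainable_macs(load_bits, operand_bits, operands_per_mac):
    """MACs per cycle that the load bandwidth alone can feed."""
    return load_bits // (operand_bits * operands_per_mac)


for operands_per_mac in (2, 1):
    needed = required_bits_per_cycle(MACS_PER_CYCLE, OPERAND_BITS, operands_per_mac)
    fed = sustainable_macs(LOAD_BITS_PER_CYCLE, OPERAND_BITS, operands_per_mac)
    print(f"{operands_per_mac} loaded operand(s) per MAC: "
          f"need {needed} bits/cycle, 128 bits/cycle feeds {fed} MACs")
```

Under these assumptions, even with one operand per MAC kept in a register the MAC units would need 256 bits per cycle, so the 128-bit load path could keep only half of them busy unless more data is reused from registers.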
Thanks