A question about C66x architecture

Nios Ensa

Hello,

On a TMS320C66x core we can reach up to eight 32-bit MACs per cycle, but the core cannot load more than 128 bits per cycle. It would be more efficient if it could load 256 bits per cycle. Is it normal, and is it the case in all VLIW processors, that the computation power exceeds the load bandwidth?

Thanks

over 13 years ago

Nios Ensa over 13 years ago

A practical example is a floating-point complex dot product over two arrays: two complex numbers (64 bits each) are loaded in one cycle, and .M1 computes the complex product of the two loaded numbers. .M2 then has no new data to work on in that same cycle.
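To put rough numbers on the example above, here is a minimal back-of-the-envelope sketch in Python. It only uses the figures quoted in this thread (128 bits loaded per cycle, 64-bit complex samples, two multiplier units) and assumes, as in the example, that each .M unit handles one complex product per cycle and needs two freshly loaded operands. The function names are hypothetical and nothing here comes from a datasheet.

```python
# Rough model of the load-bandwidth limit described above.
# All constants are the figures quoted in this thread, not datasheet values.
COMPLEX_BITS = 64          # one complex sample: 32-bit real + 32-bit imaginary
OPERANDS_PER_PRODUCT = 2   # one sample from each of the two arrays
M_UNITS = 2                # .M1 and .M2 (assumed: one complex product each per cycle)


def products_fed_per_cycle(load_bits_per_cycle):
    """Complex products per cycle that the load path can supply with new operands."""
    return load_bits_per_cycle // (COMPLEX_BITS * OPERANDS_PER_PRODUCT)


def m_unit_utilization(load_bits_per_cycle):
    """Fraction of the multiplier units that receive fresh data each cycle."""
    fed = min(products_fed_per_cycle(load_bits_per_cycle), M_UNITS)
    return fed / M_UNITS


print(products_fed_per_cycle(128))  # 1
print(m_unit_utilization(128))      # 0.5
print(m_unit_utilization(256))      # 1.0
```

Under these assumptions a 128-bit load path keeps only one of the two multiplier units busy, while a 256-bit path would feed both. The model ignores any reuse of operands already held in registers, which could change the picture for other kernels.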

I just want to know whether most existing VLIW processors have the same issue, and to understand what prevents designers from making the data bus wider (is it the L1D memory?).

Thanks

Nios Ensa over 13 years ago in reply to Nios Ensa

Can no one share their knowledge?

I can see that the TigerSHARC DSP from Analog Devices has a wider, 256-bit data bus, while its computation power is not as high as that of the TI C66x.

Is that not important?
