Why does the C6455 have a larger cache miss penalty than the C6416?
For example, the cache miss penalty on the C6455 is 12.5 cycles per miss, but on the C6416 it is only 6 cycles per miss. How come?
The C64x+ architecture moves to a multi-master DMA architecture. This allows for much greater overall data throughput in the device, e.g. EDMA could move data from DDR2 to L1D SRAM while the CPU simultaneously accesses another chunk of on-chip memory (e.g. the 128 KB chunk of SRAM on OMAP-L137). Of course, this adds an extra layer, because accesses from these sources now need to be arbitrated.
The C64x+ architecture was also designed to scale to faster frequencies and larger cache sizes.
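To put the two per-miss figures from the question in perspective, here is a rough sketch in Python. It assumes a deliberately simple model in which total stall time is just the number of misses times a fixed per-miss penalty; real devices can overlap or pipeline misses, so treat it as an illustration only. The function names and the miss counts are hypothetical; only the 6 and 12.5 cycle figures come from the question.

```python
# Simplified model (an assumption): stall cycles = misses * fixed per-miss penalty.
# Real hardware may overlap or pipeline misses, which this sketch ignores.

C6416_PENALTY = 6.0    # cycles per miss, figure quoted in the question
C6455_PENALTY = 12.5   # cycles per miss, figure quoted in the question


def stall_cycles(misses: int, penalty_cycles: float) -> float:
    """Total stall cycles under the simple linear model."""
    return misses * penalty_cycles


def break_even_miss_ratio(old_penalty: float, new_penalty: float) -> float:
    """Fraction of the old miss count at which total stall cycles are equal."""
    return old_penalty / new_penalty


if __name__ == "__main__":
    ratio = break_even_miss_ratio(C6416_PENALTY, C6455_PENALTY)
    print(f"break-even miss ratio: {ratio:.2f}")

    # Hypothetical workload: 1000 misses on the C6416.
    old_misses = 1000
    new_misses = int(old_misses * ratio)
    print(stall_cycles(old_misses, C6416_PENALTY))
    print(stall_cycles(new_misses, C6455_PENALTY))
```

Under this model, the higher penalty stops mattering once the miss count drops below roughly half of what it was on the C6416.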
Although the miss penalty is greater, there have also been other improvements that generally more than offset the change: