TMS320C6678: L2 cache/ram memory access latency

Sung-IL

Mastermind 31210 points

Part Number: TMS320C6678

Dear Champs,

My customer is looking for L2 memory latency data.

Could you please let me know the L2 cache/ram, MSMC memory and external DDR3 memory latency of C6678?

Also, when core0 accesses the L2 cache/RAM of core1, what is the memory latency?

Thanks and Best Regards,

SI.

over 7 years ago

0 Yordan Kovachev over 7 years ago

TI__Guru**** 161600 points

Hi,

I've notified the RTOS team. They will post their feedback directly here.

Best Regards,
Yordan

0 jian35385 over 7 years ago in reply to Yordan Kovachev

TI__Mastermind 23125 points

Sung-IL,
There are multiple aspects here. Can you clarify whether the customer is trying to understand the following:
1. L2 cache miss latency
2. L2 RAM latency for both code and data
3. MSMC RAM latency for both code and data
4. External DDR latency
jian

0 Sung-IL over 7 years ago in reply to jian35385

TI__Mastermind 31210 points

Hi Jian,

I found this data in the 'Throughput Performance Guide', but did not find any data for accesses to the memory of other cores.

Could you please let me know the latency when core0 accesses the L2 SRAM of core1 in C6678?

Also, in Table 5, Memory Write Performance, of 'Throughput Performance Guide for TCI66x KeyStone Devices' (sprabh2a.pdf), could you please explain why there is a big difference between DDR RAM (SL2) and DDR RAM (SL3) for Victim?

And what is the difference between 'No Victim' and 'Victim' in this case?

Thanks and Best Regards,

SI.

0 jian35385 over 7 years ago in reply to Sung-IL

TI__Mastermind 23125 points

Sung-IL,

As for the latency of accessing another core's L2: it will be similar to the DDR RAM case, because Core0 first has to go out through MSMC->TeraNet->Core1_L2, as shown in Figure 1 of the same document. For example, if Core1's L2 is configured as SL3 to Core0, a burst write may take ~100 CPU cycles when both Core0 caches miss.
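For readers who want to try this, Core0 reaches Core1's L2 through a global address alias rather than the core-local address. The sketch below shows the address arithmetic only. The base addresses, the 512 KB size, and the helper name l2_global_addr are assumptions for illustration and are not taken from this thread; please confirm them against the memory map in the C6678 data manual before relying on them.

```c
#include <stdint.h>

/* Assumed C6678 memory map values; verify against the device data manual. */
#define L2_LOCAL_BASE   0x00800000u  /* core-local view of L2 SRAM */
#define L2_LOCAL_SIZE   0x00080000u  /* assumed 512 KB of L2 per core */
#define L2_GLOBAL_BASE  0x10000000u  /* assumed global alias base for core 0 */
#define L2_GLOBAL_STEP  0x01000000u  /* assumed address stride between cores */
#define NUM_CORES       8u

/*
 * Hypothetical helper: translate a core-local L2 SRAM address into the
 * global alias that another core would use to reach core_id's L2.
 * Returns 0 if the address is outside local L2 SRAM or core_id is invalid.
 */
static uint32_t l2_global_addr(uint32_t local_addr, unsigned core_id)
{
    if (core_id >= NUM_CORES) {
        return 0u;
    }
    if (local_addr < L2_LOCAL_BASE ||
        local_addr >= L2_LOCAL_BASE + L2_LOCAL_SIZE) {
        return 0u;
    }
    return L2_GLOBAL_BASE + core_id * L2_GLOBAL_STEP + local_addr;
}
```

With these assumed values, a buffer at local address 0x00800000 on Core1 would be reached from Core0 at 0x11800000. Every such access leaves the CorePac, which is why the latency resembles an external memory access.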

As for the burst write difference between DDR RAM (SL2) and DDR RAM (SL3): in SL3 mode, cache lines with a victim in L2 have to be read back and merged, which is why you see the large figure of 115.5 cycles. In SL2 mode, the L2 cache is disabled, so the L1D cache miss is passed directly to DDR. Because L1D does not write-allocate (it does not allocate a cache line on a write miss), there is no read-back and merge into L1 cache lines, which is why the burst write to DDR RAM (SL2) shows no cycle difference.
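To make the write-allocate point concrete, here is a toy model of a write miss under the two policies. It is only a sketch: the line size, the number of lines, and all names (toy_cache, toy_cache_write, and so on) are made up for illustration. It counts events and does not model C66x timing or reproduce the cycle numbers in the guide.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES 64u  /* arbitrary line size for the toy model */
#define NUM_LINES  4u   /* tiny direct-mapped cache */

typedef struct {
    bool     write_allocate;     /* policy applied on a write miss */
    bool     valid[NUM_LINES];
    uint32_t tag[NUM_LINES];
    unsigned line_fills;         /* lines read back from the next level */
    unsigned forwarded_writes;   /* writes passed straight to the next level */
} toy_cache;

static void toy_cache_init(toy_cache *c, bool write_allocate)
{
    memset(c, 0, sizeof *c);
    c->write_allocate = write_allocate;
}

/* Model one write to addr and update the event counters. */
static void toy_cache_write(toy_cache *c, uint32_t addr)
{
    uint32_t line = addr / LINE_BYTES;
    uint32_t idx  = line % NUM_LINES;
    uint32_t tag  = line / NUM_LINES;

    if (c->valid[idx] && c->tag[idx] == tag) {
        return;                  /* write hit: the line is updated in place */
    }
    if (c->write_allocate) {
        c->line_fills++;         /* read the line back, then merge the write */
        c->valid[idx] = true;
        c->tag[idx]   = tag;
    } else {
        c->forwarded_writes++;   /* no line allocated, write goes onward */
    }
}
```

In this model a burst of word writes costs one line fill per line touched when the cache write-allocates, and no line fills at all when it does not, which is the read-back-and-merge difference described above.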

Let me know if this is clear, or if you need more precise cycle estimates for L2 access from another core. Using a neighbor's L2 is only practical in very limited software designs.

Jian

0 jian35385 over 7 years ago in reply to jian35385

TI__Mastermind 23125 points

Sung-IL,
I will close this ticket to comply with our internal tracking. Please re-open it if you have more questions.
Jian
