Hi,
I'd like to know how the speed of L3 shared memory compares to that of L2 RAM on the DSP.
Thanks
Hello,
Pasted below are some raw measurements of OMAP-L138 DSP RAM throughput for 32-bit accesses that may be useful (personal data, not validated by TI, no guarantee).
Jakez
OMAP-L138 RAM 32-bit data access performance for the DSP core (CPU rev. 2.0)
Measurements were made by averaging the execution time of 1000 calls to pipelined loops performing sequential accesses to a buffer of 8K 32-bit words (32 KB).
Notes:
Write test: one write per buffer element, theoretical min. 1 CPU cycle / element
Read test: one read per buffer element, theoretical min. 1 CPU cycle / element
RMW test: one read & one write per buffer element, theoretical min. 1 CPU cycle / element
DDR2 SDRAM: 16-bit data bus, CAS latency = 3, 8 banks, 2 KB pages, 4 ms refresh
DDR2 max throughput (bursts): 3/10 32-bit word per CPU cycle = f(DDRCLK) / f(CPU)
f(CPU) = 450 MHz, f(DDRCLK) = 135 MHz (270 MHz / 2)
L1P cache active, ARM core & peripherals inactive, no interrupts
When configured as caches, L1D is full cache (32 KB), L2 cache is 64 KB
Data caches are emptied before each loop when used
The execution time of the test loop functions' prolog/epilog is included
Program is located in DDR2 SDRAM
Measurements are taken with the DSP Time Stamp Counter (TSC)
Performance unit: average CPU cycles / word; DDR2 (CLK rel): average DDR_CLK cycles / word
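For readers who want to reproduce this kind of figure, here is a minimal sketch of how a write test along these lines might be timed. It is not the code used for the measurements above. The function names (`read_ticks`, `start_ticks`, `write_words`, `measure_write`) are hypothetical, and the sketch assumes the TI C6000 compiler, where `c6x.h` exposes the time stamp counter as `TSCL`. On any other compiler it falls back to `clock()`, so the result is then in clock ticks rather than CPU cycles. Cache invalidation between runs and the MARn setup are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#if defined(_TMS320C6X)
#include <c6x.h>  /* assumed: TI compiler header declaring TSCL/TSCH */
static void start_ticks(void) { TSCL = 0; }  /* a write starts the counter */
static uint32_t read_ticks(void) { return TSCL; }
#else
/* Host fallback so the sketch builds anywhere: ticks are clock() units. */
static void start_ticks(void) { }
static uint32_t read_ticks(void) { return (uint32_t)clock(); }
#endif

/* Write test: one 32-bit store per buffer element.
 * volatile keeps the compiler from dropping the repeated stores. */
static void write_words(volatile uint32_t *buf, size_t n_words, uint32_t value)
{
    size_t i;
    for (i = 0; i < n_words; i++)
        buf[i] = value;
}

/* Returns the average number of ticks per word over `calls` runs.
 * Call overhead (prolog/epilog) is included in the timing. */
double measure_write(volatile uint32_t *buf, size_t n_words, unsigned calls)
{
    uint64_t total = 0;
    unsigned c;

    assert(calls > 0 && n_words > 0);
    start_ticks();
    for (c = 0; c < calls; c++) {
        uint32_t t0 = read_ticks();
        write_words(buf, n_words, 0xA5A5A5A5u);
        total += (uint32_t)(read_ticks() - t0);  /* wrap-safe difference */
    }
    return (double)total / ((double)calls * (double)n_words);
}
```

With a buffer of 8192 words and 1000 calls this matches the buffer size and averaging count described in the notes. Where the buffer is placed (L1D, L2, L3 or DDR2) would be decided by the linker command file, which is outside this sketch.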
Native RAM performance (32-bit)
L1D & L2 caches inactive - MARn registers are 0 (disabling cache on L3 & DDR2 domains)
Test type \ RAM | L1D  | L2    | L3    | DDR2   | DDR2 (CLK rel)
Read            | 1.01 | 7.76  | 33.89 | 72.35  | 21.71
Write           | 1.01 | 1.01  | 18.01 | 18.56  | 5.57
RMW             | 1.01 | 11.38 | 54.25 | 108.49 | 32.55
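The DDR2 (CLK rel) column is the CPU-cycle figure scaled by f(DDRCLK) / f(CPU) = 135 / 450 = 0.3, and the same clock figures can turn cycles per word into a byte rate. The helper below is only an illustrative sketch: the function and macro names are made up for this example, and the MB/s values are derived arithmetic (with 1 MB = 10^6 bytes), not measured data.

```c
#include <assert.h>

#define F_CPU_HZ    450.0e6  /* f(CPU) from the notes */
#define F_DDRCLK_HZ 135.0e6  /* f(DDRCLK) from the notes */
#define WORD_BYTES  4.0      /* one 32-bit word */

/* CPU cycles per word -> DDR_CLK cycles per word. */
double cpu_to_ddr_cycles(double cpu_cycles_per_word)
{
    return cpu_cycles_per_word * (F_DDRCLK_HZ / F_CPU_HZ);
}

/* CPU cycles per word -> throughput in MB/s (10^6 bytes per second). */
double throughput_mb_per_s(double cpu_cycles_per_word)
{
    assert(cpu_cycles_per_word > 0.0);
    return (F_CPU_HZ / cpu_cycles_per_word) * WORD_BYTES / 1.0e6;
}
```

For example, `cpu_to_ddr_cycles(72.35)` gives about 21.7, in line with the native DDR2 read row. For the original question, `throughput_mb_per_s(33.89)` is roughly 53 MB/s for native L3 reads against roughly 232 MB/s for `throughput_mb_per_s(7.76)` on L2, so uncached L3 reads come out a little over four times slower than L2 reads in this test.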
Cached RAM performance (32-bit)
L1D & L2 caches active - MARn registers are 1 (enabling cache on L3 & DDR2 domains)
Test type \ RAM | L2   | L3   | DDR2  | DDR2 (CLK rel)
Read            | 1.77 | 3.89 | 6.67  | 2.00
Write           | 1.14 | 8.88 | 11.63 | 3.49
RMW             | 1.87 | 3.94 | 6.68  | 2.00
L1 cache only performance (32-bit)
L1D cache active only - MARn registers are 1
Test type \ RAM | L2   | L3    | DDR2  | DDR2 (CLK rel)
Read            | 1.84 | 4.72  | 8.76  | 2.63
Write           | 1.18 | 18.15 | 18.68 | 5.60
RMW             | 1.98 | 4.85  | 8.90  | 2.67
L2 cache only performance (32-bit)
L2 cache active only - MARn registers are 1
Test type \ RAM | L1D  | L2    | L3    | DDR2  | DDR2 (CLK rel)
Read            | 1.15 | 7.90  | 12.46 | 15.20 | 4.56
Write           | 1.13 | 1.13  | 8.88  | 11.62 | 3.49
RMW             | 1.17 | 11.54 | 18.23 | 21.02 | 6.31
RAM performance with caching disabled (near native performance)
L1D & L2 caches active but MARn registers are 0
Test type \ RAM | L2   | L3    | DDR2   | DDR2 (CLK rel)
Read            | 1.84 | 34.07 | 72.55  | 21.77
Write           | 1.17 | 18.14 | 18.69  | 5.61
RMW             | 1.98 | 54.44 | 108.68 | 32.60
The funny one (should be native performance for L3 & DDR2)
L1D & L2 caches inactive but MARn registers are 1
Test type \ RAM | L1D  | L2    | L3    | DDR2   | DDR2 (CLK rel)
Read            | 1.01 | 7.76  | 56.01 | 120.56 | 36.17
Write           | 1.01 | 1.01  | 18.01 | 18.56  | 5.57
RMW             | 1.01 | 11.38 | 76.00 | 154.99 | 46.50