We have two products based on the DM355EVM. I am now trying to implement HD video capture at 720p, 20 fps. We are running the DM355 at 216 MHz and the DDR2 at 171 MHz, which should give us a theoretical peak memory bandwidth of (171 * 2 * 16) / 8 = 684 MB/sec. My application shouldn't come anywhere near that limit, but I keep seeing behavior that looks like memory starvation.
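As a sanity check on that arithmetic, here is a minimal sketch in C. The function names are hypothetical, and the 2 bytes per pixel (YUV 4:2:2) is an assumption made purely for illustration, not something fixed by the capture setup described above. It also converts the peak figure into the 1048576-byte MB notation used by the benchmark output later in this post.

```c
#include <stdio.h>

/* Peak DDR bandwidth in MB/s (1 MB = 10^6 B): clock in MHz, two
 * transfers per clock (double data rate), bus width in bits. */
double ddr_peak_mb_per_s(double clock_mhz, unsigned bus_bits)
{
    return clock_mhz * 2.0 * bus_bits / 8.0;
}

/* Traffic for ONE pass over the video stream, in MB/s (1 MB = 10^6 B). */
double video_mb_per_s(unsigned width, unsigned height,
                      unsigned bytes_per_pixel, unsigned fps)
{
    return (double)width * height * bytes_per_pixel * fps / 1e6;
}

/* Convert decimal MB/s into MB/s where 1 MB = 1048576 B. */
double to_binary_mb(double mb_per_s)
{
    return mb_per_s * 1e6 / 1048576.0;
}

/* Print the budget for the numbers quoted in this post. */
void print_budget(void)
{
    double peak  = ddr_peak_mb_per_s(171.0, 16);
    double video = video_mb_per_s(1280, 720, 2, 20); /* ASSUMED 2 B/pixel */

    printf("peak: %.1f MB/s (%.1f in 1048576-byte MB)\n",
           peak, to_binary_mb(peak));
    printf("720p at 20 fps, one pass: %.1f MB/s\n", video);
}
```

Calling print_budget() from main() prints the figures. Under those assumptions one pass over the 720p stream is about 37 MB/sec, and every additional copy or processing pass over each frame adds the same amount again.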
I found this application to test memory bandwidth. I compiled it and ran it on the two products we have made and the DM355EVM. The results of that memory test are shown below:
root@MV0007:~# ./bandwidth-arm
This is bandwidth version 0.23d.
Copyright (C) 2005-2010 by Zack T Smith.
This software is covered by the GNU Public License.
It is provided AS-IS, use at your own risk.
See the file COPYING for more information.
Using 32-bit transfers.
Notation: kB = 1024 B, MB = 1048576 B.
Sequential read (32-bit), size = 256 B, loops = 14417920, 701.6 MB/s
Sequential read (32-bit), size = 512 B, loops = 6815744, 655.9 MB/s
Sequential read (32-bit), size = 768 B, loops = 4543812, 662.0 MB/s
Sequential read (32-bit), size = 1 kB, loops = 3407872, 665.2 MB/s
Sequential read (32-bit), size = 2 kB, loops = 1736704, 669.4 MB/s
Sequential read (32-bit), size = 3 kB, loops = 1157785, 670.8 MB/s
Sequential read (32-bit), size = 4 kB, loops = 868352, 669.7 MB/s
Sequential read (32-bit), size = 6 kB, loops = 578866, 666.4 MB/s
Sequential read (32-bit), size = 8 kB, loops = 425984, 659.7 MB/s
Sequential read (32-bit), size = 12 kB, loops = 76454, 169.6 MB/s
Sequential read (32-bit), size = 16 kB, loops = 45056, 136.4 MB/s
Sequential read (32-bit), size = 20 kB, loops = 32760, 124.6 MB/s
Sequential read (32-bit), size = 24 kB, loops = 27300, 116.4 MB/s
Sequential read (32-bit), size = 28 kB, loops = 23400, 116.3 MB/s
Sequential read (32-bit), size = 32 kB, loops = 20480, 116.2 MB/s
Sequential read (32-bit), size = 40 kB, loops = 16380, 116.2 MB/s
Sequential read (32-bit), size = 48 kB, loops = 13650, 116.2 MB/s
Sequential read (32-bit), size = 64 kB, loops = 10240, 116.2 MB/s
Sequential read (32-bit), size = 128 kB, loops = 5120, 116.3 MB/s
Sequential read (32-bit), size = 192 kB, loops = 3410, 116.5 MB/s
Sequential read (32-bit), size = 256 kB, loops = 2560, 116.3 MB/s
Sequential read (32-bit), size = 384 kB, loops = 1700, 114.9 MB/s
Sequential read (32-bit), size = 512 kB, loops = 1152, 114.9 MB/s
Sequential read (32-bit), size = 768 kB, loops = 680, 91.2 MB/s
Sequential read (32-bit), size = 1 MB, loops = 576, 109.1 MB/s
Sequential read (32-bit), size = 1.25 MB, loops = 459, 113.4 MB/s
Sequential read (32-bit), size = 1.5 MB, loops = 420, 114.9 MB/s
Sequential read (32-bit), size = 1.75 MB, loops = 360, 114.9 MB/s
Sequential read (32-bit), size = 2 MB, loops = 288, 114.9 MB/s
Sequential read (32-bit), size = 2.25 MB, loops = 280, 114.9 MB/s
Sequential read (32-bit), size = 2.5 MB, loops = 250, 114.9 MB/s
Sequential read (32-bit), size = 2.75 MB, loops = 230, 114.9 MB/s
Sequential read (32-bit), size = 3 MB, loops = 210, 114.9 MB/s
Sequential read (32-bit), size = 4 MB, loops = 144, 114.9 MB/s
Sequential read (32-bit), size = 5 MB, loops = 120, 114.9 MB/s
Sequential read (32-bit), size = 6 MB, loops = 100, 114.9 MB/s
Random read (32-bit), size = 256 B, loops = 12058624, 586.2 MB/s
Random read (32-bit), size = 512 B, loops = 6160384, 594.6 MB/s
Random read (32-bit), size = 768 B, loops = 4106907, 597.5 MB/s
Random read (32-bit), size = 1 kB, loops = 3080192, 599.4 MB/s
Random read (32-bit), size = 2 kB, loops = 1540096, 601.4 MB/s
Random read (32-bit), size = 3 kB, loops = 1026715, 601.2 MB/s
Random read (32-bit), size = 4 kB, loops = 770048, 600.4 MB/s
Random read (32-bit), size = 6 kB, loops = 513334, 595.9 MB/s
Random read (32-bit), size = 8 kB, loops = 327680, 508.1 MB/s
Random read (32-bit), size = 12 kB, loops = 60071, 131.7 MB/s
Random read (32-bit), size = 16 kB, loops = 36864, 105.7 MB/s
Random read (32-bit), size = 20 kB, loops = 26208, 97.0 MB/s
Random read (32-bit), size = 24 kB, loops = 21840, 93.4 MB/s
Random read (32-bit), size = 28 kB, loops = 18720, 91.5 MB/s
Random read (32-bit), size = 32 kB, loops = 16384, 90.6 MB/s
Random read (32-bit), size = 40 kB, loops = 11466, 89.5 MB/s
Random read (32-bit), size = 48 kB, loops = 9555, 89.1 MB/s
Random read (32-bit), size = 64 kB, loops = 7168, 89.0 MB/s
Random read (32-bit), size = 128 kB, loops = 3584, 88.5 MB/s
Random read (32-bit), size = 192 kB, loops = 2387, 88.2 MB/s
Random read (32-bit), size = 256 kB, loops = 1792, 87.8 MB/s
Random read (32-bit), size = 384 kB, loops = 1190, 82.6 MB/s
Random read (32-bit), size = 512 kB, loops = 896, 80.8 MB/s
Random read (32-bit), size = 768 kB, loops = 510, 75.1 MB/s
Random read (32-bit), size = 1 MB, loops = 384, 74.2 MB/s
Random read (32-bit), size = 1.25 MB, loops = 306, 73.7 MB/s
Random read (32-bit), size = 1.5 MB, loops = 252, 73.3 MB/s
Random read (32-bit), size = 1.75 MB, loops = 216, 73.0 MB/s
Random read (32-bit), size = 2 MB, loops = 192, 72.9 MB/s
Random read (32-bit), size = 2.25 MB, loops = 168, 72.7 MB/s
Random read (32-bit), size = 2.5 MB, loops = 150, 72.7 MB/s
Random read (32-bit), size = 2.75 MB, loops = 138, 72.5 MB/s
Random read (32-bit), size = 3 MB, loops = 126, 72.5 MB/s
Random read (32-bit), size = 4 MB, loops = 96, 72.2 MB/s
Random read (32-bit), size = 5 MB, loops = 72, 71.9 MB/s
Random read (32-bit), size = 6 MB, loops = 60, 71.7 MB/s
Sequential write (32-bit), size = 256 B, loops = 8912896, 427.7 MB/s
Sequential write (32-bit), size = 512 B, loops = 4456448, 427.6 MB/s
Sequential write (32-bit), size = 768 B, loops = 2970954, 427.7 MB/s
Sequential write (32-bit), size = 1 kB, loops = 2228224, 428.1 MB/s
Sequential write (32-bit), size = 2 kB, loops = 1114112, 427.6 MB/s
Sequential write (32-bit), size = 3 kB, loops = 742730, 427.5 MB/s
Sequential write (32-bit), size = 4 kB, loops = 557056, 427.0 MB/s
Sequential write (32-bit), size = 6 kB, loops = 371348, 427.5 MB/s
Sequential write (32-bit), size = 8 kB, loops = 278528, 426.8 MB/s
Sequential write (32-bit), size = 12 kB, loops = 185674, 427.0 MB/s
Sequential write (32-bit), size = 16 kB, loops = 139264, 426.9 MB/s
Sequential write (32-bit), size = 20 kB, loops = 111384, 427.1 MB/s
Sequential write (32-bit), size = 24 kB, loops = 92820, 426.8 MB/s
Sequential write (32-bit), size = 28 kB, loops = 79560, 427.2 MB/s
Sequential write (32-bit), size = 32 kB, loops = 69632, 427.0 MB/s
Sequential write (32-bit), size = 40 kB, loops = 55692, 427.1 MB/s
Sequential write (32-bit), size = 48 kB, loops = 46410, 427.3 MB/s
Sequential write (32-bit), size = 64 kB, loops = 34816, 427.2 MB/s
Sequential write (32-bit), size = 128 kB, loops = 17408, 427.2 MB/s
Sequential write (32-bit), size = 192 kB, loops = 11594, 426.9 MB/s
Sequential write (32-bit), size = 256 kB, loops = 8704, 425.6 MB/s
Sequential write (32-bit), size = 384 kB, loops = 5440, 404.5 MB/s
Sequential write (32-bit), size = 512 kB, loops = 4096, 404.6 MB/s
Sequential write (32-bit), size = 768 kB, loops = 2890, 432.0 MB/s
Sequential write (32-bit), size = 1 MB, loops = 2176, 432.0 MB/s
Sequential write (32-bit), size = 1.25 MB, loops = 1734, 431.7 MB/s
Sequential write (32-bit), size = 1.5 MB, loops = 1470, 431.9 MB/s
Sequential write (32-bit), size = 1.75 MB, loops = 1260, 431.5 MB/s
Sequential write (32-bit), size = 2 MB, loops = 1088, 431.5 MB/s
Sequential write (32-bit), size = 2.25 MB, loops = 980, 431.5 MB/s
Sequential write (32-bit), size = 2.5 MB, loops = 875, 431.2 MB/s
Sequential write (32-bit), size = 2.75 MB, loops = 805, 431.3 MB/s
Sequential write (32-bit), size = 3 MB, loops = 735, 431.0 MB/s
Sequential write (32-bit), size = 4 MB, loops = 544, 430.5 MB/s
Sequential write (32-bit), size = 5 MB, loops = 432, 429.8 MB/s
Sequential write (32-bit), size = 6 MB, loops = 360, 429.3 MB/s
Random write (32-bit), size = 256 B, loops = 4194304, 199.7 MB/s
Random write (32-bit), size = 512 B, loops = 2097152, 199.7 MB/s
Random write (32-bit), size = 768 B, loops = 1398096, 202.9 MB/s
Random write (32-bit), size = 1 kB, loops = 1048576, 200.9 MB/s
Random write (32-bit), size = 2 kB, loops = 524288, 200.7 MB/s
Random write (32-bit), size = 3 kB, loops = 349520, 199.9 MB/s
Random write (32-bit), size = 4 kB, loops = 262144, 199.8 MB/s
Random write (32-bit), size = 6 kB, loops = 174752, 199.4 MB/s
Random write (32-bit), size = 8 kB, loops = 131072, 199.0 MB/s
Random write (32-bit), size = 12 kB, loops = 87376, 199.0 MB/s
Random write (32-bit), size = 16 kB, loops = 65536, 199.1 MB/s
Random write (32-bit), size = 20 kB, loops = 52416, 199.1 MB/s
Random write (32-bit), size = 24 kB, loops = 43680, 199.1 MB/s
Random write (32-bit), size = 28 kB, loops = 37440, 199.2 MB/s
Random write (32-bit), size = 32 kB, loops = 32768, 199.2 MB/s
Random write (32-bit), size = 40 kB, loops = 26208, 199.3 MB/s
Random write (32-bit), size = 48 kB, loops = 21840, 199.3 MB/s
Random write (32-bit), size = 64 kB, loops = 16384, 199.3 MB/s
Random write (32-bit), size = 128 kB, loops = 8192, 199.2 MB/s
Random write (32-bit), size = 192 kB, loops = 5456, 198.9 MB/s
Random write (32-bit), size = 256 kB, loops = 4096, 196.4 MB/s
Random write (32-bit), size = 384 kB, loops = 2380, 171.9 MB/s
Random write (32-bit), size = 512 kB, loops = 1664, 160.2 MB/s
Random write (32-bit), size = 768 kB, loops = 1020, 149.5 MB/s
Random write (32-bit), size = 1 MB, loops = 768, 145.4 MB/s
Random write (32-bit), size = 1.25 MB, loops = 612, 141.9 MB/s
Random write (32-bit), size = 1.5 MB, loops = 504, 141.2 MB/s
Random write (32-bit), size = 1.75 MB, loops = 432, 140.4 MB/s
Random write (32-bit), size = 2 MB, loops = 352, 138.6 MB/s
Random write (32-bit), size = 2.25 MB, loops = 336, 139.7 MB/s
Random write (32-bit), size = 2.5 MB, loops = 300, 139.2 MB/s
Random write (32-bit), size = 2.75 MB, loops = 253, 138.5 MB/s
Random write (32-bit), size = 3 MB, loops = 231, 138.0 MB/s
Random write (32-bit), size = 4 MB, loops = 176, 136.4 MB/s
Random write (32-bit), size = 5 MB, loops = 144, 135.8 MB/s
Random write (32-bit), size = 6 MB, loops = 120, 135.1 MB/s
Main register to main register transfers (32-bit) 674.0 MB/s
Stack-to-register transfers (32-bit) 683.2 MB/s
Register-to-stack transfers (32-bit) 227.6 MB/s
Library: memset 193.5 MB/s
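For anyone who wants to cross-check these numbers without the tool, a sequential read measurement of this kind can be sketched roughly as follows. This is not the tool's actual code: the function name is made up, it uses plain C 32-bit loads rather than hand-written assembly, and it times with clock(), so absolute results will differ from the output above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Read a buffer of `size` bytes `loops` times using 32-bit loads.
 * Returns MB/s with 1 MB = 1048576 B (the notation used above),
 * 0.0 if the run was too short for clock() to resolve, or -1.0 if
 * the buffer could not be set up. The sum of all words read is
 * stored in *checksum_out when that pointer is not NULL. */
double seq_read_mb_per_s(size_t size, unsigned long loops,
                         uint32_t *checksum_out)
{
    size_t words = size / sizeof(uint32_t);
    uint32_t *buf = malloc(words * sizeof(uint32_t));

    if (buf == NULL || words == 0) {
        free(buf);
        return -1.0;
    }
    for (size_t i = 0; i < words; i++)
        buf[i] = (uint32_t)i;

    /* volatile keeps the compiler from optimizing the loads away */
    volatile uint32_t *p = buf;
    uint32_t sum = 0;

    clock_t start = clock();
    for (unsigned long l = 0; l < loops; l++)
        for (size_t i = 0; i < words; i++)
            sum += p[i];
    clock_t end = clock();

    free(buf);
    if (checksum_out != NULL)
        *checksum_out = sum;

    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return 0.0;
    return (double)words * sizeof(uint32_t) * (double)loops
           / (1048576.0 * seconds);
}
```

Running it once with a buffer of a few kB and once with a buffer of a few MB (for example seq_read_mb_per_s(4096, 800000, NULL) and seq_read_mb_per_s(4 * 1048576, 150, NULL)) should show whether the same gap between small and large buffers appears outside the tool.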
As can be seen, the read performance is terrible. For buffers larger than 8 kB, sequential reads fall to roughly 116 MB/sec and random reads eventually drop to roughly 72 MB/sec. If I am topping out at 116 MB/sec, then I am definitely having memory starvation problems. Write performance seems OK. We are using a Micron MT47H128M16-3 part, and UBL initializes the DDR timing. The code that does the initialization is shown below:
/* per GEL file */
LPSCTransition(LPSC_DDR2, PSC_ENABLE);
SYSTEM->VTPIOCR &= 0xFFFFDF3F;// Clear bit CLRZ & PWRDN & LOCK bit(bit 13/6/7)
SYSTEM->VTPIOCR |= 0x00002000; // Set bit CLRZ (bit 13)
while(!(SYSTEM->VTPIOCR & 0x8000));
SYSTEM->VTPIOCR |= 0x00004000; // Set bit VTP_IO_READY(bit 14)
SYSTEM->VTPIOCR |= 0x00000180; // Set bit LOCK(bit 7) and PWRSAVE (bit 8)
SYSTEM->VTPIOCR |= 0x00000040; // Powerdown VTP as it is locked (bit 6)
waitloop(11*33); // Wait for calibration to complete
/* DDR2 controller initialization */
DDR->DDRPHYCR = 0x51006494; //External DQS gating enabled
LPSCTransition(LPSC_DDR2, PSC_SYNCRESET);
LPSCTransition(LPSC_DDR2, PSC_ENABLE);
DDR->PBBPR = 0x000000FE; //VBUSM Burst Priority Register, pr_old_count = 0xFE
DDR->SDBCR = 0x0000C632; //Program SDRAM Bank Config Register with bit 15 set (timing registers writable)
DDR->SDTIMR0 = 0x2A923249; //Program SDRAM Timing Control Register 1
DDR->SDTIMR1 = 0x3C17C763; //Program SDRAM Timing Control Register 2
DDR->SDBCR = 0x00004632; //Program SDRAM Bank Config Register again with bit 15 cleared
DDR->SDRCR = 0x00000535; //Program SDRAM Refresh Control Register
The one thing I noticed is that the T_XSNR parameter appears to be wrong (too fast), and T_RFC also appears to be incorrect. Would that have such a large effect on read performance? Has anyone else seen problems like this?
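To put numbers on the T_RFC and T_XSNR concern, here is a rough sketch that pulls the two fields out of the register values above and converts them to nanoseconds at the 171 MHz DDR clock. The bit positions (T_RFC in SDTIMR0 bits 31:25, T_XSNR in SDTIMR1 bits 22:16) and the "cycles minus one" encoding are my reading of the DDR2 controller guide and should be treated as assumptions to verify. The function names are made up for illustration.

```c
#include <stdint.h>

/* ASSUMED layout, verify against the DDR2 controller guide:
 *   T_RFC  = SDTIMR0[31:25]
 *   T_XSNR = SDTIMR1[22:16]
 * Each field is assumed to hold (DDR clock cycles - 1). */
#define T_RFC_SHIFT   25u
#define T_XSNR_SHIFT  16u
#define FIELD7_MASK   0x7Fu

/* Number of DDR clock cycles programmed for T_RFC. */
unsigned t_rfc_cycles(uint32_t sdtimr0)
{
    return ((sdtimr0 >> T_RFC_SHIFT) & FIELD7_MASK) + 1u;
}

/* Number of DDR clock cycles programmed for T_XSNR. */
unsigned t_xsnr_cycles(uint32_t sdtimr1)
{
    return ((sdtimr1 >> T_XSNR_SHIFT) & FIELD7_MASK) + 1u;
}

/* Convert a cycle count to nanoseconds for a clock given in MHz. */
double cycles_to_ns(unsigned cycles, double clock_mhz)
{
    return cycles * 1000.0 / clock_mhz;
}

/* Smallest field value (cycles - 1) that meets a minimum time in ns. */
unsigned field_for_ns(double min_ns, double clock_mhz)
{
    double cycles = min_ns * clock_mhz / 1000.0;
    unsigned whole = (unsigned)cycles;

    if ((double)whole < cycles)
        whole++;                      /* round up to a whole cycle */
    return whole > 0u ? whole - 1u : 0u;
}
```

If the assumed layout is right, 0x2A923249 gives 22 cycles (about 128.7 ns) for T_RFC and 0x3c17C763 gives 24 cycles (about 140.4 ns) for T_XSNR. Those can be compared against the tRFC and tXSNR minimums in the Micron datasheet for this density and speed grade, and field_for_ns() gives the field value needed to meet a given minimum.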
Thank you.