am335x internal sram speed (continued...)

Other Parts Discussed in Thread: AM3359, AM3358

Hello,

I was told by James that the internal SRAM (not the OCMC RAM) is zero wait state (same speed as the core MPU_CLK).

However, this does not seem to be the case. I have disabled the cache and MMU and set the core PLL to bypass. The CPU core is now running at 25 MHz.

When I execute 200 NOP instructions from internal SRAM, it takes 512 cycles (according to TIMER1, which I have enabled).

I would have expected an execution time of something like 100 cycles.

If I set the core PLL to 500 MHz, I get the same behavior (512 cycles).

Does anyone have any ideas?? Am I being stupid?

Thanks,

Paul

  • I have an update on this. I tried the same thing using the Cycle Count register (CCNT).

    Using CPU clock = 500 MHz:

    Executing 500 million NOPs from internal SRAM takes 2,000 million cycles (4 seconds). This is 4x longer than expected from this 32-bit RAM.

    What is going on? This memory is not cached, so what is the trick to get it to run at the CPU frequency?

    My test code simply loops 500,000 times through 1,000 ARM-mode NOPs.

    Am I the only person trying to use internal SRAM for performance reasons?

    Cheers

    Paul
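The figures in the post above can be checked with a little arithmetic. The sketch below is not Paul's test code; the helper names (`cycles_per_instruction`, `seconds_at`, `ccnt_start`, `ccnt_read`) are made up for illustration. The ARM-only part assumes a GCC-style toolchain and privileged mode, and reads CCNT through the ARMv7-A performance monitor registers.

```c
#include <stdint.h>

/* Average cycles per instruction from a cycle-counter delta. */
double cycles_per_instruction(uint64_t cycles, uint64_t instructions)
{
    return instructions ? (double)cycles / (double)instructions : 0.0;
}

/* Wall-clock seconds for a cycle count at a given CPU clock in Hz. */
double seconds_at(uint64_t cycles, uint64_t cpu_hz)
{
    return cpu_hz ? (double)cycles / (double)cpu_hz : 0.0;
}

#if defined(__arm__) && defined(__GNUC__)
/* Assumption: ARMv7-A PMU, code running in a privileged mode. */
static inline void ccnt_start(void)
{
    uint32_t pmcr;
    __asm__ volatile("mrc p15, 0, %0, c9, c12, 0" : "=r"(pmcr));   /* read PMCR */
    pmcr |= (1u << 0) | (1u << 2);  /* E: enable counters, C: reset CCNT */
    pmcr &= ~(1u << 3);             /* D = 0: count every cycle, not every 64th */
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 0" :: "r"(pmcr));   /* write PMCR */
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 1" :: "r"(1u << 31)); /* PMCNTENSET: CCNT on */
}

static inline uint32_t ccnt_read(void)
{
    uint32_t value;
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(value));  /* read PMCCNTR */
    return value;
}
#endif
```

With the numbers reported above, `cycles_per_instruction(2000000000, 500000000)` gives 4 cycles per NOP, and `seconds_at(2000000000, 500000000)` gives the 4 seconds observed. Note that the 32-bit CCNT wraps after about 4,295 million cycles, so a 2,000 million cycle run still fits in a single read.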

  • Hi Paul, I agree something doesn't jibe. What are you setting your MPU and CORE PLL registers to?

    Regards,

    James

  • Paul, I have looked into this more, and what I said previously was incorrect. The only way you can get single-cycle execution is by using the L1 instruction cache, which is tightly coupled to the core. Accesses without cache from the internal 64K SRAM have a latency of 20 cycles. Beyond that, accesses from L3 RAM (also 64K, but attached to the L3 interconnect) will have a latency of 40 cycles.

    Sorry for the confusion. Hope this clears things up.

    Regards,

    James
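As a rough illustration of the point above about the L1 instruction cache, the sketch below sets the I bit in the CP15 System Control Register (SCTLR) on an ARMv7-A core such as the Cortex-A8. The function names are hypothetical, the ARM-only part assumes a GCC-style toolchain and privileged mode, and this is not taken from TI's startup code.

```c
#include <stdint.h>

#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data/unified cache enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

/* Pure helper: returns the SCTLR value with the I-cache bit set. */
uint32_t sctlr_with_icache(uint32_t sctlr)
{
    return sctlr | SCTLR_I;
}

#if defined(__arm__) && defined(__GNUC__)
/* Assumption: ARMv7-A, privileged mode. */
static inline void enable_l1_icache(void)
{
    uint32_t sctlr;
    /* ICIALLU: invalidate the whole instruction cache before enabling it. */
    __asm__ volatile("mcr p15, 0, %0, c7, c5, 0" :: "r"(0) : "memory");
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(sctlr));      /* read SCTLR */
    sctlr = sctlr_with_icache(sctlr);
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 0" :: "r"(sctlr) : "memory"); /* write SCTLR */
    __asm__ volatile("isb" ::: "memory");  /* make the change visible to later fetches */
}
#endif
```

The bit manipulation is kept in a separate helper so it can be checked on a host machine; only `enable_l1_icache` touches the coprocessor.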

  • Hello,

    I was looking into the RAM available on the SoC so that I can place code there via .lds linker scripts (GCC build). It is confusing that all the documentation for the AM335x (in my case the AM3359 on the BeagleBone) specifies a single 64 KB on-chip RAM. Which RAM does this refer to, the internal SRAM or the L3 OCMC0 RAM? Furthermore, the AM335x TRM states that the OCMC is a wrapper for two memory-mapped devices, a ROM and a RAM, but neither the TRM nor the datasheet elaborates. Even worse, the "Flashing and Booting" page on the StarterWare wiki, in the part describing the operation of the bootloader, states that the internal memory region into which the bootloader is loaded and from which it executes starts at 0x40F0000 (not even 0x402F0400) and ends at 0x4030FFFF. What is happening here?

    Since the internal RAM is stated as "not exposed to the L3 interconnect", does this mean that both memories exist, but only one can be accessed by peripherals (such as the EDMA) while the other can be accessed only by the ARM core (running code)? If both do in fact exist as distinct memories, why does the documentation not clarify this?

    Best Regards.

    Vassilis.

  • "Accesses without cache from the internal 64K SRAM have a latency of 20 cycles. Beyond that, accesses from L3 RAM (also 64K, but attached to the L3 interconnect) will have a latency of 40 cycles."

    Hi,

    Could someone please clarify the above comment? I'm working on some AM3358 startup code that runs from L3 OCMC RAM and seems very sluggish despite the PLLs being configured for 600 MHz operation.

    The L1 and L2 caches are disabled, so does this mean that every L3 OCMC RAM access takes 40 cycles, i.e. the effective processor speed will be reduced to approximately 600/40 = 15 MHz?

    Thanks,

    Chris
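The estimate in the question above can be written out as a small calculation. This is only a back-of-the-envelope sketch: it assumes every instruction fetch pays the full quoted latency and that nothing overlaps, which is a worst case rather than a measured figure. The function name `effective_mhz` is made up for illustration.

```c
/* Worst-case effective instruction rate in MHz, assuming one fetch per
 * instruction and that each fetch costs `cycles_per_fetch` CPU cycles. */
double effective_mhz(double cpu_mhz, double cycles_per_fetch)
{
    return cycles_per_fetch > 0.0 ? cpu_mhz / cycles_per_fetch : 0.0;
}
```

For example, `effective_mhz(600.0, 40.0)` gives 15 MHz for uncached L3 OCMC RAM, and `effective_mhz(600.0, 20.0)` gives 30 MHz for the internal SRAM under the same assumption. The CCNT measurement earlier in this thread (about 4 cycles per NOP from internal SRAM) suggests the real cost per instruction can be lower than the quoted latency.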

  • Hi Biser,

    Thanks for the quick response, I'll look into activating the cache.

    This low performance is surprising. Is the latency of the internal RAM specified anywhere in the AM3358 datasheets?

    Chris

  • No, it's not in the datasheets. This is mostly interconnect latency. The L3 Fast interconnect runs at 200 MHz. Caching should improve things a lot.

  • Hi,

    A quick follow-up question. I've enabled the L1 cache, but it has only doubled the effective processor speed. Is it necessary to also activate the MMU for the L1 cache to work at maximum performance?

    Thanks.

  • I don't know. You can look at the AM335x StarterWare bootloader for guidance.
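On the MMU question: under the ARMv7-A architecture, data accesses are treated as Strongly-ordered while the MMU is off, so the L1 data cache has no effect until the MMU is enabled with cacheable mappings; only the instruction cache can help with the MMU off. That would be consistent with the limited speed-up reported above, though nothing in this thread confirms it for this particular setup. The sketch below builds a flat (virtual = physical) first-level table of 1 MB sections using the ARMv7-A short-descriptor format. The function names and the choice of which sections to mark cacheable are assumptions for illustration, not the StarterWare implementation.

```c
#include <stdint.h>

#define SECT_TYPE    (2u << 0)               /* bits [1:0] = 0b10: section */
#define SECT_B       (1u << 2)               /* bufferable */
#define SECT_C       (1u << 3)               /* cacheable */
#define SECT_AP_RW   (3u << 10)              /* AP[1:0] = 0b11: full access */
#define SECT_TEX(x)  ((uint32_t)(x) << 12)   /* TEX[2:0] */

/* One first-level section descriptor for the 1 MB region containing `phys`.
 * cacheable != 0: Normal memory, write-back, write-allocate (TEX=001, C=1, B=1).
 * cacheable == 0: Strongly-ordered (TEX=000, C=0, B=0). Domain 0 is used. */
uint32_t section_desc(uint32_t phys, int cacheable)
{
    uint32_t desc = (phys & 0xFFF00000u) | SECT_TYPE | SECT_AP_RW;
    if (cacheable)
        desc |= SECT_TEX(1) | SECT_C | SECT_B;
    return desc;
}

/* Fill a 4096-entry table (must be 16 KB aligned on the target) with a flat
 * map; sections first_mb..last_mb inclusive are marked cacheable. */
void build_flat_table(uint32_t *table, uint32_t first_mb, uint32_t last_mb)
{
    for (uint32_t mb = 0; mb < 4096u; mb++) {
        int cacheable = (mb >= first_mb && mb <= last_mb);
        table[mb] = section_desc(mb << 20, cacheable);
    }
}
```

For instance, `build_flat_table(table, 0x403, 0x403)` would mark only the 1 MB section at 0x40300000, which contains the region ending at 0x4030FFFF mentioned earlier in the thread, as cacheable. Pointing TTBR0 at the table, setting the domain access control register, and then setting the M and C bits in SCTLR are still required and are not shown here.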