F28035 RAM code execution performance vs flash

Sgio Hdz.

Initial condition: The code was executed directly from flash. It performs SPI reads, RAM stores and loads, and additions and multiplications on float32 variables, which are stored in vectors and arrays. Variables are always held in RAM, and results are written to RAM as well.

Change: Using CCS, the flash was erased, and the code was loaded into RAM and executed from there.

The comparison showed that processing time improved by ONLY around 11~12%.

Question & Help: I was expecting performance to improve by around 35~45%, since the wait states were supposed to be cut in half when moving from flash to RAM. Is this correct?

How can I improve the processing time of RAM execution vs. flash execution? For the time being, modifying the code is not up for discussion.

Thanks

over 12 years ago

0 Gautam Iyer over 12 years ago

Guru 192875 points

Hi,

In my opinion, the most efficient approach is to run only the time-critical code from RAM and leave the rest in flash. Second, halving the wait states by running from RAM does not improve performance substantially. The 11 to 12 percent you measured is about what I would expect, so don't expect much more improvement from RAM execution.

Regards,

Gautam

0 Devin Cottier over 12 years ago in reply to Gautam Iyer

TI__Guru 60865 points

Sgio,

There is a small flash pre-fetch buffer (called the flash pipeline in TI's documentation). When executing linear code, this can generally keep the CPU fed with instructions. Branches can cause the buffer to flush, incurring a memory stall proportional to the flash wait states. Effective performance is roughly inversely proportional to the number of branches in a given chunk of code.
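To make the branch argument concrete, here is a toy first-order model in C. It is only a sketch under stated assumptions, not a cycle-accurate description of the device: it assumes one cycle per instruction while the pre-fetch buffer is full, and a refill stall equal to the number of wait states for every taken branch. The function names `estimate_cycles` and `percent_saved` are made up for this illustration.

```c
/* Toy first-order model of instruction-fetch cost; NOT cycle-accurate.
 * Assumption: one cycle per instruction while the pre-fetch buffer is full,
 * plus a refill stall of `wait_states` cycles for each taken branch. */
unsigned long estimate_cycles(unsigned long instructions,
                              unsigned long taken_branches,
                              unsigned long wait_states)
{
    return instructions + taken_branches * wait_states;
}

/* Percentage of execution time saved when going from slow_cycles to
 * fast_cycles. Returns 0 when slow_cycles is 0 to avoid dividing by zero. */
double percent_saved(unsigned long slow_cycles, unsigned long fast_cycles)
{
    if (slow_cycles == 0) {
        return 0.0;
    }
    return 100.0 * ((double)slow_cycles - (double)fast_cycles)
                 / (double)slow_cycles;
}
```

With purely hypothetical inputs of 1000 instructions, 60 taken branches, 2 wait states for flash and 0 for RAM, the model gives 1120 versus 1000 cycles, a saving of about 10.7%. These inputs are illustrative, not measurements, but they show why mostly linear code gains far less from RAM than the raw wait-state ratio suggests.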

Because of this, you can actually get very good performance when executing from flash. Echoing Gautam though, I would still recommend moving critical inner loops to SRAM; it is probably worth the time.
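A minimal sketch of moving one inner loop to SRAM might look like the following. It uses the TI compiler's `CODE_SECTION` pragma and a startup `memcpy` from the flash load address to the RAM run address. The section name `ramfuncs` and the symbols `RamfuncsLoadStart`, `RamfuncsRunStart`, and `RamfuncsLoadSize` follow the convention seen in TI's example linker command files, but they are assumptions here and must match whatever your own .cmd file defines. `dot_product` and `copy_ram_functions` are hypothetical names standing in for your own code.

```c
#include <stddef.h>
#include <string.h>

#ifdef __TMS320C28XX__
/* TI C28x compiler only: place the hot function in a section named
 * "ramfuncs" (assumed name; it must match the linker command file). */
#pragma CODE_SECTION(dot_product, "ramfuncs")

/* Assumed linker-generated symbols: where the section is stored in flash,
 * where it runs in RAM, and how large it is. */
extern unsigned int RamfuncsLoadStart;
extern unsigned int RamfuncsRunStart;
extern unsigned int RamfuncsLoadSize;
#endif

/* Example time-critical inner loop: multiply-accumulate over float arrays. */
float dot_product(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    size_t i;
    for (i = 0; i < n; i++) {
        acc += a[i] * b[i];
    }
    return acc;
}

/* Call once at startup, before the first call to dot_product(). */
void copy_ram_functions(void)
{
#ifdef __TMS320C28XX__
    /* The address of the size symbol encodes the section length. */
    memcpy(&RamfuncsRunStart, &RamfuncsLoadStart, (size_t)&RamfuncsLoadSize);
#endif
    /* On other compilers this is a no-op, so the sketch still builds. */
}
```

The rest of the application stays in flash; only the function tagged with the pragma is copied and run from RAM, so the RAM cost is limited to the size of that one section.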
