F28035 RAM code execution performance vs flash

Initial condition: The code was executed directly from flash. It includes SPI reads, RAM stores and reads, and additions and multiplications on float32 variables, which are stored in vectors and arrays. Variables are always handled in RAM, and results are stored in RAM too.

Change: Using CCS, the flash was erased, and the code was loaded into RAM and executed from there.

The comparison showed that processing time improved by only around 11~12%.

Question & Help: I was expecting performance to improve by around 35~45%, since the wait states were supposed to be cut in half when moving from flash to RAM. Is this correct?

How can I improve the processing time of RAM execution vs. flash execution? For the time being, modifying the code is not up for discussion.

Thanks

  • Hi,

    In my opinion, the most efficient approach is to run only the time-critical code from RAM and leave the rest in flash. Also, halving the wait states by running from RAM does not improve performance substantially. The 11 to 12 percent you measured is about what you should expect, so don't count on much more improvement from RAM execution.

    Regards, 

    Gautam

  • Sgio,

    There is a small flash pre-fetch buffer (called the flash pipeline in the documentation). When executing linear code, this can generally keep the CPU fed with instructions. Branches can cause the buffer to flush, incurring a memory stall proportional to the number of flash wait states. The performance lost by running from flash therefore grows roughly with the number of branches in a given chunk of code.

    Because of this, you can actually get very good performance when executing from flash. Echoing Gautam, though, I would recommend that you move critical inner loops to SRAM; it is probably worth the time.
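
Both replies suggest keeping most of the code in flash and running only the time-critical inner loops from RAM. A minimal sketch of how that is commonly done with the TI C28x compiler follows. The function `dot_product`, the helper `copy_ram_functions`, the section name `ramfuncs`, and the linker symbols `RamfuncsLoadStart`, `RamfuncsLoadSize`, and `RamfuncsRunStart` are assumptions for illustration; the section and symbol names must match whatever your linker command file actually defines. Off-target, the guarded parts drop out and the code builds as ordinary C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef __TMS320C28XX__
/* Place the function in a section that is loaded in flash but run from RAM.
   "ramfuncs" is an assumed name; it must match the linker command file. */
#pragma CODE_SECTION(dot_product, "ramfuncs")
/* Assumed linker-generated symbols; names differ between header-file versions. */
extern unsigned int RamfuncsLoadStart, RamfuncsLoadSize, RamfuncsRunStart;
#endif

/* Hypothetical time-critical inner loop: float multiply-accumulate. */
float dot_product(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++) {
        acc += a[i] * b[i];
    }
    return acc;
}

/* Copy the RAM-resident code from its flash load address to its RAM run
   address. Call once at startup, before the first call to dot_product(). */
void copy_ram_functions(void)
{
#ifdef __TMS320C28XX__
    memcpy(&RamfuncsRunStart, &RamfuncsLoadStart, (size_t)&RamfuncsLoadSize);
#endif
}
```

In use, `copy_ram_functions()` would be called once early in `main()`, after which `dot_product()` is called normally. Only the functions tagged this way occupy RAM, so straight-line code that the flash pipeline already feeds well can stay in flash.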