Hardware Prefetch on c6678 DSP

Hello,

I am trying to turn off hardware prefetching on the C6678 DSP. The main reason is that the memory access pattern for one array is dynamically irregular, i.e. effectively random. I suspect hardware prefetching does not help in this case and simply saturates the memory bandwidth. The code I am writing to turn the hardware prefetcher off is:

// Where it is defined

#include <stdint.h>

#define PREFETCH (1 << 3)  // PFX (prefetchable) bit in each MAR register

void disable_hardware_prefetch(uint8_t first, uint8_t last) {
    // MAR0 base address; one MAR register per 16 MB region
    volatile unsigned int *MAR = (volatile unsigned int *)0x1848000;
    int i;

    // Disable hardware prefetch for MAR[first..last]
    for (i = first; i <= last; i++) {
        MAR[i] &= ~PREFETCH;
    }
}

I call this routine before accessing the array origx, which is the one I would like to turn hardware prefetching off for:

disable_hardware_prefetch((unsigned)&origx >> 24, ((unsigned)(&origx + MAX_n) - 1) >> 24);
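The index arithmetic in that call can be checked in isolation. Each MAR register covers one 16 MB region, so the index is simply address bits 31:24. Below is a minimal sketch, assuming a 32-bit address space; the helper name mar_range_for and the type mar_range_t are hypothetical and not part of any TI library. One thing worth checking with it: if origx is declared as an array, `&origx + MAX_n` advances by MAX_n whole arrays rather than MAX_n elements, so the computed last index may overshoot. Passing a base address and a size in bytes avoids that ambiguity.

```c
#include <assert.h>
#include <stdint.h>

/* Each MAR register governs one 16 MB region: index = address bits 31:24. */
#define MAR_REGION_SHIFT 24u

/* Hypothetical result type: inclusive range of MAR indices. */
typedef struct {
    uint8_t first;
    uint8_t last;
} mar_range_t;

/* Hypothetical helper: MAR indices covering [base, base + size_bytes).
 * Assumes a 32-bit address space and a non-empty buffer. */
static mar_range_t mar_range_for(uint32_t base, uint32_t size_bytes)
{
    mar_range_t r;
    uint32_t last_byte;

    assert(size_bytes > 0u);
    last_byte = base + size_bytes - 1u;  /* address of the final byte */

    r.first = (uint8_t)(base >> MAR_REGION_SHIFT);
    r.last  = (uint8_t)(last_byte >> MAR_REGION_SHIFT);
    return r;
}
```

Under these assumptions, the call would become something like mar_range_for((uint32_t)origx, MAX_n * sizeof origx[0]), with the resulting first and last values passed to disable_hardware_prefetch.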


The code compiles fine. Unfortunately, I did not see any performance improvement, which is not what I expected.

Am I missing something?

Thanks

Cheng

  • Cheng,

    Do you see significantly lower performance than expected when prefetch is enabled?  Also, does this memory access pattern occur across all the cores, or on just one or two cores?

    While an extremely random access pattern across all cores can reach an access saturation point, this becomes much less of an issue as the number of cores issuing random prefetches is reduced.

    Also, even across all cores, the accesses may not be random enough for you to start observing saturation. In that case you may see no improvement, and possibly a degradation in performance, from turning off prefetching. (i.e. if this is a single core doing random accesses with a random data pattern, turning off prefetching would likely result in worse performance.)