Hi!
In our project, an FPGA writes to the MSMC RAM of a C6678 over PCIe. L1D and L1P are enabled and L2 is disabled. In a SWI routine on the C6678 we invalidate the L1D cache for part of the MSMC RAM. The data in MSMC RAM is aligned to 128.
We used the SYS/BIOS function:
Cache_inv((void *)(pcie_bar1_channel+(4*512*task_struct[i].pcie_num)), 64*4, Cache_Type_L1D, TRUE);
The latency of this call varies from 2000 to 10000 clocks (measured with Timestamp_get32()).
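For reference, the cycle counts above come from the difference of two 32-bit timestamp readings. Below is a minimal sketch of that subtraction, assuming the counter is free-running and wraps at 2^32. The helper name elapsed_cycles32 is hypothetical and not part of SYS/BIOS.

```c
#include <stdint.h>

/* Elapsed ticks between two readings of a free-running 32-bit counter.
 * Unsigned subtraction is modulo 2^32, so the result stays correct
 * across a single counter wrap between the two readings. */
static inline uint32_t elapsed_cycles32(uint32_t start, uint32_t end)
{
    return end - start;
}
```

On the target, start and end would be the values returned by Timestamp_get32() immediately before and after the invalidate call.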
To avoid the SYS/BIOS overhead, I wrote my own inline function:
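/* C66x CorePac L1D block-invalidate registers */
#define L1DIBAR_ptr ((unsigned int *)0x01844048u) /* base address */
#define L1DIWC_ptr ((unsigned int *)0x0184404Cu) /* word count */

inline void InvalidateL1D_wait(unsigned int a_bar, unsigned int size)
{
    *(L1DIBAR_ptr + 0) = a_bar; /* start address of the block */
    *(L1DIWC_ptr + 0) = size;   /* writing the count starts the invalidate */
    _mfence();                  /* stall until the operation has completed */
    _mfence();
}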
But the situation did not improve much: the latency of this code still varies from 800 to 9000 clocks for a 64*4 invalidate length.
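To put the 64*4 length in perspective, here is a small sketch of how many L1D lines such a block covers. It assumes the length is in bytes (as in the Cache_inv call), the 64-byte L1D line size of the C66x, and that L1DIWC takes a count of 32-bit words. The helper names and the example addresses are hypothetical.

```c
#include <stdint.h>

#define L1D_LINE_BYTES 64u /* C66x L1D cache line size in bytes */

/* Number of L1D lines overlapped by the byte range [addr, addr + bytes). */
static inline uint32_t l1d_lines_touched(uint32_t addr, uint32_t bytes)
{
    if (bytes == 0u) {
        return 0u;
    }
    uint32_t first = addr / L1D_LINE_BYTES;
    uint32_t last = (addr + bytes - 1u) / L1D_LINE_BYTES;
    return last - first + 1u;
}

/* Convert a byte length to a count of 32-bit words, rounding up.
 * Assumption: this is the unit expected by the L1DIWC register. */
static inline uint32_t bytes_to_words(uint32_t bytes)
{
    return (bytes + 3u) / 4u;
}
```

With a 128-aligned start address, l1d_lines_touched(addr, 64*4) gives 4 lines and bytes_to_words(64*4) gives 64 words, so each call invalidates only a handful of cache lines.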
Why can it take the CPU so long to invalidate the cache?
Thanks!