I read in the 'Multicore programming guide' application report that there is no cache coherency between the L1/L2 of one core and the L1/L2 of another core. So if Core 0 reads from or writes to the L1D SRAM of Core 1, cache coherency must be maintained in software, am I right? It seems ridiculous that the cache is not kept coherent when reading or writing L1D SRAM.
To avoid the cache coherency problem, suppose I turn off caching for the address range that maps the other core's L1/L2 SRAM. In that case, I'd like to know the performance of accessing the other core's L1D/L2 SRAM, i.e. the number of cycles needed to read or write one byte from or to the other core's L1D/L2. Would it be about the same speed as the core reading external memory such as SDRAM?
Thanks for any reply.