Hello,
The benchmarked example in CE Multi-core Overhead Analysis suggests that cache maintenance is the significant overhead in a multi-core architecture. However, Step 4 takes about 21000 microseconds (~95.0%), and this step covers activating, processing, and deactivating the codec, which, as far as I understand, should be a fast process. The other steps, including invalidating and writing back buffers, account for only about 2% of the overhead, although I would have assumed these are the slowest operations. Could anybody clarify this point?
Regards,
gaston