In the TRM section "6.3.3.5 Executing a PROGRAM Operation", the last step says:
"Following programming of the flash memory, it is possible that there may be stale data in the processor's
cache and prefetch logic. Before reading locations which were programmed, it is recommended to first flush
the cache in the CPU subsystem."
How is this done? I tried clearing the ICACHE and PREFETCH bits in CPUSS.CTL after programming flash, but I still appear to get stale data: a byte-by-byte memory check fails the first time, but succeeds when run a second time.
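
For reference, the attempt described above can be sketched roughly as follows. This is only an illustration under assumptions: the function names are hypothetical, the register is passed in as a pointer rather than taken from the device header, and the bit positions are placeholders that should be checked against the CPUSS.CTL description in the TRM.

```c
#include <stdint.h>

/* Assumed bit positions in CPUSS.CTL. These are placeholders;
 * verify them against the device header or the TRM. */
#define CTL_PREFETCH_BIT (1u << 0)
#define CTL_ICACHE_BIT   (1u << 1)

/* Hypothetical helper: clear the PREFETCH and ICACHE enable bits
 * and return the previous register value so it can be restored. */
static uint32_t cpuss_ctl_clear_cache_bits(volatile uint32_t *ctl)
{
    uint32_t previous = *ctl;
    *ctl = previous & ~(CTL_PREFETCH_BIT | CTL_ICACHE_BIT);
    return previous;
}

/* Hypothetical helper: write back a value saved earlier. */
static void cpuss_ctl_restore(volatile uint32_t *ctl, uint32_t saved)
{
    *ctl = saved;
}
```

On the target, `ctl` would point at the CPUSS.CTL register, with the clear called after the PROGRAM operation completes and before the readback check. Whether clearing these bits actually counts as the "flush" the TRM recommends is exactly what I am unsure about.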