Hello. I am experiencing occasional CPU lock-ups in the AM5K2E04, and I'm trying to work out why.
I've been reading the ARM errata... developer.arm.com/.../
I think that the most likely candidate is 814169: "A series of store or PLDW instructions hitting the L2 cache in shared state in an ACE system might cause a deadlock".
I have a few questions...
1) Is that plausible? Is this SoC affected by this erratum? Alternatively, are there any other known causes of CPU lock-ups?
2) If this erratum is relevant, is there any information about how the other elements of the SoC are connected to the ACE system? Are there caching masters in the system apart from the A15 CorePac? Which peripherals can put lines of the L2 cache into the required "shared state" and which can emit the relevant snoops?
3) Also, is there any mitigation for this in the chip's design or the Linux drivers? And if so, what is it?
4) Is it possible to disable the ACE interconnect? (If it fixes these hangs, I'd be prepared to do that even if it meant coherency had to be handled in software).
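In case it helps with question 1: ARM lists errata per core revision (rNpM), so one check is to decode the core's MIDR and compare the result against the revisions listed for 814169. Below is a minimal sketch of that decoding, assuming the standard ARMv7 MIDR field layout. The function names are made up for illustration, and the example value 0x412FC0F4 is illustrative only, not read from this device.

```python
def decode_midr(midr: int) -> dict:
    """Split a 32-bit ARMv7 MIDR value into its fields."""
    return {
        "implementer": (midr >> 24) & 0xFF,   # 0x41 means ARM Ltd
        "variant": (midr >> 20) & 0xF,        # major revision, the N in rNpM
        "architecture": (midr >> 16) & 0xF,
        "part": (midr >> 4) & 0xFFF,          # 0xC0F means Cortex-A15
        "revision": midr & 0xF,               # minor revision, the M in rNpM
    }


def describe_core(midr: int) -> str:
    """Return a short label such as 'Cortex-A15 r2p4' (hypothetical helper)."""
    f = decode_midr(midr)
    is_a15 = f["implementer"] == 0x41 and f["part"] == 0xC0F
    name = "Cortex-A15" if is_a15 else "unknown core"
    return f"{name} r{f['variant']}p{f['revision']}"


if __name__ == "__main__":
    # Illustrative MIDR value only; substitute the value read on the target.
    print(describe_core(0x412FC0F4))
```

On Linux the same fields appear in /proc/cpuinfo as "CPU implementer", "CPU variant", "CPU part" and "CPU revision", so the value does not have to be read from CP15 directly.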
Any help with any of the above questions would be really useful. Many thanks!