Hello,
I am currently investigating a problem with the IPC implementation (RPMessage). We use two cores of the Mcu0 cluster, and the VRING is configured with a buffer size of 512 and 12 buffers. I cannot tell whether the problem is really related to the TI IPC driver, but I need some help to understand it better. We use SDK version 9. I have also applied a fix supplied here: e2e.ti.com/.../4938712
Edit: jump to the end to see why we have 12 buffers and why this seems to be the cause.
We ran a performance test for the IPC. The test sends packets between the cores; on the receiving core, each packet is stored in a buffer and the callback returns immediately. This ensures there is no IPC call inside an IPC callback, which could potentially deadlock.
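The "store and return immediately" pattern described above can be sketched roughly as follows. This is only a minimal illustration, not our actual code: the names `rx_queue_t`, `rx_queue_push`, `rx_queue_pop` and the depth `QUEUE_DEPTH` are hypothetical, and the sketch assumes a single producer (the receive callback) and a single consumer (a task) on the same core, without memory barriers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PKT_MAX_LEN  512u  /* same value as the configured buffer size */
#define QUEUE_DEPTH  64u   /* hypothetical depth of the local queue */

typedef struct {
    uint16_t len;
    uint8_t  data[PKT_MAX_LEN];
} packet_t;

typedef struct {
    packet_t slots[QUEUE_DEPTH];
    volatile uint32_t head;     /* written only by the receive callback */
    volatile uint32_t tail;     /* written only by the consumer task */
    volatile uint32_t dropped;  /* packets rejected because the queue was full */
} rx_queue_t;

/* Intended to be called from the receive callback: copy the payload and
 * return. No IPC call is made here, so the callback cannot deadlock. */
bool rx_queue_push(rx_queue_t *q, const void *data, uint16_t len)
{
    uint32_t next = (q->head + 1u) % QUEUE_DEPTH;

    if (len > PKT_MAX_LEN || next == q->tail) {
        q->dropped++;           /* too long or queue full: count and reject */
        return false;
    }
    memcpy(q->slots[q->head].data, data, len);
    q->slots[q->head].len = len;
    q->head = next;             /* publish the slot after it is filled */
    return true;
}

/* Called from the consumer task to fetch the oldest stored packet. */
bool rx_queue_pop(rx_queue_t *q, packet_t *out)
{
    if (q->tail == q->head) {
        return false;           /* queue empty */
    }
    *out = q->slots[q->tail];
    q->tail = (q->tail + 1u) % QUEUE_DEPTH;
    return true;
}
```

Counting rejected packets in `dropped` makes it possible to rule out an overflow of the local queue as the source of missing packets.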
Each core has about 5-6 FreeRTOS tasks with different priorities that send these packets. Each task sleeps for 1-3 ms and then sends a packet. For the problematic situation in question I added a counter on each side: one core sends a counter value that increases with each call, and the other side has its own counter that also increases with each call. Both counters must match.
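The counter check on the receiving side can be sketched like this. It is a simplified illustration with hypothetical names (`seq_check_t`, `seq_check_update`); unlike a plain assert, it records the size and position of a gap and then resynchronises, so the test can keep running and several gaps can be compared.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t expected;     /* next counter value the receiver expects */
    uint32_t gap_count;    /* number of discontinuities seen so far */
    uint32_t last_gap;     /* size of the most recent discontinuity */
    uint32_t last_gap_at;  /* received value at which it was detected */
} seq_check_t;

/* Compare the received counter with the local one. Returns true if they
 * match; otherwise records the gap. Resynchronises in both cases. */
bool seq_check_update(seq_check_t *s, uint32_t received)
{
    bool ok = (received == s->expected);

    if (!ok) {
        s->gap_count++;
        s->last_gap = received - s->expected;  /* unsigned, wrap-safe */
        s->last_gap_at = received;
    }
    s->expected = received + 1u;
    return ok;
}
```

A received value that is 4 higher than expected shows up here as `last_gap == 4`.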
Instead, after about 2:40 to 3:10 min an assert always fires, because the received counter is always (!) exactly 4 higher than the counter on the receiving core. The counter value at which this happens is not reproducible, though: it is always somewhere in the range of 60000 to 80000. This makes the whole thing a bit stranger.
The time to failure shrinks when I reduce the sleep time of the tasks to the minimum of 1 ms: the failure then occurs at about 1:10 min, but the counter range stays the same.
So it seems to me that 4 packets are lost and never handled by the receiving core. I thought it might be related to locked interrupts, but we eliminated all sources of interrupt locking, so the cores should not block anything. In any case, if that were the problem, it should have happened earlier.
Is there a recommended way to debug such a situation? By the time I receive the value, it is already too late for a breakpoint, and the IPC send always returns without reporting any issue. Logging and asserts are both enabled, and the IPC driver has not reported any warning so far.
Edit:
We wrote by hand the IPC code that SysCfg and its scripts would normally generate as C files. This made it possible to select 12 buffers instead of 2^n. I changed it back to 4 buffers and the problem no longer occurs, at least so far; a long-running test is in progress. I guess the IPC implementation can only handle 2^n buffers?
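One possible explanation for the 2^n requirement, sketched under an assumption that has not been checked against the TI driver source: if the ring uses a free-running 16-bit index that is reduced modulo the buffer count (as virtio-style vrings do), the slot sequence is only continuous across the 16-bit wraparound when the buffer count divides 65536. With 12 buffers, 65536 mod 12 = 4, so the slot sequence jumps when the index wraps. The function names below are hypothetical.

```c
#include <stdint.h>

/* Slot selected by a free-running 16-bit index (assumed scheme). */
static inline uint16_t slot_of(uint16_t idx, uint16_t num_bufs)
{
    return (uint16_t)(idx % num_bufs);
}

/* Returns 1 if stepping the index from 65535 to 0 (16-bit wraparound)
 * lands on the slot that follows the previous one, 0 otherwise. */
int wrap_is_continuous(uint16_t num_bufs)
{
    uint16_t before = slot_of(UINT16_MAX, num_bufs);
    uint16_t after  = slot_of((uint16_t)(UINT16_MAX + 1u), num_bufs);

    return after == (uint16_t)((before + 1u) % num_bufs);
}
```

Under this assumption, `wrap_is_continuous(4)` returns 1 and `wrap_is_continuous(12)` returns 0: with 12 buffers, index 65535 maps to slot 3 and the next index, 0, maps to slot 0 instead of slot 4. A wrap at 65536 messages would also fall inside the observed counter range of 60000 to 80000, although that may be a coincidence.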
Best regards
Felix