MCU-PLUS-SDK-AM243X: IPC: RPMessage gets stuck when on high load

Felix Heil

Hello,

We are continuously having problems with the IPC. When many calls are made within a short time, it gets stuck. We land here:

all the time. This is with only 2 cores. We use SDK 9. The IPC has 8 buffers of 512 bytes each.

To emulate a "high" load, we created 5 tasks which all send at a 1 ms interval, and a task on the other side which does two calls at a 1 ms interval.

Normally I would expect a task to wait until the other side has a free buffer again. I did observe such a case, and it behaved as expected. What I did not expect is this: an interrupt occurs which tries to send a message into an already full FIFO. This happens from both sides.

Ashwin once helped me to get a better understanding of the IPC here: https://e2e.ti.com/support/microcontrollers/arm-based-microcontrollers-group/arm-based-microcontrollers/f/arm-based-microcontrollers-forum/1286480/mcu-plus-sdk-am243x-does-ipcnotify-have-a-problem-when-messages-are-sent-from-both-sides-at-once/4938712?tisearch=e2e-sitesearch&keymatch=%25252520user%2525253A453845#4938712

I see the sense in RPMessage posting the semaphore after receiving a "buffer free" message, so we know there is free space and the other core can continue sending. But what happens here is at the IpcNotify level, which seems to deadlock itself since it tries to put something into the mailbox FIFO which is already full.

I ensured we are not sending any packages from interrupt context by throwing an assert if HwiP_inISR() != 0. This never fired, so all packages are sent from task context. I also applied the fix by Ashwin from the other thread. And even so: let's assume one core is blocking interrupts for a longer time. The other core will then see a full buffer or (?) a full IPC mailbox FIFO. But then it must wait until the first core finishes its interrupt, gets the IPC interrupt, processes the packages, and frees the mailbox FIFO. After that, the other core can continue putting IpcNotify messages into the mailbox. That's what I would expect.
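The context check described above could look roughly like the following. This is only a sketch: `HwiP_inISR()` is the SDK call named in this post, but it is replaced here by a host-side stand-in so the snippet builds without the SDK, and `app_ipc_send`, `app_ipc_send_allowed`, `g_fake_in_isr` and `g_sent` are hypothetical names, not part of our code or of the SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the SDK's HwiP_inISR() so the sketch builds on a host.
 * On target, drop this stub and use the real function from the SDK. */
uint32_t g_fake_in_isr = 0U;               /* hypothetical test hook */
uint32_t HwiP_inISR(void) { return g_fake_in_isr; }

/* Hypothetical counter standing in for the real RPMessage send call. */
uint32_t g_sent = 0U;

/* Returns 1 if sending is allowed from the current context, else 0. */
int app_ipc_send_allowed(void)
{
    return HwiP_inISR() == 0U;
}

/* Hypothetical application-level send wrapper. */
void app_ipc_send(const void *msg, uint16_t len)
{
    /* Trap any attempt to send from interrupt context. */
    assert(app_ipc_send_allowed());
    (void)msg;
    (void)len;
    g_sent++;   /* the real code would call the SDK send function here */
}
```

Note that such a check only covers the send path. It says nothing about other routines that may be reached from interrupt context.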

It seems there is an inconsistency between RPMessage and IpcNotify.

This is a blocking issue for us, since we are using the IPC more and more, and this already occurs with two cores.

Best regards

Felix

over 1 year ago

Felix Heil over 1 year ago


I think we found the issue.

A problem in our design was that the routine which takes the packages and stores them in an internal buffer is also available from the core's own context.

That means that previously we only considered it being accessed from task context. To ensure memory safety, we used a scheduler suspend when putting packages into the buffer, since it can be accessed by multiple tasks. And that was the problem: calling a scheduler suspend from interrupt context seems to produce this issue. I changed the section in question to HwiP_disable (and restore), and now everything seems to work fine.
Unfortunately, this was not really noticeable from the stack frame, since it looked as if the IPC mailboxes had filled up.
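For anyone hitting the same thing, the changed section looks roughly like the sketch below. It is an illustration under assumptions, not our actual code: `HwiP_disable()` and `HwiP_restore()` are the SDK calls mentioned above, but here they are host-side stand-ins so the snippet builds without the SDK, and `pkg_queue_t`, `pkg_queue_put`, `pkg_queue_get`, `g_irq_masked` and the slot count and size are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Host-side stand-ins for the SDK's HwiP_disable()/HwiP_restore().
 * On target, drop these stubs and use the real SDK functions. */
uintptr_t g_irq_masked = 0U;               /* hypothetical: 1 = masked */
uintptr_t HwiP_disable(void)
{
    uintptr_t old = g_irq_masked;
    g_irq_masked = 1U;
    return old;                            /* previous state, for restore */
}
void HwiP_restore(uintptr_t old) { g_irq_masked = old; }

#define PKG_SLOTS 4U    /* arbitrary sizes chosen for the sketch */
#define PKG_SIZE  64U

typedef struct {
    uint8_t  data[PKG_SLOTS][PKG_SIZE];
    uint16_t len[PKG_SLOTS];
    uint32_t head, tail, count;
} pkg_queue_t;

/* Store one package. Usable from task and interrupt context, because the
 * critical section masks interrupts instead of suspending the scheduler. */
int pkg_queue_put(pkg_queue_t *q, const void *pkg, uint16_t len)
{
    int status = -1;
    assert(q != NULL && pkg != NULL);
    if (len > PKG_SIZE) { return -1; }

    uintptr_t key = HwiP_disable();
    if (q->count < PKG_SLOTS) {
        memcpy(q->data[q->head], pkg, len);
        q->len[q->head] = len;
        q->head = (q->head + 1U) % PKG_SLOTS;
        q->count++;
        status = 0;
    }
    HwiP_restore(key);                     /* restores the previous state */
    return status;
}

/* Fetch the oldest package; returns -1 if the queue is empty. */
int pkg_queue_get(pkg_queue_t *q, void *out, uint16_t *len)
{
    int status = -1;
    assert(q != NULL && out != NULL && len != NULL);

    uintptr_t key = HwiP_disable();
    if (q->count > 0U) {
        *len = q->len[q->tail];
        memcpy(out, q->data[q->tail], *len);
        q->tail = (q->tail + 1U) % PKG_SLOTS;
        q->count--;
        status = 0;
    }
    HwiP_restore(key);
    return status;
}
```

Because the restore call puts back whatever state was active before, the same routine can be entered with interrupts already masked without unmasking them on exit. The trade-off is that the copy runs with interrupts disabled, so the packages should be small or the copy kept short.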
