OMAP-L137 Shared Memory & DSPLINK.

On an OMAP-L137 target running Linux 2.6.37 on the ARM core and DSP/BIOS 5.41.3.17 on the DSP core,
an acquisition system is running. It uses issue/reclaim transfers over DSPLINK version 1.65.00.010, which is based on shared RAM.
The DSP application runs a thread that performs SIO_issue/SIO_reclaim pairs on an OUTPUT channel.
The SIO operations have a timeout of 1 s.
The issued output buffer is mapped at 0xC1F30880; its size is 0x5000.

The ARM application runs a thread that performs CHNL_issue/CHNL_reclaim pairs on an INPUT channel.
The CHNL operations have a timeout of 200 ms.
The reclaimed input buffer is mapped at 0x40126880; its size is 0x5000.

The DSP output buffer is transferred to the ARM input buffer at roughly 64 Hz, i.e. one buffer about every 16 ms.
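As a quick sanity check on these figures, the sketch below works out the implied data rate and how many buffer periods fit inside a given timeout. It only uses the numbers quoted above (0x5000 bytes, 64 Hz); the helper names are hypothetical and are not part of DSPLINK or DSP/BIOS.

```c
#include <stdio.h>

/* Figures quoted in the post. */
#define BUF_SIZE_BYTES   0x5000u  /* size of each transferred buffer */
#define TRANSFER_RATE_HZ 64u      /* approximate buffer rate */

/* Hypothetical helper: sustained payload rate in bytes per second. */
unsigned long bytes_per_second(void)
{
    return (unsigned long)BUF_SIZE_BYTES * TRANSFER_RATE_HZ;
}

/* Hypothetical helper: time between two buffers, in milliseconds. */
double period_ms(void)
{
    return 1000.0 / (double)TRANSFER_RATE_HZ;
}

/* Hypothetical helper: number of buffer periods covered by a timeout. */
double periods_in_timeout(double timeout_ms)
{
    return timeout_ms / period_ms();
}

/* Print the derived figures for the two timeouts mentioned in the post. */
void print_budget(void)
{
    printf("rate: %lu bytes/s\n", bytes_per_second());
    printf("period: %.3f ms\n", period_ms());
    printf("ARM timeout (200 ms) spans %.1f periods\n", periods_in_timeout(200.0));
    printf("DSP timeout (1000 ms) spans %.1f periods\n", periods_in_timeout(1000.0));
}
```

So both timeouts are many buffer periods long: under normal operation neither side should ever get close to timing out.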

When the input buffer is available, a second thread sends it to a client application running on a PC through TCP/IP.


On most of our boards the system works nicely for days.
On some boards (with identical HW/SW configuration) the ARM core hangs after some time,
i.e. the Linux operating system itself no longer responds.
When this happens:
the DSP core keeps running safely; it only detects a timeout on the SIO_reclaim call.
After the timeout, the DSP code stops performing further SIO_issue/SIO_reclaim calls, but the channel is still operational.

In contrast, the ARM core freezes right after the CHNL_reclaim call is issued: the call itself never returns.
By "hang" I mean that the Linux kernel itself stops working.

The fact that on the "faulty" boards the hang happens if and only if a CHNL_reclaim call is issued could lead me to believe that this is a SW problem.
The fact that "working" boards NEVER hang (and a "faulty" board can work for a substantial time) leads me to believe that this is a HW problem related to how the ARM core handles the shared RAM.

1) Is it possible that some OMAP-L137 devices have a silicon bug affecting the shared RAM?

2) How can I address this problem?

N.B. I tried compiling DSPLINK on the ARM side with the debug option, but the system slows down too much.
N.B. I found no way to selectively enable the different debug levels.

I will send whatever information is needed to anyone willing to support me.

Thank you in advance for your attention.

Misha

  • Might not be related, but I had similar hanging problems with MSGQ. I worked around them by busy polling instead of passing a timeout to the DSPLINK call; I did this on the DSP side.

    Try passing a timeout of zero to CHNL_reclaim() and calling it again until it succeeds. I'm not sure which side it should be on. Even though it is the ARM side that hangs, changing the DSP side might unhang the ARM side.
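A minimal sketch of the polling idea suggested above, written so it compiles without the DSPLINK headers. Everything here is an assumption made for illustration: the `poll_status` codes, the `try_reclaim_fn` callback, and the `fake_channel` stub are hypothetical and are not DSPLINK types. In a real application the callback would wrap a single CHNL_reclaim (or SIO_reclaim) call with a zero timeout and map its return value onto these codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch; NOT DSPLINK's own values. */
typedef enum { POLL_OK = 0, POLL_NOT_READY = 1, POLL_ERROR = 2 } poll_status;

/* Hypothetical callback: one non-blocking reclaim attempt (timeout 0). */
typedef poll_status (*try_reclaim_fn)(void *ctx);

/* Optional hook run between attempts, e.g. a short sleep or yield. */
typedef void (*backoff_fn)(void *ctx);

/*
 * Retry a zero-timeout reclaim until it succeeds, fails with a real
 * error, or max_attempts is reached. The attempt cap keeps the caller
 * from spinning forever, so it can log and recover instead of blocking.
 */
poll_status reclaim_by_polling(try_reclaim_fn try_reclaim, backoff_fn backoff,
                               void *ctx, unsigned max_attempts,
                               unsigned *attempts_used)
{
    unsigned n = 0;
    poll_status st = POLL_NOT_READY;

    assert(try_reclaim != NULL);

    while (n < max_attempts) {
        st = try_reclaim(ctx);
        n++;
        if (st != POLL_NOT_READY) {
            break;              /* success or hard error: stop polling */
        }
        if (backoff != NULL) {
            backoff(ctx);       /* give other threads a chance to run */
        }
    }
    if (attempts_used != NULL) {
        *attempts_used = n;
    }
    return st;
}

/* Stub channel for demonstration: ready after a fixed number of polls. */
typedef struct { unsigned polls_until_ready; } fake_channel;

poll_status fake_try_reclaim(void *ctx)
{
    fake_channel *ch = (fake_channel *)ctx;
    if (ch->polls_until_ready == 0) {
        return POLL_OK;
    }
    ch->polls_until_ready--;
    return POLL_NOT_READY;
}
```

With the stub, a channel that becomes ready after three empty polls returns POLL_OK on the fourth attempt. On Linux the backoff hook would typically sleep for a fraction of the 16 ms buffer period rather than spin, so that the second (TCP/IP) thread is not starved.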