Stability issues with AMSDK 07.00 and am33xx_idle

Hello,

Since upgrading from SDK 06.00 to 07.00 we have been seeing increased CPU usage and occasional unresponsiveness of the board when using ALSA on an EVM-SK board running an unmodified AMSDK 07.00 load.

The following two scenarios generally trigger the issue within a minute:

1.) arecord / aplay

Running a record session and a separate play session on the board (the EVM-SK used here has been modified with an added line-in jack) eventually causes both commands to experience XRUNs, and CPU usage rises to 100%.

2.) aplay / ping

Running aplay on the EVM-SK while stressing the board with a ping flood results in XRUNs and eventually loss of audio.

A first quick ftrace of the second scenario showed the davinci-pcm DMA IRQ callback firing periodically as expected; aplay, however, was not scheduled.

In the trace the EVM-SK has previously entered sleep state C1, so the davinci-pcm DMA IRQ callback hits the idle task. The wakeup for aplay flags the need for rescheduling as expected; however, the CPU then stays busy processing a flood of hard/soft IRQs until aplay is finally scheduled.

This indicates that the problematic condition is at least favoured by idle handling: we were unable to observe the issue with either a kernel built without PM idle support or the original AMSDK 07.00 kernel with all C-states other than C0 disabled.
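For reference, restricting the board to C0 without rebuilding the kernel can be sketched via the cpuidle sysfs knobs. This is an assumption in so far as the per-state `disable` attribute requires a kernel with CONFIG_CPU_IDLE sysfs support, and the state numbering is platform-specific:

```shell
# Hedged sketch: disable every C-state deeper than state0 (C0) via sysfs.
# Assumes the per-state "disable" attribute is present; the loop degrades
# to a no-op on machines where the knobs are absent or read-only.
disabled=0
for knob in /sys/devices/system/cpu/cpu0/cpuidle/state[1-9]*/disable; do
    [ -w "$knob" ] || continue
    echo 1 > "$knob" && disabled=$((disabled + 1))
done
echo "disabled $disabled deeper C-states"
```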

Interestingly, the same issue also did not show up when running the same test on a General Purpose EVM board running an unmodified AMSDK 07.00. A quick look at /dev/cpu_dma_latency showed that on that board the effective constraint prevents any C-state deeper than C0 from being chosen. Once the EVM is allowed to choose C1/C2, the issue can be triggered on that hardware too.
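The cpu_dma_latency check can be repeated with a one-liner: the file holds a single binary little-endian s32 in microseconds. The sample-file fallback below is our own addition so the decoding can be tried off-target; on the EVM / EVM-SK, point QOS_DEV at /dev/cpu_dma_latency instead:

```shell
# Hedged sketch: decode the effective cpu_dma_latency constraint.
# QOS_DEV defaults to a generated sample file (value 100 us) so the
# decoding can be exercised on any machine.
QOS_DEV="${QOS_DEV:-/tmp/qos.sample}"
[ -e "$QOS_DEV" ] || printf '\144\000\000\000' > "$QOS_DEV"  # 0x64 = 100 us
od -An -td4 "$QOS_DEV"   # prints the constraint as a decimal s32
```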

Best Regards,

Christian

  • I tried the second scenario you mentioned but could not reproduce the problem. Is there anything else running alongside aplay and the flood pings? Are you using a particular packet size for the flood pings? Is the network isolated? I was using approximately 64k packet sizes and a statically configured network with no other traffic, and a wav file with aplay. During the test runs I always seemed to have about 50% of the processor left. I tried this on both the 6.0 and 7.0 SDKs.

  • For reproducing this scenario I used an EVM-SK board with an SD card freshly created by the official 7.0 SDK. Additionally, the chrt and powertop commands were taken from an Arago build and added to the card. The Starter Kit is connected to a 5 V power supply, the micro-USB for the console, and a test network on eth0. This is not a network isolated to just the device under test and a development machine, but a segment shared by more devices.

    On the otherwise idle board we play a raw file via

    while true; do chrt -rr 90 aplay -v --mmap --device=hw:0,0 --format=S16_LE --channels=2 --rate=8000 --buffer-size=256 --period-size=16 /dev/shm/playback.raw; done

    and use the following line for the flood ping:

    sudo ping <evm_sk> -s 8000 -f

    The board stays responsive under this load if we use, e.g.,

    exec 3>/dev/cpu_dma_latency
    echo -ne '\0000\000\000\000' >&3

    before starting the flood, and shows the issue once we call

    exec 3>&-
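For completeness, the latency lock above can be wrapped into a self-contained sketch. The LAT_DEV variable defaulting to a scratch file is our own addition so the byte layout can be checked off-target; on the board it would point at /dev/cpu_dma_latency:

```shell
# Hedged sketch of the latency lock: write a binary s32 zero (exactly
# four NUL bytes, i.e. "no additional wakeup latency allowed") and hold
# the descriptor open; the constraint is released when fd 3 is closed.
LAT_DEV="${LAT_DEV:-/tmp/cpu_dma_latency.scratch}"
exec 3>"$LAT_DEV"
printf '\000\000\000\000' >&3   # a write of exactly sizeof(s32) bytes is taken as a raw s32
# ... run aplay and the flood ping here while fd 3 stays open ...
exec 3>&-                       # closing the fd drops the constraint again
wc -c < "$LAT_DEV"              # on the scratch file this shows the 4 bytes written
```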

    Best Regards,

    Christian

  • Hello,

    In cpuidle33xx.c are you sure the correct ddr_states are being used?

  • Hello,

    When I had a short look at this in July, I noted that the states visible via sysfs matched the ddr_states for DDR2 / DDR3 according to cpuidle33xx.c and the RAM types fitted on the EVM / EVM-SK boards used for the tests.
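The comparison described above can be scripted. CPUIDLE_DIR is parameterised here as an assumption so the loop degrades gracefully on machines without the AM33xx cpuidle driver:

```shell
# Hedged sketch: list the cpuidle states the kernel registered, for
# comparison with the ddr_states tables in cpuidle33xx.c.
CPUIDLE_DIR="${CPUIDLE_DIR:-/sys/devices/system/cpu/cpu0/cpuidle}"
states=0
for d in "$CPUIDLE_DIR"/state*; do
    [ -f "$d/name" ] || continue
    printf '%s: %s, %s us\n' "${d##*/}" "$(cat "$d/name")" "$(cat "$d/latency")"
    states=$((states + 1))
done
echo "$states cpuidle states registered"
```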

    Best Regards,

    Christian