CPU idle and PM firmware on AM3352 / random slowdown

Jan Altenberg


Hi all,

we're currently experiencing a strange issue on an AM3352-based platform.
We're using a vanilla Linux kernel with the realtime extension: Linux-3.12.11-rt17

Additionally we're running the PM firmware from:
git://arago-project.org/git/projects/am33x-cm3.git
based on commit:
32cf44e25b5828b87af6dceebc3a49fed5d858ac

When activating CPU idle support we can see a random slowdown
of the system by a factor of 20-30 (we did a couple of measurements
using the tracing infrastructure).
Usually handling a timer interrupt takes about 25us. When
the problem triggers, this takes up to 650us and there's no idle time
left (usually we have 85% idle time)!

There's no runaway task or anything like that, the system just slows
down! This definitely only happens with CPU idle and the PM firmware.
(I can provide some traces, if needed.)

Any ideas what might go wrong here? Did someone else encounter
the same problem before? Are there any known issues with the PM
firmware?

Cheers,
Jan

over 10 years ago

0 Biser Gatchev-XID over 10 years ago

TI__Guru**** 393215 points

Hi Jan,

Have you seen the same issue with the TI distributed EZSDK 7.0? We support only official TI releases on this forum.

0 Gangadhar Gangu over 10 years ago in reply to Biser Gatchev-XID

Expert 1865 points

Hi guys,

We are facing the same problem, using the TI Sitara SDK 7.0 (Linux 3.12) for AM33xx. Has anyone found a solution for this? Please let me know. This happens when CONFIG_CPU_IDLE=y and CONFIG_NO_HZ=y are set. For now we have found a temporary workaround by disabling CONFIG_CPU_IDLE.

Biser, please get a linux technical person from TI to answer this question.

Regards,

Gangadhar

0 Biser Gatchev-XID over 10 years ago in reply to Gangadhar Gangu

TI__Guru**** 393215 points

See this wiki: http://processors.wiki.ti.com/index.php/Linux_Core_Power_Management_User%27s_Guide

Frequency does get scaled down when the MPU is idle, so nothing abnormal there.

0 Jan Altenberg over 10 years ago in reply to Biser Gatchev-XID

Prodigy 20 points

I know that the frequency is scaled down when the CPU goes idle. The problem is that the frequency is NOT scaled up again when the system comes back from idle. That's the problem here!

0 Biser Gatchev-XID over 10 years ago in reply to Jan Altenberg

TI__Guru**** 393215 points

Jan,

Is this Linux-3.12.11-rt17 or the TI distributed EZSDK 7.0 (kernel 3.12.10)? We support only official TI releases on this forum.

0 Steve Kipisz over 10 years ago in reply to Biser Gatchev-XID

TI__Genius 15510 points

For the Linux-3.12.11-rt17, can you post where you got the kernel from, any patches, and where the rt patch came from?

Steve K.

0 Steve Kipisz over 10 years ago in reply to Gangadhar Gangu

TI__Genius 15510 points

Gangadhar - can you tell me what programs are running when this happens? Also, can you do a

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

And

cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq

before and after the processor seems to slow down.
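To capture both values in one step, a small helper along these lines might be handy. This is only a sketch: the function name `cpufreq_snapshot` and the optional directory argument are made up for illustration, and reading cpuinfo_cur_freq normally requires root. The frequency is reported in kHz.

```shell
# Print one snapshot line: "<governor> <current frequency in kHz>".
# $1 is the cpufreq directory (hypothetical parameter); it defaults to
# the cpu0 sysfs node used in the commands above.
cpufreq_snapshot() {
    dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    gov=$(cat "$dir/scaling_governor")
    cur=$(cat "$dir/cpuinfo_cur_freq")
    echo "$gov $cur"
}

# Example use on the target (not run here): log one line per second
#   while true; do echo "$(date +%T) $(cpufreq_snapshot)"; sleep 1; done
```

Leaving the loop running in a second shell makes it easy to compare the lines from before and after the slowdown.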

Steve K.

0 Gangadhar Gangu over 10 years ago in reply to Steve Kipisz

Expert 1865 points

Hi Steve,

1. We are running a GStreamer-based application. The application does not seem to matter, as we've noticed this with other applications as well.

2. Before and after the processor slows down:

root@am335x-evm:~# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
ondemand
root@am335x-evm:~# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
300000

3. I also want a patch that solves the problem below.

With dpll mpu = 300MHz set in U-Boot, I get the following kernel message:
cpufreq_cpu0: failed to scale voltage up: -22

With dpll mpu = 800MHz:
cpufreq_cpu0: failed to scale voltage down: -22

Changing the frequency of the CPU clock requires changing the frequency of the PLL that
supplies the CPU clock. To change the frequency of the PLL, the CPU
clock is temporarily reparented to another parent clock.
The clock frequency of this temporary parent clock could be much
higher than the clock frequency of the PLL at the time of
reparenting. Due to the temporary increase in the CPU clock speed,
the CPU (and any other components in the CPU clock domain such as
dividers, mux, etc.) has to be operated at a higher voltage
level, called the safe voltage level.

I want a patch that adds optional support for temporarily switching to a safe voltage level during CPU
frequency transitions.
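
The ordering described above could be sketched roughly as follows. This is not taken from any kernel patch: the function `plan_transition`, its three arguments (voltages in microvolts) and the printed step names are all hypothetical. The only idea it illustrates is that the voltage during the switch must cover the old frequency, the new frequency and the temporary parent clock.

```shell
# Print the steps for one frequency change.
# $1 = voltage for the old frequency, $2 = voltage for the new frequency,
# $3 = safe voltage for the temporary parent clock (all in uV, hypothetical).
plan_transition() {
    old_uv=$1
    new_uv=$2
    safe_uv=$3

    # Highest voltage needed at any point during the switch.
    peak_uv=$old_uv
    if [ "$new_uv" -gt "$peak_uv" ]; then peak_uv=$new_uv; fi
    if [ "$safe_uv" -gt "$peak_uv" ]; then peak_uv=$safe_uv; fi

    # Raise the voltage first, if the current level is not enough.
    if [ "$peak_uv" -ne "$old_uv" ]; then
        echo "set voltage $peak_uv"
    fi

    # Reparent to the temporary clock, reprogram the PLL, switch back.
    echo "switch clock"

    # Settle on the voltage for the new frequency afterwards.
    if [ "$new_uv" -ne "$peak_uv" ]; then
        echo "set voltage $new_uv"
    fi
    return 0
}
```

For example, `plan_transition 950000 1100000 1200000` raises the voltage to the safe level, switches the clock, and then drops to the target voltage.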


Regards,
Gangadhar

0 Frank Walzer over 10 years ago in reply to Gangadhar Gangu

TI__Mastermind 44106 points

Hi Gangadhar,

it seems you 'hijacked' the thread :-). Please open a new forum entry for your issue.

regards,

0 Gangadhar Gangu over 10 years ago in reply to Frank Walzer

Expert 1865 points

Hi Frank,

Started a new thread - http://e2e.ti.com/support/arm/sitara_arm/f/791/t/374960.aspx. Please get someone to look into this. I want this to be solved ASAP.

Regards,

Gangadhar

0 Bob Koen over 10 years ago in reply to Steve Kipisz

Intellectual 445 points

I have seen this problem also and wonder if TI has found a solution. We are running Arago and kernel 3.12.10. This is the TI supported version of the Linux distribution.

When the CPU goes into the slowdown mode, cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq still reports 600000. But the CM_CLKMODE_DPLL_MPU register reads 0x5, which means the DPLL has been put into bypass mode; the value should be 0x7 (Lock mode). Setting it back to Lock mode using devmem2 fixes the problem until the next time it occurs.
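
For anyone who wants to script this check, a minimal sketch could look like the following. The function name `dpll_mode` is made up, and masking the low three bits is an assumption about where the mode field sits in the register, so please verify it against the TRM. Only the two values mentioned above (0x7 and 0x5) are decoded.

```shell
# Classify a CM_CLKMODE_DPLL_MPU value that was read with devmem2.
# Assumption: the mode field is the low three bits of the register.
dpll_mode() {
    case $(( $1 & 0x7 )) in
        7) echo lock ;;    # 0x7: Lock mode (the expected value)
        5) echo bypass ;;  # 0x5: bypass mode (seen during the slowdown)
        *) echo other ;;   # anything else is not decoded here
    esac
}

# Example: dpll_mode 0x5
```

Pass in the value printed by devmem2 for the register address given in the TRM.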

Disabling the CPU Idle function and rebuilding the kernel keeps the problem from happening.

There is a related post here...

e2e.ti.com/.../401052

0 Bob Koen over 10 years ago in reply to Bob Koen

Intellectual 445 points

I notice in arch/arm/mach-omap2/dpll3xxx.c in function omap3_noncore_dpll_set_rate() there is a comment that says "FIXME - this is all wrong". Could that possibly be related?

0 Matthijs van Duin over 10 years ago in reply to Bob Koen

Mastermind 8030 points

Bob Koen said:
I notice in arch/arm/mach-omap2/dpll3xxx.c in function omap3_noncore_dpll_set_rate() there is a comment that says "FIXME - this is all wrong". Could that possibly be related?

Although that comment no longer appears in current mainline Linux, it can't be related anyhow: dpll3xxx.c (and in general anything labeled "3xxx" rather than e.g. "33xx") only applies to omap3 (omap34xx, omap35xx, omap36xx, dm37xx, am37xx) and am35xx processors. The am335x is much newer and, together with its sibling am437x, has a rather unique PRCM (read: a mess).
