Problem with higher CPU clock frequencies (360MHz, 372MHz, 420MHz or 444MHz) on OMAPL138

Hi,

We have built custom boards around the OMAPL138 chipset. This chipset is rated to operate at a CPU frequency of 456MHz.

So far we have been using it with the DaVinci PSP 03.20.00.11 SDK:
- Linux version is 2.6.33-rc4
- U-Boot version is U-Boot 2009.11
- Compiler version is gcc version 4.3.3 (Sourcery G++ Lite 2009q1-203)

The boards run from a 24MHz external clock source.

We insert custom modules into the Linux kernel that access the uPP module frequently. We are observing two problems:

1. uPP underflow error - If clock Source of ASYNC3 domain is kept as PLL1.
2. Random kernel crashes - If clock Source of ASYNC3 domain is kept as PLL0.

If the CPU clock frequency (sourced from PLL0) is configured to 300MHz, everything works fine.

But when we move to a higher CPU clock (i.e. 360MHz, 372MHz, 420MHz or 444MHz), we see the uPP under-run problem on most of the boards,
but some boards work fine without the uPP under-run problem even at 444MHz.

By default, Linux moves the clock source of the ASYNC3 domain from PLL0 to PLL1 during boot-up (in SOURCE_DIR/arch/arm/mach-davinci/da850.c). As a fix for the uPP under-run problem, we tried leaving PLL0 as the clock source of the ASYNC3 domain instead of moving it to PLL1.
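
For readers following along, the source selection described above comes down to one bit in the SYSCFG CFGCHIP3 register. The sketch below only computes the new register value and does not touch hardware. The bit position (bit 4, ASYNC3_CLKSRC) and all helper names are assumptions for illustration and should be checked against the TRM and the da850.c in your kernel tree.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ASSUMPTION: ASYNC3_CLKSRC is bit 4 of CFGCHIP3
 * (0 = PLL0_SYSCLK2, 1 = PLL1_SYSCLK2). Verify against the TRM.
 */
#define CFGCHIP3_ASYNC3_CLKSRC (1u << 4)

/* Hypothetical names for the two possible ASYNC3 clock sources. */
enum async3_src { ASYNC3_FROM_PLL0 = 0, ASYNC3_FROM_PLL1 = 1 };

/* Return CFGCHIP3 with the ASYNC3 source bit updated; all other bits are preserved. */
uint32_t cfgchip3_with_async3_src(uint32_t cfgchip3, enum async3_src src)
{
    assert(src == ASYNC3_FROM_PLL0 || src == ASYNC3_FROM_PLL1);
    if (src == ASYNC3_FROM_PLL1)
        return cfgchip3 | CFGCHIP3_ASYNC3_CLKSRC;
    return cfgchip3 & ~CFGCHIP3_ASYNC3_CLKSRC;
}

/* Decode which source a given CFGCHIP3 value selects. */
enum async3_src cfgchip3_async3_src(uint32_t cfgchip3)
{
    return (cfgchip3 & CFGCHIP3_ASYNC3_CLKSRC) ? ASYNC3_FROM_PLL1
                                               : ASYNC3_FROM_PLL0;
}
```

In a real kernel the value would be read from and written back to the memory-mapped register; keeping the bit logic in a pure function makes it easy to check in isolation.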

With this change we no longer see any uPP under-run problem, but we see random kernel crashes on all the boards at clocks higher than 300MHz.

One of the crash logs is attached below. If we move the ASYNC3 source from PLL0 to PLL1, this random crash is not observed, but the uPP under-run issue is.

Unable to handle kernel NULL pointer dereference at virtual address 00000004
pgd = c0004000
[00000004] *pgd=00000000
Internal error: Oops: 17 [#1] PREEMPT
last sysfs file: /sys/devices/virtual/gpio/gpio67/value
Modules linked in: WranNetDrv wran_mac upp I2C_wrapper Timer_module DCXO mmap_app_mac mcbsp_glue EDMA_wrapper Freq_Change [last unloaded: Freq_Change]
CPU: 0    Not tainted  (2.6.33-rc4 #903)
PC is at apply_to_page_range+0x114/0x214
LR is at getnstimeofday+0x80/0xf4
pc : [<c0089194>]    lr : [<c0061034>]    psr: 20000093
sp : c046bf70  ip : 29aaaa7d  fp : 00000000
r10: 00000000  r9 : 41069265  r8 : c046bf90
r7 : c04a8ea0  r6 : 00071270  r5 : 00000000  r4 : ffffffff
r3 : 49e655e6  r2 : 00000018  r1 : 00002333  r0 : b7d96b28
Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 0005317f  Table: c6b50000  DAC: 00000017
Process swapper (pid: 0, stack limit = 0xc046a270)
Stack: (0xc046bf70 to 0xc046c000)
bf60:                                     c046bfa8 c0472b18 c0027014 c046e3e8
bf80: c04bccd8 41069265 c00255b8 c00610dc 49e655e6 1d164a00 c0472b0c c003a77c
bfa0: c0472b18 c0472b88 c046bfb8 c0472b78 c0472b88 c0269318 c046a000 c04a2b5c
bfc0: c0027014 c046e3e8 c00255ec c002feac c04aa498 c0008948 c0008488 00000000
bfe0: 00000000 c0027018 00000000 00053175 c04a2c04 c0008034 00000000 00000000
[<c0089194>] (apply_to_page_range+0x114/0x214) from [<c0472b88>] (0xc0472b88)
Code: ebfffc62 e3500000 1a000037 e5973000 (e59a0004) 
10.10.10.3: seq=---[ end trace eb36b9db90d84c75 ]---
183 ttl=64 time=Kernel panic - not syncing: Attempted to kill the idle task!
82.236 ms
[<c0033640>] (unwind_backtrace+0x0/0xdc) from [<c0347024>] (panic+0x58/0x130)
[<c0347024>] (panic+0x58/0x130) from [<c0046224>] (do_exit+0x68/0x694)
[<c0046224>] (do_exit+0x68/0x694) from [<c0032554>] (die+0x290/0x2c4)
[<c0032554>] (die+0x290/0x2c4) from [<c0034220>] (__do_kernel_fault+0x64/0x74)
[<c0034220>] (__do_kernel_fault+0x64/0x74) from [<c00343f4>] (do_page_fault+0x1c4/0x1d8)
[<c00343f4>] (do_page_fault+0x1c4/0x1d8) from [<c002e2c0>] (do_DataAbort+0x34/0x94)
[<c002e2c0>] (do_DataAbort+0x34/0x94) from [<c002ea6c>] (__dabt_svc+0x4c/0x60)
Exception stack(0xc046bf28 to 0xc046bf70)
Mac [/Sw/tvws/sathish/dhaval_fw/Wran_stack/ieeemac80222/src/bs/../L1Con/BSModFrameMarker.c BSModFrameMarkerISR:340] : allocDmaRequest returned NULL
bf20:                   b7d96b28 00002333 00000018 49e655e6 ffffffff 00000000
bf40: 00071270 c04a8ea0 c046bf90 41069265 00000000 00000000 29aaaa7d c046bf70
bf60: c0061034 c0089194 20000093 ffffffff
[<c002ea6c>] (__dabt_svc+0x4c/0x60) from [<c0089194>] (apply_to_page_range+0x114/0x214)
[<c0089194>] (apply_to_page_range+0x114/0x214) from [<c0472b88>] (0xc0472b88)

In summary, the problems observed at higher clocks (anything other than 300MHz) are:
1. uPP underflow error - If clock Source of ASYNC3 domain is kept as PLL1.
2. Random kernel crashes - If clock Source of ASYNC3 domain is kept as PLL0.

Please let us know if we have missed anything or if anyone has observed similar issues. Any pointers to help resolve these issues would be highly appreciated.

NOTE:

All frequency changes were verified with the OMAP clocking validation spreadsheet provided on the TI website, downloaded from

http://processors.wiki.ti.com/index.php/Programming_PLL_Controllers_on_OMAP-L1x8/C674x/AM18xx

  • Dear Sathish,

    In which mode are you operating the uPP, TX or RX?
    What is the DDR clock frequency (PLL1)?
    If you are operating the uPP in transmit mode, the TX clock should not exceed the uPP module clock, i.e. 150MHz.

    Please make sure that the uPP I/O speed stays within 75 MHz by changing the value of the UPICR.CLKDIV field.

    The fixed divisor in the uPP clocking architecture restricts the maximum speed to one fourth of the CPU clock, but you have to change the UPICR.CLKDIV value accordingly to keep the speed within 75 MHz.

    For the underrun issue, please refer to the OMAPL138 TRM, page 1537, section 33.2.8.4:

    This error should primarily occur when operating the uPP at high speed with significant system loading. To
    avoid this error, run the uPP at slower speeds or reduce background activity, such as non-uPP peripheral
    or DMA transactions. Additional tuning tips are given in Section 33.2.6.3.

    As per this note, this seems to be expected behavior when the speed is increased. Also try to tune the registers as mentioned in section 33.2.6.3:

    Data rate, CLKDIV, UPTCR registers...

    Do you still get the kernel panic when the uPP module driver is not inserted?


    Exception stack(0xc046bf28 to 0xc046bf70)
    Mac [/Sw/tvws/sathish/dhaval_fw/Wran_stack/ieeemac80222/src/bs/../L1Con/BSModFrameMarker.c BSModFrameMarkerISR:340] : allocDmaRequest returned NULL

    Can you check the return value of the DMA resource allocation?
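
To illustrate the kind of check meant here, this is a minimal user-space sketch of an allocator that returns NULL when its pool is exhausted, with a caller that handles the failure instead of dereferencing the result. Every name in it (alloc_dma_request, free_dma_request, submit_frame, the pool depth of 4) is hypothetical; it is not the poster's allocDmaRequest or any TI driver API.

```c
#include <assert.h>
#include <stddef.h>

#define DMA_REQ_POOL_SIZE 4 /* hypothetical pool depth */

/* Hypothetical request descriptor; a real one would carry buffer info. */
struct dma_request {
    int in_use;
};

static struct dma_request dma_req_pool[DMA_REQ_POOL_SIZE];

/* Hand out a free descriptor, or NULL when every descriptor is in use. */
struct dma_request *alloc_dma_request(void)
{
    for (size_t i = 0; i < DMA_REQ_POOL_SIZE; i++) {
        if (!dma_req_pool[i].in_use) {
            dma_req_pool[i].in_use = 1;
            return &dma_req_pool[i];
        }
    }
    return NULL;
}

/* Return a descriptor to the pool; NULL is ignored. */
void free_dma_request(struct dma_request *req)
{
    if (req != NULL)
        req->in_use = 0;
}

/*
 * Try to queue one frame. Returns 0 on success, -1 if no descriptor
 * was available; failures are counted so they can be reported later
 * rather than printed from interrupt context.
 */
int submit_frame(unsigned int *dropped_frames)
{
    assert(dropped_frames != NULL);

    struct dma_request *req = alloc_dma_request();
    if (req == NULL) {
        (*dropped_frames)++;
        return -1;
    }
    /* A real driver would program the uPP DMA with req here. */
    return 0;
}
```

Counting the failures makes it possible to tell whether allocation failures start before the crash or only appear as a consequence of it.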
  • Hi titus,

    Thank you for the quick response.

    I will answer your questions one by one.

    In which mode are you operating the uPP, TX or RX?
        The uPP is operated in both Tx and Rx mode in our custom module.

    What is the DDR clock frequency (PLL1)?
        PLL1 is configured to give 300MHz, so the DDR clock frequency (MCLK) is 150MHz.

    Please make sure that the uPP I/O speed stays within 75 MHz by changing the value of the UPICR.CLKDIV field.
        We made sure that the uPP clocking is well below 75 MHz.
        UPICR.CLKDIV = 4.
        For example, for a PLL0 frequency of 420MHz:
        The uPP transmit clock is sourced from PLL0_SYSCLK2, so the transmit clock is 420/2 = 210MHz.
        Internally the uPP module divides the transmit clock by 2, so the clock becomes 105MHz.
        We kept the divisor value at 4, so the final clock is 105/(4+1) = 21MHz.
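
The arithmetic above can be written down as a small helper, which makes it easy to repeat for the other PLL0 settings mentioned in this thread. This is only a sketch of the divider chain as described in this post (PLL0 / 2 for SYSCLK2, / 2 inside the uPP, then / (CLKDIV + 1)). The function names are made up for illustration, and the 75 MHz limit is the figure quoted earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* uPP I/O speed limit quoted earlier in this thread. */
#define UPP_IO_MAX_HZ 75000000u

/*
 * uPP I/O clock for a given PLL0 output and UPICR.CLKDIV value,
 * following the divider chain described in the post above.
 */
uint32_t upp_io_clock_hz(uint32_t pll0_hz, uint32_t clkdiv)
{
    uint32_t transmit_hz = pll0_hz / 2u;   /* PLL0_SYSCLK2 */
    uint32_t halved_hz = transmit_hz / 2u; /* fixed divide-by-2 in the uPP */
    return halved_hz / (clkdiv + 1u);
}

/*
 * Smallest CLKDIV value that keeps the I/O clock at or below limit_hz.
 * The width of the real register field is not checked here.
 */
uint32_t upp_min_clkdiv(uint32_t pll0_hz, uint32_t limit_hz)
{
    assert(limit_hz != 0u);

    uint32_t clkdiv = 0u;
    while (upp_io_clock_hz(pll0_hz, clkdiv) > limit_hz)
        clkdiv++;
    return clkdiv;
}
```

For example, upp_io_clock_hz(420000000u, 4u) reproduces the 21MHz figure above, and upp_min_clkdiv(420000000u, UPP_IO_MAX_HZ) gives the smallest divisor that would still respect the 75 MHz limit under the same assumptions.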

    Do you still get the kernel panic when the uPP module driver is not inserted?

        We tried not running any application after Linux boots up. In that case it does not crash.
        But then the system is always idle, so we cannot be sure whether the crash would happen under load.
        We will try to create an application that loads the system without involving uPP module access.

    Can you check the return value of the DMA resource allocation?

        Exception stack(0xc046bf28 to 0xc046bf70)
        Mac [/Sw/tvws/sathish/dhaval_fw/Wran_stack/ieeemac80222/src/bs/../L1Con/BSModFrameMarker.c BSModFrameMarkerISR:340] : allocDmaRequest returned NULL

        We are doing uPP Tx and Rx between the OMAPL138 and an external DSP, with a synchronisation mechanism for uPP Tx and Rx.
        The above print means that this synchronisation was lost, which is a consequence of the kernel crash.

    The tuning of the uPP registers is described in 32.2.6.3, right?

    We tried all combinations of the register tuning mentioned in the System Tuning Tips.

    Is the crash that we attached in the post related to any misconfiguration of the DDR registers?

  • Hi,

    One more thing. In our attempts to increase the CPU frequency (by changing the PLL0 multiplier value), the PLL1 frequency was kept the same, so the DDR frequency is not altered by the CPU frequency change. Will this have any impact?

  • One more thing. In our attempts to increase the CPU frequency (by changing the PLL0 multiplier value), the PLL1 frequency was kept the same, so the DDR frequency is not altered by the CPU frequency change. Will this have any impact?

    No, DDR will operate at 150MHz (max) on OMAPL138.
  • Hi,

    We have tested several combinations of CPU and DDR clock frequencies.

    What we observed is that if the CPU clock (PLL0_SYSCLK1) and the DDR clock (MCLK = PLL1_SYSCLK1/2) are in a 4:1 ratio, the system is stable and we do not observe any random kernel crashes.

    For example, if
    CPU clock (PLL0_SYSCLK1) = 456MHz
    DDR clock (MCLK = PLL1_SYSCLK1/2) = 114MHz
    then the CPU clock and DDR MCLK are in the ratio 4:1, and no kernel crash was observed.
    Likewise, all such combinations (CPU clock : DDR clock = 4:1) work fine.
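
The ratio observation can be expressed as a quick check when picking test combinations. This is a sketch only: it assumes MCLK is half of PLL1_SYSCLK1 (consistent with the 300MHz PLL1 and 150MHz MCLK figures given earlier in this thread), and the function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DDR MCLK for a given PLL1_SYSCLK1, ASSUMING MCLK = PLL1_SYSCLK1 / 2. */
uint32_t ddr_mclk_hz(uint32_t pll1_sysclk1_hz)
{
    return pll1_sysclk1_hz / 2u;
}

/* True when the CPU clock is exactly four times the DDR MCLK. */
bool cpu_ddr_ratio_is_4_to_1(uint32_t cpu_hz, uint32_t mclk_hz)
{
    assert(mclk_hz != 0u);
    /* 64-bit product so that large inputs cannot wrap around. */
    return (uint64_t)cpu_hz == 4u * (uint64_t)mclk_hz;
}

/* DDR MCLK that would give a 4:1 ratio for the chosen CPU clock. */
uint32_t mclk_for_4_to_1(uint32_t cpu_hz)
{
    return cpu_hz / 4u;
}
```

Note that this only encodes what we observed at the higher CPU clocks. The 300MHz CPU / 150MHz MCLK configuration is 2:1 and was also stable, so the check is a test aid, not an explanation of the crash.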

    What is the reason for this behaviour?

    This means we cannot run the DDR controller at its maximum supported clock once we increase the CPU clock.

    Are we missing some configuration?

  • Thanks Sathish for your observations.
    Are you using any buffers in DDR in the uPP module driver?
    Let me discuss with our internal uPP experts and update you.
  • Hi titus,

    Yes, we use ping-pong buffers for both Tx and Rx of the uPP transactions. Each ping-pong buffer is DMA-able memory (in Linux) in the DDR.

  • Hi,

    We are still not able to resolve the problem. Did you find anything that we missed? These crashes seem to be related to scheduling timing. We are still debugging further.

  • Dear Sathish,
    We are discussing this issue internally.

    2. Random kernel crashes - If clock Source of ASYNC3 domain is kept as PLL0.

    You get the kernel crash only when you are using the uPP driver (with the uPP module driver inserted), right?
  • Dear Sathish,

    I hope you are operating the board with the CVDD voltage at 1.3V, as CVDD should be 1.3V when you operate the CPU above 400MHz.


    But when we move to a higher CPU clock (i.e. 360MHz, 372MHz, 420MHz or 444MHz), we see the uPP under-run problem on most of the boards,
    but some boards work fine without the uPP under-run problem even at 444MHz.


    Where and how are you changing the frequency?

    Can you change the frequency through the CPUfreq governor?
    processors.wiki.ti.com/.../OMAPL1:_Changing_the_Operating_Point
  • Hi titus,

    We configure the clock from UBL/U-Boot (using the AisGen.exe application provided by TI). We also tried changing the frequency through the CPUfreq governor, but that gives the same result (random kernel crash).

  • Thanks.
    If you are changing the frequency in UBL or U-Boot, then you should disable the frequency configuration in the kernel.
    What about CVDD (core voltage)?
  • Hi,

    The CVDD voltage is kept at around 1.28V. The crash is seen when our module is loaded. Briefly, our module uses 3 interrupts and works as follows:

    1. A main thread submits a uPP DMA for Tx, then waits (interruptible).

    2. The uPP ISR gets called after completion of the uPP Tx.

    3. A GPIO interrupt arrives once every 10ms.

    4. From the GPIO interrupt, the main thread is woken from its interruptible wait. Then, from the GPIO ISR, a uPP DMA for Rx is submitted.

    5. The uPP ISR gets called after completion of the Rx.

    6. This cycle then continues. In every 10ms there are approx. 4 uPP Tx transactions and 1 uPP Rx transaction.

    This is the rough skeleton of our module.

    Thanks and regards,

    Sathish V

  • Hi titus,

    Thanks for the quick response. We were not aware that the frequency configuration in the kernel should be disabled if the frequency is configured in UBL. We will try that. However, even without changing the frequency in UBL/U-Boot (keeping the default 300MHz), we tried changing the frequency from the governor, and we observed the same result.


  • Thanks for the quick response. We were not aware that the frequency configuration in the kernel should be disabled if the frequency is configured in UBL. We will try that. However, even without changing the frequency in UBL/U-Boot (keeping the default 300MHz), we tried changing the frequency from the governor, and we observed the same result.

    Thanks, let me check with internal team for further debug points.

    Also, can you please try our newer Linux kernel, version 3.3?
  • Hi,
    The kernel version we are using now is Linux 2.6.33-rc4. We are trying to move to kernel 2.6.34. Is that a stable release?
  • This is our latest release for OMAPL138, which has kernel version 3.3:
    software-dl.ti.com/.../index_FDS.html