This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

  • TI Thinks Resolved

Linux/TMS320DM8168: Stuck at Starting Kernel after Multiple Reset Cycles

Prodigy 80 points

Replies: 2

Views: 104

Part Number: TMS320DM8168

Tool/software: Linux

Hello,

The Problem:

After multiple hardware resets are applied, the board comes up after the last reset, prints up to "Starting kernel ...", and nothing more.

Further resets do not bring the system out of this condition; it stays "stuck" after the "Starting kernel ..." message.

The resets are hardware pulses on the Netra's reset input pin, applied externally by another processor board that can be programmed to issue them sequentially.

A power cycle brings the system back to an operating condition.

The Platform:

It is a design based on TI’s DM8168 DaVinci "Netra" SoC.

The software is based on the DVRRDK-04.01.00.02, adapted to the board’s design and application requirements.

Linux kernel version 2.6.37, arago project: http://arago-project.org/git/projects/?p=linux-dvr-rdk-dm81xx.git;a=commit;h=607df36e37bae28aeba426f65782eb219dd3651e

Findings:

  • Stuck point:

The CPU wasn’t really stuck right after "Starting kernel ..."; it was stuck much later. There were simply no printouts, so there was no way to know where it was.

Investigation showed that the CPU got all the way to the endless loop in "cpu_idle()", and that the loop executed 101 times before stopping.

  • Printouts:

Printouts are not output immediately upon startup. They are buffered and output through the serial port much later, when execution reaches "cpu_idle()".

The Test:

Reset pulses are applied to the board sequentially at a programmed rate of one pulse per ~22 seconds. This way each reset is applied at approximately the same point in Linux’s startup, just before the call to the "cpu_idle()" function.

With 20 reset cycles after power-up, the system reliably ends up "stuck". As the number of reset cycles decreases, the system fails less often and starts up correctly more often.

During a test, every reset (including the last one, when the failure occurs) produces printouts only up to "Starting kernel ...".

When there is no failure, there are normal printouts after the last reset.
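
The test sequence above could be driven from the external processor board by something like the following. This is a minimal sketch, not the code actually used: run_reset_sequence, the two hook types, the sim_ stubs, and the 10 ms pulse width in the usage note are all assumptions made for illustration. Only the cycle count (20) and the ~22 second period come from the description above.

```c
#include <assert.h>

/* Hypothetical hardware hooks. On a real sequencer board these would
 * drive the GPIO wired to the DM8168 reset input and wait on a timer. */
typedef void (*set_reset_fn)(int asserted);
typedef void (*delay_ms_fn)(unsigned int ms);

struct reset_plan {
    unsigned int cycles;    /* number of reset pulses, e.g. 20 */
    unsigned int period_ms; /* time from one pulse to the next, e.g. 22000 */
    unsigned int pulse_ms;  /* pulse width (assumed, not given in the post) */
};

/* Issue plan->cycles reset pulses, one per period. Returns the number
 * of pulses issued. */
unsigned int run_reset_sequence(const struct reset_plan *plan,
                                set_reset_fn set_reset,
                                delay_ms_fn delay_ms)
{
    unsigned int i;

    /* The pulse must fit inside the period, or the subtraction wraps. */
    assert(plan->pulse_ms < plan->period_ms);

    for (i = 0; i < plan->cycles; i++) {
        set_reset(1);                                /* assert reset   */
        delay_ms(plan->pulse_ms);
        set_reset(0);                                /* release reset  */
        delay_ms(plan->period_ms - plan->pulse_ms);  /* let Linux boot */
    }
    return i;
}

/* Simulation stubs so the sequence can be checked on a host PC. */
unsigned int sim_pulses;      /* rising edges seen on the reset line */
unsigned long sim_elapsed_ms; /* simulated elapsed time */
int sim_level;                /* current level of the reset line */

void sim_set_reset(int asserted)
{
    if (asserted && !sim_level)
        sim_pulses++;
    sim_level = asserted;
}

void sim_delay_ms(unsigned int ms)
{
    sim_elapsed_ms += ms;
}
```

For the failing case described above, the plan would be { 20, 22000, 10 }, with the sim_ stubs replaced by the board's own GPIO and timer routines.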

The Patch:

Inserting a delay into the cpu_idle() endless loop (a counting sub-loop as the first step of the main loop) reduced the failure rate: no failures were seen at 200 reset cycles.

The delay count was set to 2500000 for the first 512 iterations of the loop, and the delay was then disabled for the rest of Linux’s run time.

Setting the delay count to 0, 1, or 100000 brought the failures back (at 20 reset cycles).

Files (from linux-dvr-rdk-dm81xx.git):

  • main.c:

This file contains the function "rest_init()", which calls "cpu_idle()" when startup completes.

The test reset pulses stop the startup in this function, just before the call to cpu_idle().

location: /init/main.c:

void rest_init(void) {
        ….
        preempt_enable_no_resched();
        //This is the last point arrived at, when applying a test reset.
        schedule();
        //This is where we do not arrive at, when applying a test reset.
        preempt_disable();
        /* Call into cpu_idle with preempt disabled */
        cpu_idle();
}

  • process.c:

This file contains the cpu_idle() function for the ARM architecture.

This function was changed to have a delay inserted into its endless loop.

location: /arch/arm/kernel/process.c

void cpu_idle(void) {
        int pcnt=0;

        local_fiq_enable();

        /* endless idle loop with no priority at all */
        while (1) {
                //Added delay’s start.
                if(pcnt++ < 512) {
                        volatile int n = 2500000;
                        while(n-- > 0) {
                        }
                }
                //Added delay’s end.
                tick_nohz_stop_sched_tick(1);
                …
        }
}
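
The gating of the delay in the patch above can be isolated into a small helper and checked on a host PC. This is a sketch under assumptions, not part of the original patch: boot_delay_count, boot_delay, and the two macro names are hypothetical. It differs from the patch in one respect: the pass counter stops at 512 instead of incrementing forever, so it cannot eventually overflow a signed int and re-enable the delay.

```c
#define BOOT_DELAY_LOOPS 512      /* idle-loop passes that get the delay */
#define BOOT_DELAY_COUNT 2500000  /* busy-wait count per delayed pass */

/* Return the busy-wait count for the current pass and advance the pass
 * counter. The counter saturates at BOOT_DELAY_LOOPS. */
int boot_delay_count(int *pcnt)
{
    if (*pcnt < BOOT_DELAY_LOOPS) {
        (*pcnt)++;
        return BOOT_DELAY_COUNT;
    }
    return 0;
}

/* Busy-wait for the current pass. The volatile counter keeps the
 * compiler from optimizing the empty loop away, as in the patch. */
void boot_delay(int *pcnt)
{
    volatile int n = boot_delay_count(pcnt);

    while (n-- > 0) {
    }
}
```

Inside cpu_idle(), the added lines would then reduce to a single boot_delay(&pcnt) call at the top of the while (1) loop.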

  • Hi Eyal,

    Can you reproduce this issue on the DM816x TI EVM TMDXEVM8168 ?

    Please apply all the patches from the git tree below:

    arago-project.org/.../

    Also check whether the e2e threads below are of any help:

    e2e.ti.com/.../760349
    e2e.ti.com/.../564201

    Regards,
    Pavel



  • In reply to Pavel Botev:

    Hello,

    1) I’ve tried enabling CONFIG_DEBUG_LL, to no avail. No additional printouts appeared at all.

    I think something in early_printk is not working, because even on a good system startup none of those printouts appeared.

    2) I’ll try to update the kernel to the latest in arago.

    3) On the EVM or the Udworks DVRbox, the hardware reset is applied by a button. I need to find or prepare a way to apply this reset sequentially with a programmed period and count. I do not have that capability at the moment.
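
    For reference on point 1: in mainline ARM kernels of this generation, CONFIG_DEBUG_LL by itself normally adds no console output. Early messages usually also need CONFIG_EARLY_PRINTK and the earlyprintk kernel parameter. A sketch of the settings to check, assuming this DVR RDK 2.6.37 tree follows mainline here (not verified against it):

    ```
    # Kernel .config
    CONFIG_DEBUG_KERNEL=y
    CONFIG_DEBUG_LL=y
    CONFIG_EARLY_PRINTK=y

    # U-Boot prompt: append earlyprintk to the existing kernel arguments
    setenv bootargs ${bootargs} earlyprintk
    ```

    Whether the low-level debug UART selected for this platform matches the board's console port is a separate assumption that would also need checking.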
