DM814x kernel hang on ifdown

Hi,

We've been seeing a kernel hang when the network interface is taken down with ifdown. It happens on both our custom board and the DM814x-EVM running the stock SD card demo. Nothing in user space appears to be running: there is no activity on the serial ports or front panel buttons, and no console messages. I can still see partial signs of life from the lowest-level USB serial gadget routines when the USB cable is unplugged, so the kernel is not completely dead. The hang occurs in the busy network environment of our corporate headquarters, but not in our quiet satellite office.

Enabling the kernel's soft lock-up and hung task detection yielded nothing.

I'm currently on the latest EZSDK, plus all the latest ti81xx-master patches on Arago. One of those changes sounded like it addressed a similar problem, but I already have that fix: http://arago-project.org/git/projects/?p=linux-omap3.git;a=commit;h=9917117f06754a8c056d68ff81c0e52cc0c3fb25

Looking at this forum, I see there was a similar problem discussed here: http://e2e.ti.com/support/embedded/linux/f/354/p/216491/767846.aspx

So I wanted to know: are there other davinci emac or cpsw fixes that haven't made it to ti81xx-master? Where else should I look?

Thanks,

Dan -

  • I'll answer myself: the linux-omap3 repo at Arago is missing several fixes that are present upstream at kernel.org and that prevent the "interrupt storm". Of particular interest:

    net/cpsw: don't rely only on netif_running() to check which device is active
    https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=fd51cf199421197d14099b4ba382301cc28e5544

    net/cpsw: fix irq_disable() with threaded interrupts
    https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a11fbba9a7d338c4a4e4be624af0334bbf2c9a5a

    drivers: net: cpsw: fix kernel warn on cpsw irq enable
    https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7dcf313a7a68adc9a060e4e41a55245c0f9a3d31

    drivers: net: cpsw: irq not disabled in cpsw isr in particular sequence
    https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=af5c6df704af46f2cfebea329887f3d70ccb7b3d

    These fixed my problem. Heaven only knows what other ticking time bombs in the Arago repo have already been fixed upstream.
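
For anyone wanting to check their own tree against the four commits listed above, here is a minimal sketch in Python. It is not from the original poster: the helper names are made up for illustration, and it assumes the fixes were backported with their upstream subject lines unchanged. It matches on subject rather than hash because a cherry-picked or hand-ported commit gets a new hash in the vendor tree.

```python
import subprocess

# Subject lines of the four upstream cpsw fixes named in the post above.
CPSW_FIX_SUBJECTS = [
    "net/cpsw: don't rely only on netif_running() to check which device is active",
    "net/cpsw: fix irq_disable() with threaded interrupts",
    "drivers: net: cpsw: fix kernel warn on cpsw irq enable",
    "drivers: net: cpsw: irq not disabled in cpsw isr in particular sequence",
]


def missing_fixes(local_subjects, wanted=CPSW_FIX_SUBJECTS):
    """Return the wanted subjects that do not appear in local_subjects.

    Hypothetical helper: an exact subject match is assumed, so a backport
    whose subject was reworded will be reported as missing.
    """
    have = {s.strip() for s in local_subjects}
    return [s for s in wanted if s not in have]


def local_commit_subjects(repo_dir, rev="HEAD"):
    """List the subject line of every commit reachable from rev."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "log", "--format=%s", rev],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()


if __name__ == "__main__":
    import sys

    # Path to the local kernel checkout; defaults to the current directory.
    repo = sys.argv[1] if len(sys.argv) > 1 else "."
    for subject in missing_fixes(local_commit_subjects(repo)):
        print("missing:", subject)
```

Run it with the path to the kernel checkout as its only argument. Any subject it prints is a candidate to port from the kernel.org links above. The vendor driver may differ from upstream, so expect to adapt the patches by hand rather than apply them cleanly.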

  • I'm seeing a similar problem on a custom DM8148-based board, as also described in this forum thread:

    e2e.ti.com/.../1761646

    Can you share the files you modified to solve your problem?

    Thanks a lot