[J6]  serial8250: too much work for irq301

Hi,

We have a custom J6 board running SDK 7.04.00.03. During our production-line test program, the J6 sometimes prints "serial8250: too much work for irq301".

After that, the J6 and its console hang. Do you have any advice on this issue?

The full log is attached.

[  818.218834] serial8250_interrupt: 2638 callbacks suppressed
[  818.224432] serial8250: too much work for irq301
[  818.230884] serial8250: too much work for irq301
[  818.237320] serial8250: too much work for irq301
[  818.243763] serial8250: too much work for irq301
[  818.250199] serial8250: too much work for irq301
[  818.256642] serial8250: too much work for irq301
[  818.263085] serial8250: too much work for irq301
[  818.269522] serial8250: too much work for irq301
[  818.275964] serial8250: too much work for irq301
[  818.282422] serial8250: too much work for irq301
[  818.378149]  sda: sda1
[  818.423281] sd 4:0:0:0: [sda] Attached SCSI disk
[  823.239410] serial8250_interrupt: 2738 callbacks suppressed
[  823.245010] serial8250: too much work for irq301
[  823.251495] serial8250: too much work for irq301
[  823.257934] serial8250: too much work for irq301
[  823.264387] serial8250: too much work for irq301
[  823.270838] serial8250: too much work for irq301
[  823.277274] serial8250: too much work for irq301
[  823.283725] serial8250: too much work for irq301
[  823.290162] serial8250: too much work for irq301
[  823.296611] serial8250: too much work for irq301
[  823.303061] serial8250: too much work for irq301
[  828.260327] serial8250_interrupt: 2746 callbacks suppressed
[  828.265921] serial8250: too much work for irq301
[  828.272378] serial8250: too much work for irq301
[  828.278815] serial8250: too much work for irq301
[  828.285266] serial8250: too much work for irq301
[  828.291716] serial8250: too much work for irq301
[  828.298152] serial8250: too much work for irq301
[  828.304602] serial8250: too much work for irq301
[  828.311050] serial8250: too much work for irq301
[  828.317486] serial8250: too much work for irq301
[  828.323937] serial8250: too much work for irq301
[  833.280234] serial8250_interrupt: 2746 callbacks suppressed
[  833.285828] serial8250: too much work for irq301
[  833.292281] serial8250: too much work for irq301
[  833.298718] serial8250: too much work for irq301
[  833.305168] serial8250: too much work for irq301
[  833.311615] serial8250: too much work for irq301
[  833.318051] serial8250: too much work for irq301
[  833.324500] serial8250: too much work for irq301
[  833.330947] serial8250: too much work for irq301
[  833.337381] serial8250: too much work for irq301
[  833.343829] serial8250: too much work for irq301
[  838.300340] serial8250_interrupt: 2746 callbacks suppressed
[  838.305937] serial8250: too much work for irq301
[  838.312389] serial8250: too much work for irq301
[  838.318825] serial8250: too much work for irq301
[  838.325266] serial8250: too much work for irq301
[  838.331709] serial8250: too much work for irq301
[  838.338145] serial8250: too much work for irq301
[  838.344585] serial8250: too much work for irq301
[  838.351027] serial8250: too much work for irq301
[  838.357463] serial8250: too much work for irq301
[  838.363905] serial8250: too much work for irq301
[  839.430172] INFO: rcu_preempt self-detected stall on CPU { 0}  (t=2100 jiffies g=3376 c=3375 q=316)
[  839.439295] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G      D    O 3.14.63 #14
[  839.446372] Backtrace: 
[  839.448690] INFO: rcu_preempt detected stalls on CPUs/tasks: { 0} (detected by 1, t=2102 jiffies, g=3376, c=3375, q=316)
[  839.448693] Task dump for CPU 0:
[  839.448703] swapper/0       R running      0     0      0 0x00000002
[  839.448708] Backtrace: 
[  839.448727] [<c09aff6c>] (init_thread_union) from [<c09aff78>] (init_thread_union+0x1f78/0x2000)
[  839.448729] Backtrace aborted due to bad frame pointer <00000001>
[  839.486740] [<c0012138>] (dump_backtrace) from [<c00122d4>] (show_stack+0x18/0x1c)
[  839.494339]  r6:200f0193 r5:ffffffff r4:00000000 r3:00000000
[  839.500062] [<c00122bc>] (show_stack) from [<c06a1724>] (dump_stack+0x80/0xa0)
[  839.507322] [<c06a16a4>] (dump_stack) from [<c008dab0>] (rcu_check_callbacks+0x428/0x8f0)
[  839.515532]  r7:29db6000 r6:c09ab8e8 r5:c0a09840 r4:ea7618e8
[  839.521252] [<c008d688>] (rcu_check_callbacks) from [<c005150c>] (update_process_times+0x44/0x64)
[  839.530159]  r10:ea7613a0 r9:00000000 r8:00000000 r7:00000000 r6:c09ae000 r5:00000000
[  839.538060]  r4:c09ba1a8
[  839.540616] [<c00514c8>] (update_process_times) from [<c0098ad4>] (tick_sched_handle+0x50/0x5c)
[  839.549348]  r7:c09afe40 r6:ea761b08 r5:000000c3 r4:716bacb0
[  839.555066] [<c0098a84>] (tick_sched_handle) from [<c0098da0>] (tick_sched_timer+0x60/0x94)
[  839.563456] [<c0098d40>] (tick_sched_timer) from [<c0064fac>] (__run_hrtimer+0x4c/0xe0)
[  839.571491]  r7:c0098d40 r6:ea7613a0 r5:ea7613d8 r4:ea761b08
[  839.577208] [<c0064f60>] (__run_hrtimer) from [<c0065a70>] (hrtimer_interrupt+0x120/0x2b8)
[  839.585503]  r7:00000001 r6:ea7613d8 r5:000000c3 r4:716baa0a
[  839.591222] [<c0065950>] (hrtimer_interrupt) from [<c054a644>] (arch_timer_handler_virt+0x34/0x3c)
[  839.600216]  r10:00000001 r9:c0a5ae00 r8:ea765240 r7:00000013 r6:e982e640 r5:c0a1d484
[  839.608117]  r4:e982f0c0
[  839.610673] [<c054a610>] (arch_timer_handler_virt) from [<c0086a08>] (handle_percpu_devid_irq+0x88/0xa8)
[  839.620198] [<c0086980>] (handle_percpu_devid_irq) from [<c0082e28>] (generic_handle_irq+0x28/0x38)
[  839.629279]  r8:c09aff30 r7:00000013 r6:c09ae000 r5:c09abf78 r4:00000013 r3:c0086980
[  839.637098] [<c0082e00>] (generic_handle_irq) from [<c000f48c>] (handle_IRQ+0x54/0xb8)
[  839.645046]  r4:c09b6c74 r3:000001b5
[  839.648652] [<c000f438>] (handle_IRQ) from [<c0008660>] (gic_handle_irq+0x30/0x64)
[  839.656250]  r8:00000000 r7:fa212000 r6:c09afe40 r5:c09b6df8 r4:fa21200c r3:000000c0
[  839.664068] [<c0008630>] (gic_handle_irq) from [<c06a7240>] (__irq_svc+0x40/0x74)
[  839.671580] Exception stack(0xc09afe40 to 0xc09afe88)
[  839.676654] fe40: 00000000 c0a5d180 00000001 00000000 00000002 c09abf78 c09ae000 c09ae000
[  839.684866] fe60: 00000000 c0a5ae00 00000001 c09afecc c09afe88 c09afe88 c004aaa8 c004aab8
[  839.693077] fe80: 200f0113 ffffffff
[  839.696576]  r7:c09afe74 r6:ffffffff r5:200f0113 r4:c004aab8
[  839.702294] [<c004aa38>] (__do_softirq) from [<c004af0c>] (irq_exit+0xac/0xf8)
[  839.709544]  r10:00000001 r9:c0a5ae00 r8:00000000 r7:00000013 r6:c09ae000 r5:c09abf78
[  839.717445]  r4:c09ae000
[  839.719998] [<c004ae60>] (irq_exit) from [<c000f490>] (handle_IRQ+0x58/0xb8)
[  839.727074]  r4:c09b6c74 r3:000001b5
[  839.730681] [<c000f438>] (handle_IRQ) from [<c0008660>] (gic_handle_irq+0x30/0x64)
[  839.738280]  r8:c0a5ae00 r7:fa212000 r6:c09aff30 r5:c09b6df8 r4:fa21200c r3:000000c0
[  839.746097] [<c0008630>] (gic_handle_irq) from [<c06a7240>] (__irq_svc+0x40/0x74)
[  839.753610] Exception stack(0xc09aff30 to 0xc09aff78)
[  839.758681] ff20:                                     ffffffed 29db6000 c09b7790 c002f1c0
[  839.766894] ff40: c09ae000 00000000 c09b699c c06b220c c0a5ae00 c0a5ae00 00000001 c09aff84
[  839.775105] ff60: c09aff78 c09aff78 c002d44c c000f848 a00f0013 ffffffff
[  839.781744]  r7:c09aff64 r6:ffffffff r5:a00f0013 r4:c000f848
[  839.787463] [<c000f818>] (arch_cpu_idle) from [<c0082b24>] (cpu_startup_entry+0x68/0x158)
[  839.795677] [<c0082abc>] (cpu_startup_entry) from [<c069bd8c>] (rest_init+0x78/0x90)
[  839.803450]  r7:c0994840 r3:c09ae000
[  839.807059] [<c069bd14>] (rest_init) fr

214510012275_AB_20161105-202510.txt

  • Hi, Andy,

    I have notified SDK and Serial peripheral experts to comment.

    Regards,
    Mariya
  • Hi Stan,
    I turned off the wakeup enable bit and the problem disappeared.
    Thanks for your help.

    Andy
  • Hi Stanislav;

    I have a customized board suffering from the same problem. I ran a periodic power-on/power-off boot test and could reproduce it within two days.

    The hardware setup is:

    McASP3_AXR0 -> UART5_RXD

    McASP3_AXR1 -> UART5_TXD

    UART5 is wired to a GPS module. The GPS module transmits NMEA sentences at 1 Hz, even before ttyS4 is opened.

    The PAD CONF registers read as follows:

     devmem2  0x4a00372c

    /dev/mem opened.

    Memory mapped at address 0xb6f34000.

    Read at address  0x4A00372C (0xb6f3472c): 0x000D0004

    devmem2  0x4a003730

    /dev/mem opened.

    Memory mapped at address 0xb6f80000.

    Read at address  0x4A003730 (0xb6f80730): 0x000D0004

    The software is compiled from GLSDK 7.03.0.03.

    When it happens, the system stalls with the backtrace below:

    [   26.659808] INFO: rcu_preempt self-detected stall on CPU { 0}  (t=2100 jiffies g=4294967222 c=4294967221 q=3582)

    [   26.670055] CPU: 0 PID: 1250 Comm: locationservice Tainted: P           O 3.14.57 #3

    [   26.677845] [<c0011415>] (unwind_backtrace) from [<c000ee9f>] (show_stack+0xb/0xc)

    [   26.685453] [<c000ee9f>] (show_stack) from [<c032247f>] (dump_stack+0x4b/0x80)

    [   26.692710] [<c032247f>] (dump_stack) from [<c0053165>] (rcu_check_callbacks+0x30d/0x618)

    [   26.700927] [<c0053165>] (rcu_check_callbacks) from [<c002c01f>] (update_process_times+0x2b/0x44)

    [   26.709840] [<c002c01f>] (update_process_times) from [<c0059cfb>] (tick_sched_timer+0x2b/0x4c)

    [   26.718491] [<c0059cfb>] (tick_sched_timer) from [<c0038641>] (__run_hrtimer+0x29/0x80)

    [   26.726532] [<c0038641>] (__run_hrtimer) from [<c0038d9b>] (hrtimer_interrupt+0xd7/0x20c)

    [   26.734749] [<c0038d9b>] (hrtimer_interrupt) from [<c02622a7>] (arch_timer_handler_virt+0x1f/0x24)

    [   26.743750] [<c02622a7>] (arch_timer_handler_virt) from [<c004eaf5>] (handle_percpu_devid_irq+0x39/0x48)

    [   26.753273] [<c004eaf5>] (handle_percpu_devid_irq) from [<c004c833>] (generic_handle_irq+0x13/0x1c)

    [   26.762360] [<c004c833>] (generic_handle_irq) from [<c000d4bb>] (handle_IRQ+0x23/0x60)

    [   26.770312] [<c000d4bb>] (handle_IRQ) from [<c00083c3>] (gic_handle_irq+0x1f/0x48)

    [   26.777916] [<c00083c3>] (gic_handle_irq) from [<c000f65b>] (__irq_svc+0x3b/0x80)

    [   26.785426] Exception stack(0xe89f9c30 to 0xe89f9c78)

    [   26.790498] 9c20:                                     00000000 c055fc00 0000000a 00000000

    [   26.798710] 9c40: 00000002 00000013 00000000 e89f8000 fa212000 00000000 000000b8 0000001a

    [   26.806920] 9c60: a2aaaaa3 e89f9c78 c0027def c0027dfe 40000133 ffffffff

    [   26.813565] [<c000f65b>] (__irq_svc) from [<c0027dfe>] (__do_softirq+0x66/0x17c)

    [   26.820992] [<c0027dfe>] (__do_softirq) from [<c00280e3>] (irq_exit+0x6f/0xa0)

    [   26.828245] [<c00280e3>] (irq_exit) from [<c000d4bf>] (handle_IRQ+0x27/0x60)

    [   26.835324] [<c000d4bf>] (handle_IRQ) from [<c00083c3>] (gic_handle_irq+0x1f/0x48)

    [   26.842927] [<c00083c3>] (gic_handle_irq) from [<c000f65b>] (__irq_svc+0x3b/0x80)

    [   26.850437] Exception stack(0xe89f9ce8 to 0xe89f9d30)

    [   26.855507] 9ce0:                   c0594224 00000010 0000000b 00000006 c0594224 e8b68eac

    [   26.863719] 9d00: ea773a10 0001c200 00000013 00000020 000000b8 0000001a 38dc6c00 e89f9d30

    [   26.871929] 9d20: c01d57e1 c0324e1e 00000033 ffffffff

    [   26.877005] [<c000f65b>] (__irq_svc) from [<c0324e1e>] (_raw_spin_unlock_irq+0x16/0x3c)

    [   26.885045] [<c0324e1e>] (_raw_spin_unlock_irq) from [<c01d57e1>] (omap_8250_set_termios+0x1c9/0x2ac)

    [   26.894305] [<c01d57e1>] (omap_8250_set_termios) from [<c01d3cd9>] (serial8250_set_termios+0x9/0x14)

    [   26.903480] [<c01d3cd9>] (serial8250_set_termios) from [<c01d103b>] (uart_change_speed+0x4b/0x5c)

    [   26.912392] [<c01d103b>] (uart_change_speed) from [<c01d10a3>] (uart_set_termios+0x57/0x180)

    [   26.920868] [<c01d10a3>] (uart_set_termios) from [<c01c4379>] (tty_set_termios+0x111/0x238)

    [   26.929255] [<c01c4379>] (tty_set_termios) from [<c01c481f>] (set_termios+0x18f/0x1b4)

    [   26.937204] [<c01c481f>] (set_termios) from [<c01c4ae7>] (tty_mode_ioctl+0x223/0x390)

    [   26.945067] [<c01c4ae7>] (tty_mode_ioctl) from [<c01c1071>] (tty_ioctl+0x1f1/0x878)

    [   26.952756] [<c01c1071>] (tty_ioctl) from [<c0096691>] (do_vfs_ioctl+0x59/0x404)

    [   26.960184] [<c0096691>] (do_vfs_ioctl) from [<c0096a7d>] (SyS_ioctl+0x41/0x4c)

    [   26.967524] [<c0096a7d>] (SyS_ioctl) from [<c000cc21>] (ret_fast_syscall+0x1/0x4e)

    It seems more likely to happen in set_termios().

    I dumped the IIR and LSR registers in the 8250 chained IRQ handler.

    [    5.646708] serial8250: too much work for irq301

    [    5.651353] uart4: irq:514 status:e0 iir:c6 tx:0 rx:14 fe:1 pe:0 brk:0 oe:0

    [    5.659533] serial8250: too much work for irq301

    [    5.664172] uart4: irq:1028 status:e0 iir:c6 tx:0 rx:92 fe:2 pe:0 brk:0 oe:1

    [    5.672286] serial8250: too much work for irq301

    [    5.676924] uart4: irq:1542 status:e0 iir:c6 tx:0 rx:168 fe:2 pe:0 brk:0 oe:2

    [    5.685147] serial8250: too much work for irq301

    [    5.689782] uart4: irq:2056 status:e0 iir:c6 tx:0 rx:244 fe:2 pe:0 brk:0 oe:3

    [    5.698001] serial8250: too much work for irq301

    [    5.702636] uart4: irq:2570 status:e0 iir:c6 tx:0 rx:320 fe:2 pe:0 brk:0 oe:4

    UART5 triggers interrupts ~50000 times per second. The IIR and LSR register contents look weird to me.

    IIR = 0xc6 means RLSI, which the DRA7 TRM describes as the receiver line status interrupt.

    LSR=0xe0 means FIFOE and TEMT and THRE.

    #define UART_LSR	5	/* In:  Line Status Register */
    #define UART_LSR_FIFOE		0x80 /* Fifo error */
    #define UART_LSR_TEMT		0x40 /* Transmitter empty */
    #define UART_LSR_THRE		0x20 /* Transmit-hold-register empty */
    #define UART_LSR_BI		0x10 /* Break interrupt indicator */
    #define UART_LSR_FE		0x08 /* Frame error indicator */
    #define UART_LSR_PE		0x04 /* Parity error indicator */
    #define UART_LSR_OE		0x02 /* Overrun error indicator */
    #define UART_LSR_DR		0x01 /* Receiver data ready */
    #define UART_LSR_BRK_ERROR_BITS	0x1E /* BI, FE, PE, OE bits */

    According to the DRA7 TRM, FIFOE means that at least one break interrupt, framing error or parity error is present in the RX FIFO. However, none of the BRK_ERROR_BITS is set.
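
The register reading described above can be reproduced with a small stand-alone decoder. This is only a sketch: the LSR and IIR bit values are the ones from serial_reg.h, while the helper names (`lsr_describe`, `lsr_fifoe_without_cause`, `iir_is_pending_rlsi`) and the `IIR_ID_MASK` low-nibble mask are hypothetical and chosen for illustration. The OMAP UART may encode the interrupt type in more bits than are modelled here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bit values as listed in serial_reg.h */
#define UART_LSR_FIFOE          0x80
#define UART_LSR_BRK_ERROR_BITS 0x1E /* BI, FE, PE, OE */
#define UART_IIR_NO_INT         0x01 /* set when no interrupt is pending */
#define UART_IIR_RLSI           0x06 /* receiver line status interrupt */

/* Assumption: compare the interrupt ID in the low nibble only. */
#define IIR_ID_MASK             0x0e

static const struct {
    unsigned int mask;
    const char *name;
} lsr_bits[] = {
    { 0x80, "FIFOE" }, { 0x40, "TEMT" }, { 0x20, "THRE" }, { 0x10, "BI" },
    { 0x08, "FE" },    { 0x04, "PE" },   { 0x02, "OE" },   { 0x01, "DR" },
};

/* Write the names of all set LSR bits into buf, separated by spaces. */
void lsr_describe(unsigned int lsr, char *buf, size_t len)
{
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(lsr_bits) / sizeof(lsr_bits[0]); i++) {
        int n;

        if (!(lsr & lsr_bits[i].mask))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", lsr_bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* output truncated, stop here */
        used += (size_t)n;
    }
}

/* 1 when FIFOE is set although none of BI/FE/PE/OE is reported. */
int lsr_fifoe_without_cause(unsigned int lsr)
{
    return (lsr & UART_LSR_FIFOE) && !(lsr & UART_LSR_BRK_ERROR_BITS);
}

/* 1 when IIR reports a pending receiver line status interrupt. */
int iir_is_pending_rlsi(unsigned int iir)
{
    return !(iir & UART_IIR_NO_INT) && (iir & IIR_ID_MASK) == UART_IIR_RLSI;
}
```

For the values in the log, `lsr_describe(0xe0, ...)` yields the three flags named above, `lsr_fifoe_without_cause(0xe0)` is true, and `iir_is_pending_rlsi(0xc6)` is true.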

    I found a related thread [1] about "too much irq", but there is still no conclusion.

    Could you provide any advice?

    Thanks!

    [1]. www.gossamer-threads.com/.../2106110

  • Hi Totoro,
    I did not check your log in detail, but can you please read the post link from my last post?
    Then, can you try what Andy did in his last post?
    Thanks,
    Stan
  • Hi Stanislav;

    Forgive my carelessness. I misunderstood and checked the WAKEUPENABLE bit of PADCONF instead of UART_SYSC.

    UART_SYSC is 0xD, same as Andy's.  It is handled by omap_hwmod completely.

    arch/arm/mach-omap2/omap_hwmod.c
    
    static void _enable_sysc(struct omap_hwmod *oh)
    {
    ......
    	if (sf & SYSC_HAS_SIDLEMODE) {
    		if (oh->flags & HWMOD_SWSUP_SIDLE ||
    		    oh->flags & HWMOD_SWSUP_SIDLE_ACT) {
    			idlemode = HWMOD_IDLEMODE_NO;
    		}
    ......
    

    I don't understand why it should be HWMOD_IDLEMODE_NO in the case of HWMOD_SWSUP_SIDLE_ACT. I checked the hwmod data of omap3, omap4 and omap5, and the flag of the UART module is always HWMOD_SWSUP_SIDLE_ACT.

    Anyway, I will run some tests with WAKEUP disabled.

    Regards,

    Totoro

  • Hi Stan,
    After several days of testing, it turns out that turning off the wakeup enable bit does not fix the problem.


    Andy
  • Hi Andy,

    Could you explain how you turned off the wakeup event?

    a. Write UART_SYSC to clear ENAWAKEUP in the 8250_omap.c driver

    Or

    b. remove SYSC_HAS_ENAWAKEUP from sysc_flags

    static struct omap_hwmod_class_sysconfig dra7xx_uart_sysc = {
    	.rev_offs	= 0x0050,
    	.sysc_offs	= 0x0054,
    	.syss_offs	= 0x0058,
    	.sysc_flags	= (SYSC_HAS_AUTOIDLE | SYSC_HAS_ENAWAKEUP |
    			   SYSC_HAS_SIDLEMODE | SYSC_HAS_SOFTRESET |
    			   SYSS_HAS_RESET_STATUS),
    	.idlemodes	= (SIDLE_FORCE | SIDLE_NO | SIDLE_SMART |
    			   SIDLE_SMART_WKUP),
    	.sysc_fields	= &omap_hwmod_sysc_type1,
    };

    Or

    c. clear UART_WER register of UART5

    	/* Enable module level wake up */
    	priv->wer = OMAP_UART_WER_MOD_WKUP;
    	if (priv->habit & OMAP_UART_WER_HAS_TX_WAKEUP)
    		priv->wer |= OMAP_UART_TX_WAKEUP_EN;
    	serial_out(up, UART_OMAP_WER, priv->wer);
    

    Regards

    Totoro

  • Hi Totoro,
    I used "method a": the init script runs "omapconf write 0x48020054 0x9" to clear the ENAWAKEUP bit in the UART_SYSC register.
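
The relation between the 0xD read back from UART_SYSC and the 0x9 written with omapconf can be checked with a little bit arithmetic. The sketch below assumes the omap_hwmod "type1" SYSCONFIG layout (AUTOIDLE in bit 0, SOFTRESET in bit 1, ENAWAKEUP in bit 2, IDLEMODE in bits 4:3); the macro and function names are hypothetical, and the layout should be confirmed against the TRM for the UART instance in use.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed "type1" SYSCONFIG field layout */
#define SYSC_AUTOIDLE        (1u << 0)
#define SYSC_SOFTRESET       (1u << 1)
#define SYSC_ENAWAKEUP       (1u << 2)
#define SYSC_IDLEMODE_SHIFT  3
#define SYSC_IDLEMODE_MASK   (0x3u << SYSC_IDLEMODE_SHIFT)

/* Return the register value with only the ENAWAKEUP bit cleared. */
uint32_t sysc_clear_enawakeup(uint32_t sysc)
{
    return sysc & ~SYSC_ENAWAKEUP;
}

/* Extract the two-bit IDLEMODE field (0 force, 1 no idle, 2 smart, 3 smart wakeup). */
unsigned int sysc_idlemode(uint32_t sysc)
{
    return (sysc & SYSC_IDLEMODE_MASK) >> SYSC_IDLEMODE_SHIFT;
}

/* Return the register value with the IDLEMODE field replaced by mode. */
uint32_t sysc_set_idlemode(uint32_t sysc, unsigned int mode)
{
    assert(mode <= 3); /* the field is two bits wide */
    return (sysc & ~SYSC_IDLEMODE_MASK) | ((uint32_t)mode << SYSC_IDLEMODE_SHIFT);
}
```

Under that layout, `sysc_clear_enawakeup(0xD)` gives 0x9, and the IDLEMODE field stays at 1 (no idle) in both values, so the omapconf write changes nothing except the wakeup enable bit.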

    We use UART3 for our console. Is your console also on UART3, or on the SDK default UART1?

    Please also share your test results with the ENAWAKEUP bit turned off. Thanks.

    Best,
    Andy
  • Hi Totoro,
    By the way, why do you want to modify the UART_SYSC of UART5? The irq301 issue happened on UART3 on our board.

    Or is irq301 mapped to UART5 on your board?

    Best,
    Andy
  • Hi Andy,

    The BSP I work on is GLSDK 7.03.0.03, and IRQ301 is the Linux IRQ number mapped to the UART5 hardware IRQ on our board.
    We use UART3 as the console too.

    I am not sure whether the "omapconf" way would work, since UART_SYSC is handled by omap_hwmod.c.

    Regards,
    Totoro
  • Hi Totoro,
    omapconf works for me: I change the value to 0x9 in the init script, and it is still unchanged a couple of hours after booting.
    omap_hwmod.c should only affect the UART driver during kernel boot. Am I right?


    Best,
    Andy
  • Hi Stan,
    This issue still happens. Do you have another way to debug it?

    Best,
    Andy
  • Hi Andy,
    Can you dump the following registers when issue shows up?
    UART_IIR
    UART_LSR
    UART_RXFIFO_LVL

    and these registers after driver loaded (can be done during normal operation):
    UART_SCR
    UART_IER
    UART_WER
    UART_IER2

    Can you tell me more about your use case? Baud rate, stop bits, parity? How frequently do you receive data, and how frequently do you transmit? What kind of data, etc.?
  • Hi Andy,

    Very sorry, I was busy with other things last week. I will start the boot test tomorrow with wake-up disabled.

    How many UARTs (ttySx) are used on your board? Was any process opening a ttyS device when 8250 complained about "too much work"?

    The 8250 driver handles IRQs as chained IRQs. A spurious IRQ from any UART controller could be the cause of "too much work for irq301".

    Regards,

    Totoro

  • Hi Stanislav, Andy,

    I dumped SCR, IER, WER and IER2 at the end of set_termios:

    SCR = 0x0; IER=0x5; WER=0xff; IER2=0.
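
As a cross-check of the dump above, IER = 0x5 can be matched against the interrupt source reported in IIR. The sketch below uses the standard 16550 IER bit values and the usual mapping from IIR interrupt ID to IER enable bit; the function name `iir_source_enabled` and the low-nibble ID mask are assumptions made for illustration, and OMAP-specific IER bits are not modelled.

```c
#include <assert.h>

/* Standard 16550 interrupt enable bits */
#define UART_IER_MSI    0x08 /* modem status */
#define UART_IER_RLSI   0x04 /* receiver line status */
#define UART_IER_THRI   0x02 /* transmitter holding register empty */
#define UART_IER_RDI    0x01 /* receiver data */

#define UART_IIR_NO_INT 0x01 /* set when no interrupt is pending */
#define IIR_ID_MASK     0x0e /* assumption: ID compared in the low nibble */

/* 1 when the source reported in iir is enabled in ier, else 0. */
int iir_source_enabled(unsigned int iir, unsigned int ier)
{
    assert(iir <= 0xff && ier <= 0xff); /* both are 8-bit registers */

    if (iir & UART_IIR_NO_INT)
        return 0;

    switch (iir & IIR_ID_MASK) {
    case 0x06: /* receiver line status */
        return !!(ier & UART_IER_RLSI);
    case 0x04: /* received data available */
    case 0x0c: /* RX character timeout, enabled together with RDI */
        return !!(ier & UART_IER_RDI);
    case 0x02: /* transmitter holding register empty */
        return !!(ier & UART_IER_THRI);
    case 0x00: /* modem status */
        return !!(ier & UART_IER_MSI);
    default:
        return 0;
    }
}
```

With IER = 0x5 the receive-data and line-status sources are enabled and the transmit source is not, so `iir_source_enabled(0xc6, 0x5)` is true: the RLSI seen during the storm comes from a source the driver has enabled.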

    I tested with wakeup disabled but still reproduced the problem within 10 hours. The patch is below:

    --- a/arch/arm/mach-omap2/omap_hwmod_7xx_g5_data.c
    +++ b/arch/arm/mach-omap2/omap_hwmod_7xx_g5_data.c
    @@ -2473,7 +2473,7 @@ static struct omap_hwmod_class_sysconfig dra7xx_uart_sysc = {
     	.rev_offs	= 0x0050,
     	.sysc_offs	= 0x0054,
     	.syss_offs	= 0x0058,
    -	.sysc_flags	= (SYSC_HAS_AUTOIDLE | SYSC_HAS_ENAWAKEUP |
    +	.sysc_flags	= (SYSC_HAS_AUTOIDLE |
     			   SYSC_HAS_SIDLEMODE | SYSC_HAS_SOFTRESET |
     			   SYSS_HAS_RESET_STATUS),
     	.idlemodes	= (SIDLE_FORCE | SIDLE_NO | SIDLE_SMART |
    @@ -2524,7 +2524,7 @@ static struct omap_hwmod dra7xx_uart5_hwmod = {
     	.class		= &dra7xx_uart_hwmod_class,
     	.clkdm_name	= "l4per_clkdm",
     	.main_clk	= "uart5_gfclk_mux",
    -	.flags		= HWMOD_SWSUP_SIDLE_ACT,
    +	/*.flags		= HWMOD_SWSUP_SIDLE_ACT,*/
     	.prcm = {
     		.omap4 = {
     			.clkctrl_offs = DRA7XX_CM_L4PER_UART5_CLKCTRL_OFFSET,
    
    --- a/drivers/tty/serial/8250/8250_omap.c
    +++ b/drivers/tty/serial/8250/8250_omap.c
    @@ -581,10 +581,11 @@ static int omap_8250_startup(struct uart_port *port)
     	up->capabilities |= UART_CAP_RPM;
     #endif
     
    -	/* Enable module level wake up */
    -	priv->wer = OMAP_UART_WER_MOD_WKUP;
    -	if (priv->habit & OMAP_UART_WER_HAS_TX_WAKEUP)
    -		priv->wer |= OMAP_UART_TX_WAKEUP_EN;
    +	/* disable module level wake up */
    +	/*priv->wer = OMAP_UART_WER_MOD_WKUP;*/
    +	/*if (priv->habit & OMAP_UART_WER_HAS_TX_WAKEUP)*/
    +		/*priv->wer |= OMAP_UART_TX_WAKEUP_EN;*/
    +	priv->wer = 0;
     	serial_out(up, UART_OMAP_WER, priv->wer);
     
     	if (up->dma)

    Regards,
    Totoro
  • Hi Totoro,
    You said previously:
    "LSR=0xe0 means FIFOE and TEMT and THRE."
    1. But how many times are you reading LSR? Keep in mind that a single read of LSR ends the OVERRUN interrupt condition, so you will not be able to detect it on the next LSR read.
    2. Also, for the framing errors (FE, PE, BI), make sure you are not reading the RHR register beforehand, as a read of RHR ends the interrupt condition.

    Besides that, I agree that it is strange for LSR[7] FIFO ERROR to be set without any of the 4 FIFO error bits. This might be a HW bug, but I cannot find it documented in the errata.
    What I know for sure is that if LSR[7] is always set, it will generate an endless interrupt request, which can cause the "too much work..." message.
    Now we must find out what causes LSR[7] to be set and why it is not cleared.
    Can you check the 1. and 2. conditions above?
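
The clear-on-read concern in condition 1 can be illustrated with a small simulation. This is a sketch, not driver code: `struct fake_uart` and `fake_read_lsr` are hypothetical stand-ins for the hardware, modelling only the overrun bit being cleared by an LSR read. The shadow keeps a sticky OR of every error bit ever read, similar in spirit to the `lsr_saved_flags` field kept by the Linux 8250 driver.

```c
#include <assert.h>
#include <stddef.h>

#define UART_LSR_BRK_ERROR_BITS 0x1E /* BI, FE, PE, OE */
#define UART_LSR_OE             0x02 /* overrun error */

/* Hypothetical stand-in for the hardware register. */
struct fake_uart {
    unsigned int lsr;
};

/* Model of a read that ends the overrun condition. */
unsigned int fake_read_lsr(struct fake_uart *u)
{
    unsigned int v = u->lsr;

    u->lsr &= ~UART_LSR_OE; /* OE is cleared by the read */
    return v;
}

struct lsr_shadow {
    unsigned int last;          /* value returned by the most recent read */
    unsigned int sticky_errors; /* OR of all error bits ever observed */
    unsigned int reads;         /* number of LSR reads performed */
};

/* Read LSR once and record the result so no error bit is lost. */
unsigned int shadow_read_lsr(struct fake_uart *u, struct lsr_shadow *s)
{
    assert(u != NULL && s != NULL);

    s->last = fake_read_lsr(u);
    s->sticky_errors |= s->last & UART_LSR_BRK_ERROR_BITS;
    s->reads++;
    return s->last;
}
```

If the register starts at 0xe2 and is read twice, the second read returns 0xe0, which looks like "FIFOE with no error bits"; only the sticky field still shows that OE was present. Printing `last` alone, after an earlier read, would therefore hide the overrun.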

    Thanks,
    Stan

  • Hi Stan;

    I forgot to mention that I did not read and print LSR and IIR directly.

    Instead, RHR, IIR and LSR are handled by the Linux 8250 driver. I just added shadow fields to struct uart_8250_port. Every time 8250 handles an IRQ, I save IIR and LSR to the shadows. When "too much work for irqxxx" happens, the shadows are printed to the console.

    Is that OK?

    Regards,

    Totoro

  • Hi Totoro,
    I replaced the kernel with the one from www.omappedia.org/.../2014LTS_PostRelease_updates .
    After testing for several days, I have not seen the issue happen again.
    Maybe you can try it.

    Best,
    Andy
  • Hi Andy:

    Thanks for your feedback!

    I checked 8250_omap.c in the PostRelease update. The driver has been reworked to use a per-port IRQ handler:

    9e91597f24234062c8bb4278ba7c6197be84e668. "serial: 8250_omap: provide complete custom startup & shutdown callbacks"

    @@ -581,10 +605,31 @@ static int omap_8250_startup(struct uart_port *port)
     ...
    +
    +       ret = request_irq(port->irq, omap8250_irq, IRQF_SHARED,
    +                         dev_name(port->dev), port);
    +static irqreturn_t omap8250_irq(int irq, void *dev_id)
    +{
    +       struct uart_port *port = dev_id;
    +       struct uart_8250_port *up = up_to_u8250p(port);
    +       unsigned int iir;
    +       int ret;
    +
    +#ifdef CONFIG_SERIAL_8250_DMA
    +       if (up->dma) {
    +               ret = omap_8250_dma_handle_irq(port);
    +               return IRQ_RETVAL(ret);
    +       }
    +#endif
    +
    +       serial8250_rpm_get(up);
    +       iir = serial_port_in(port, UART_IIR);
    +       ret = serial8250_handle_irq(port, iir);
    +       serial8250_rpm_put(up);
    +
    +       return IRQ_RETVAL(ret);
    +}

    When the UART IRQ fires, the omap8250_irq handler is invoked and calls serial8250_handle_irq() to handle TX/RX.

    However, "too much work for irqxxx" is printed in serial8250_interrupt (drivers/tty/serial/8250/8250_core.c).

    So even if there is a UART IRQ storm on your board with the PostRelease GLSDK, "too much work for xxx" would never be printed.

    static irqreturn_t serial8250_interrupt(int irq, void *dev_id)
    {
    	struct irq_info *i = dev_id;
    	struct list_head *l, *end = NULL;
    	int pass_counter = 0, handled = 0;
    
    	DEBUG_INTR("serial8250_interrupt(%d)...", irq);
    
    	spin_lock(&i->lock);
    
    	l = i->head;
    	do {
    		struct uart_8250_port *up;
    		struct uart_port *port;
    
    		up = list_entry(l, struct uart_8250_port, list);
    		port = &up->port;
    
    		if (port->handle_irq(port)) {
    			handled = 1;
    			end = NULL;
    		} else if (end == NULL)
    			end = l;
    
    		l = l->next;
    
    		if (l == i->head && pass_counter++ > PASS_LIMIT) {
    			/* If we hit this, we're dead. */
    			printk_ratelimited(KERN_ERR
    				"serial8250: too much work for irq%d\n", irq);
    			break;
    		}
    	} while (l != end);
    
    	spin_unlock(&i->lock);
    
    	DEBUG_INTR("end.\n");
    
    	return IRQ_RETVAL(handled);
    }
    

    This is a chained IRQ handler provided by the 8250 core. It polls all of the chained IRQs if any of them fires. For example:

    Suppose three ports are opened, UART1 (irq300), UART2 (irq301) and UART3 (irq302), so that irq300, irq301 and irq302 are chained together. If irq300 fires, serial8250_interrupt handles irq300 first and then polls the status of irq301 and irq302 even if they are idle. Then it checks irq300 again and returns if irq300 is idle. However, if irq300 has fired again, serial8250_interrupt loops through the IRQ chain again and again.

    On the omap platform the IRQ chain is unnecessary, because each IRQ line is dedicated to one UART port.
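
The looping behaviour of serial8250_interrupt can be reproduced in user space with a simplified model. This is a sketch under stated assumptions: `struct sim_port`, `sim_handle_irq` and `sim_serial8250_interrupt` are hypothetical names, the ports are modelled as one circular list walked the same way as the kernel loop quoted earlier, and `PASS_LIMIT` is assumed to be 512 as in 8250_core.c. Locking and the real register accesses are left out.

```c
#include <assert.h>
#include <stddef.h>

#define PASS_LIMIT 512 /* assumed value from 8250_core.c */

struct sim_port {
    struct sim_port *next; /* circular list of chained ports */
    int pending;           /* events left to service; negative = never clears */
    unsigned long calls;   /* how often the handler ran for this port */
};

/* Stand-in for port->handle_irq(): 1 if work was done, 0 if idle. */
static int sim_handle_irq(struct sim_port *p)
{
    p->calls++;
    if (p->pending == 0)
        return 0;
    if (p->pending > 0)
        p->pending--;
    return 1;
}

/* Same loop shape as serial8250_interrupt(); returns 1 if the
 * "too much work" branch was taken, 0 if the chain went idle. */
int sim_serial8250_interrupt(struct sim_port *head)
{
    struct sim_port *l = head, *end = NULL;
    int pass_counter = 0;

    assert(head != NULL);
    do {
        if (sim_handle_irq(l))
            end = NULL;      /* work done: restart the idle scan */
        else if (end == NULL)
            end = l;         /* first idle port seen in this scan */

        l = l->next;

        if (l == head && pass_counter++ > PASS_LIMIT)
            return 1;        /* the kernel prints "too much work" here */
    } while (l != end);

    return 0;
}
```

With a single port whose condition never clears (`pending = -1`), the model gives up after 514 handler calls. That would be consistent with the debug counter in the earlier log advancing by 514 between messages (514, 1028, 1542, ...), assuming that counter is incremented once per handler call and only one port is on the list.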

    The GLSDK post release might fix the "too much work" problem, since it provides a custom startup routine and IRQ handler. It removes the tricky register read/write test that the 8250 core does.

    Regards.

    Totoro

  • Hi Totoro,
    Thanks for your explanation.
    But in which tag/branch did you find the "serial: 8250_omap: provide complete custom startup & shutdown callbacks" commit?
    I cannot find that commit in the "glsdk-7.04.00.03.01" tag.

    Best,
    Andy
  • Hi Andy,

    Sorry, I checked out the wrong revision of GLSDK.

    The per-port IRQ handler has not been implemented in the GLSDK 7.04 POST release.

    I did a quick diff of glsdk-7.04.00.03.01 against glsdk-7.04.00.03 but did not find any suspicious patch related to the 8250 serial driver.

    Upgrading GLSDK from 7.03 to 7.04 takes too much effort for me right now. I will give it a try once I have a free time slot.

    BRs,
    Totoro