Linux/AM3354: USB interrupts causing RT throttling and WD reset

Part Number: AM3354

Tool/software: Linux

I am using the Sitara AM3354 as a USB device connected to an x86 host, with the Linux USB CDC-EEM (Ethernet-over-USB) device stack. When the x86 host runs an nmap port scan against the Sitara USB device endpoint, the Sitara frequently reports "RT throttling activated" and sometimes resets via the watchdog.

I suspect that the volume of USB interrupts generated by the nmap port scan prevents daemons running on the Sitara from periodically updating their "kick" files, and that the watchdog daemon detects this and resets the Sitara. Here is an example where the sensor monitoring daemon (sensord) was unable to "touch" its sensord.kick file for 5 seconds, so the watchdog daemon let the hardware watchdog expire to reset the system.

Dec 5 18:20:41:Jan 1 00:00:54 127.4.2.2 err watchdog[1247]: file /var/run/sensord.kick was not changed in 5 seconds.
Dec 5 18:20:41:Jan 1 00:00:55 127.4.2.2 warning kernel: [ 59.967841] [sched_delayed] sched: RT throttling activated
Dec 5 18:20:41:Jan 1 00:00:55 127.4.2.2 err watchdog[1247]: repair binary /usr/sbin/capture_logs_before_shutdown.sh returned 255
Dec 5 18:20:41:Jan 1 00:00:55 127.4.2.2 alert watchdog[1247]: shutting down the system because of error 255
Dec 5 18:20:51:Jan 1 00:01:05 127.4.2.2 notice watchdog[1247]: hardware watchdog enabled, let it expire.
Dec 5 18:20:51:Jan 1 00:01:05 127.4.2.2 alert watchdog[1247]: watchdog set to 1 second
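
For context on the "RT throttling activated" message in the log above: the kernel prints it when real-time tasks use up their runtime budget within one scheduling period. The budget is set by two sysctls, sched_rt_runtime_us and sched_rt_period_us, whose mainline defaults are 950000 and 1000000. The sketch below only reports the budget as a percentage; the function name rt_budget_pct is made up for illustration, and it is an assumption that this system still uses the default values.

```shell
# rt_budget_pct [RUNTIME_US PERIOD_US]
# Prints the share of each scheduling period that real-time tasks may use.
# With no arguments, the live values are read from /proc/sys/kernel.
rt_budget_pct() {
    runtime=${1:-$(cat /proc/sys/kernel/sched_rt_runtime_us)}
    period=${2:-$(cat /proc/sys/kernel/sched_rt_period_us)}
    # A runtime of -1 means RT throttling is disabled.
    if [ "$runtime" -lt 0 ]; then
        echo "unlimited"
        return
    fi
    awk -v r="$runtime" -v p="$period" 'BEGIN { printf "%.1f\n", 100 * r / p }'
}

# Mainline default values, passed explicitly for illustration.
rt_budget_pct 950000 1000000
```

On the target, calling rt_budget_pct with no arguments reads the values actually in effect.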

The issue appears to depend on the nmap scan rate. Limiting the scan rate to 10000 packets/second can avoid it, but I'd like to find a solution on the Sitara side: if the x86 host can cause the Sitara to watchdog reset, this is a potential denial-of-service vulnerability in our system.

I ran a test that dumped the interrupt counters in a loop during an nmap port scan.

while [ 1 ]; do echo "---"; date; cat /proc/interrupts; echo; done

---
Sat Jan  1 00:01:29 UTC 2000
           CPU0       
 28:         69      INTC  12  edma
 30:          0      INTC  14  edma_error
 32:          0      INTC  16  TI-am335x-adc
 33:       1411      INTC  17  47400000.dma-controller
 34:       1575      INTC  18  musb-hdrc.0.auto
 35:          0      INTC  19  musb-hdrc.1.auto
 46:        638      INTC  30  4819c000.i2c
 52:       7559      INTC  36  tilcdc
 56:          0      INTC  40  4a100000.ethernet
 57:      13706      INTC  41  4a100000.ethernet
 58:       5675      INTC  42  4a100000.ethernet
 59:          0      INTC  43  4a100000.ethernet
 84:      10238      INTC  68  gp_timer
 86:       1026      INTC  70  44e0b000.i2c
 87:         78      INTC  71  4802a000.i2c
 88:       2180      INTC  72  OMAP UART0
 91:          0      INTC  75  rtc0
 92:          0      INTC  76  rtc0
 93:          0      INTC  77  wkup_m3
 94:          1      INTC  78  wkup_m3_txev
125:          0      INTC 109  53100000.sham
127:          0      INTC 111  48310000.rng
164:          2  44e07000.gpio  20  atmel_mxt_ts
Err:          0
[  115.967766] [sched_delayed] sched: RT throttling activated
---
Sat Jan  1 00:01:35 UTC 2000
           CPU0       
 28:         69      INTC  12  edma
 30:          0      INTC  14  edma_error
 32:          0      INTC  16  TI-am335x-adc
 33:      66962      INTC  17  47400000.dma-controller
 34:      67132      INTC  18  musb-hdrc.0.auto
 35:          0      INTC  19  musb-hdrc.1.auto
 46:        638      INTC  30  4819c000.i2c
 52:       8087      INTC  36  tilcdc
 56:          0      INTC  40  4a100000.ethernet
 57:      13748      INTC  41  4a100000.ethernet
 58:       5696      INTC  42  4a100000.ethernet
 59:          0      INTC  43  4a100000.ethernet
 84:      10887      INTC  68  gp_timer
 86:       1031      INTC  70  44e0b000.i2c
 87:         78      INTC  71  4802a000.i2c
 88:       2465      INTC  72  OMAP UART0
 91:          0      INTC  75  rtc0
 92:          0      INTC  76  rtc0
 93:          0      INTC  77  wkup_m3
 94:          1      INTC  78  wkup_m3_txev
125:          0      INTC 109  53100000.sham
127:          0      INTC 111  48310000.rng
164:          2  44e07000.gpio  20  atmel_mxt_ts
Err:          0

Over these 6 seconds, the DMA and USB0 interrupt counts increased sharply. All other interrupt counts increased only slightly.

33:       1411      INTC  17  47400000.dma-controller
33:      66962      INTC  17  47400000.dma-controller
+65551 interrupts
34:       1575      INTC  18  musb-hdrc.0.auto
34:      67132      INTC  18  musb-hdrc.0.auto
+65557 interrupts

That's nearly 11,000 interrupts/second for each of the two sources, or 1 interrupt every 92 us.
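
The arithmetic above can be reproduced with a small helper. This is a minimal sketch: irq_rate is a hypothetical function name, and the 6 second interval is taken from the two `date` stamps in the dumps, so it is only accurate to about a second.

```shell
# irq_rate BEFORE AFTER SECONDS
# Prints the counter delta, the interrupts per second, and the mean spacing
# between interrupts in microseconds.
irq_rate() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN {
        d = b - a
        if (d <= 0 || s <= 0) { print "need a positive delta and interval"; exit 1 }
        printf "%d interrupts, %.0f/s, %.1f us apart\n", d, d / s, s * 1000000 / d
    }'
}

# musb-hdrc.0.auto counters from the two snapshots, taken 6 seconds apart.
irq_rate 1575 67132 6
```

The same call with 1411 and 66962 covers the dma-controller line.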

I enabled the kernel's function profiler (ftrace) to estimate how much time is spent servicing the USB interrupts.

sitara# echo omap3_intc_handle_irq > set_ftrace_filter
sitara# echo nop > current_tracer
sitara# echo 1 > function_profile_enabled

<Run nmap scan on x86 host>

sitara# echo 0 > function_profile_enabled
sitara# cat trace_stat/function0
Function                  Hit       Time          Avg          s^2
--------                  ---       ----          ---          ---
omap3_intc_handle_irq     115547    9388735 us    81.254 us    3963830 us

I know I am getting an interrupt about every 92 us (measured without kernel tracing), and each one takes 81 us on average to service. That doesn't leave much time to do anything useful!
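
As a rough sanity check, the two figures can be turned into a CPU share. This is only a sketch under stated assumptions: the 92 us spacing was measured without tracing and the 81 us average with the profiler enabled (which adds overhead of its own), and omap3_intc_handle_irq appears to be the top-level interrupt entry point, so its average would cover every interrupt source rather than USB alone. The function name irq_cpu_share is made up for illustration.

```shell
# irq_cpu_share SERVICE_US PERIOD_US
# Prints the fraction of time spent in interrupt handling and the time left
# over between consecutive interrupts.
irq_cpu_share() {
    awk -v svc="$1" -v per="$2" 'BEGIN {
        if (per <= 0) { print "period must be positive"; exit 1 }
        printf "%.1f%% in interrupt handling, %.1f us left per interrupt\n", 100 * svc / per, per - svc
    }'
}

# Average service time from the profiler and mean spacing from /proc/interrupts.
irq_cpu_share 81.254 91.5
```

If those assumptions hold, close to nine tenths of the CPU goes to interrupt handling during the scan, which would be consistent with sensord missing its 5 second kick deadline.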

Do these numbers seem reasonable: an interrupt every 92 us, each taking 81 us to service?
What can be done on the Sitara to reduce the impact of all of these USB interrupts?
Is there a way to limit the bandwidth of the USB interface, to reduce the frequency of the interrupts?

Thanks for any help,
Gregg.