Tool/software: Linux
I am using the Sitara AM3354 as a USB device connected to an x86 host with the Linux USB CDC-EEM (Ethernet-over-USB) device stack. When doing an nmap port scan of the Sitara USB device endpoint from the x86 host, the Sitara will frequently report "RT throttling activated" and will sometimes watchdog reset.
I am speculating that the volume of USB interrupts generated by the nmap port scan prevents daemons running on the Sitara from periodically updating their "kick" files, and the watchdog daemon detects this and resets the Sitara. Here is an example where the sensor monitoring daemon (sensord) was unable to "touch" its sensord.kick file for 5 seconds and the watchdog daemon let the hardware watchdog expire to reset the system.
Dec 5 18:20:41:Jan 1 00:00:54 127.4.2.2 err watchdog[1247]: file /var/run/sensord.kick was not changed in 5 seconds.
Dec 5 18:20:41:Jan 1 00:00:55 127.4.2.2 warning kernel: [ 59.967841] [sched_delayed] sched: RT throttling activated
Dec 5 18:20:41:Jan 1 00:00:55 127.4.2.2 err watchdog[1247]: repair binary /usr/sbin/capture_logs_before_shutdown.sh returned 255
Dec 5 18:20:41:Jan 1 00:00:55 127.4.2.2 alert watchdog[1247]: shutting down the system because of error 255
Dec 5 18:20:51:Jan 1 00:01:05 127.4.2.2 notice watchdog[1247]: hardware watchdog enabled, let it expire.
Dec 5 18:20:51:Jan 1 00:01:05 127.4.2.2 alert watchdog[1247]: watchdog set to 1 second
The issue seems to depend on the rate of port scanning done by nmap. Limiting the nmap scan rate to 10000 packets/second avoids the issue, but I'd like to find a solution on the Sitara side. This is a potential denial-of-service vulnerability in our system if the x86 can cause the Sitara to watchdog reset.
I ran a test that checked the interrupt counters while running an nmap port scan:
while [ 1 ]; do echo "---"; date; cat /proc/interrupts; echo; done
In a period of 6 seconds, the DMA and USB0 interrupts had significant increases. All other interrupts had small increases.
That's nearly 11,000 interrupts/second, or 1 interrupt every 92 us.
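For anyone who wants to reproduce the rate calculation, here is a minimal sketch that diffs two saved /proc/interrupts snapshots instead of eyeballing the loop output. The function name irq_rate and the snapshot file names are made up for illustration, and it assumes a single-core part like the AM335x, so only the first count column is read.

```shell
#!/bin/sh
# irq_rate BEFORE_FILE AFTER_FILE SECONDS
# Prints "<irq label> <delta> <interrupts per second>" for every IRQ line
# present in both snapshots. Only the first count column (CPU0) is used.
irq_rate() {
    awk -v secs="$3" '
        # First file: remember the count for each label such as "17:".
        NR == FNR { if ($1 ~ /:$/) before[$1] = $2; next }
        # Second file: report the increase and the per-second rate.
        ($1 ~ /:$/) && ($1 in before) {
            delta = $2 - before[$1]
            printf "%s %d %.0f\n", $1, delta, delta / secs
        }
    ' "$1" "$2"
}

# Example use on the target (hypothetical file names):
#   cat /proc/interrupts > /tmp/irq.before
#   sleep 6
#   cat /proc/interrupts > /tmp/irq.after
#   irq_rate /tmp/irq.before /tmp/irq.after 6
```

Sorting the output on the third column should make the DMA and USB0 lines stand out from the rest.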
I enabled function profiling in the kernel to try to measure how much time is spent servicing the USB interrupts:
sitara# echo omap3_intc_handle_irq > set_ftrace_filter
sitara# echo nop > current_tracer
sitara# echo 1 > function_profile_enabled
<Run nmap scan on x86 host>
sitara# echo 0 > function_profile_enabled
sitara# cat trace_stat/function0
  Function                    Hit       Time          Avg          s^2
  --------                    ---       ----          ---          ---
  omap3_intc_handle_irq    115547    9388735 us    81.254 us    3963830 us
I know I am getting an interrupt about every 92 us (without kernel tracing), and each one takes 81 us on average to service (with tracing enabled). That doesn't leave much time to do anything useful!
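To put a number on that, here is a rough sketch of the arithmetic: the share of CPU time spent in the interrupt handler is the average service time divided by the interrupt period. The helper name irq_load_percent is made up for illustration. The two inputs come from different runs (the 81.254 us average was measured with profiling enabled, which adds overhead of its own), so treat the result as an upper-bound estimate rather than a measurement.

```shell
#!/bin/sh
# irq_load_percent SERVICE_US PERIOD_US
# Prints the percentage of CPU time consumed by interrupt handling,
# computed as 100 * service time / interrupt period.
irq_load_percent() {
    awk -v svc="$1" -v period="$2" 'BEGIN { printf "%.1f\n", 100 * svc / period }'
}

# Figures from the profile above: 81.254 us per interrupt, one every 92 us.
irq_load_percent 81.254 92   # prints 88.3
```

So by this estimate roughly 88% of the CPU goes to interrupt handling during the scan, which would be consistent with the daemons missing their 5-second kick deadline.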
Do these numbers seem reasonable: an interrupt every 92 us, with each one taking 81 us to service?
What can be done on the Sitara to reduce the impact of all of these USB interrupts?
Is there a way to limit the bandwidth of the USB interface, to reduce the frequency of the interrupts?
Thanks for any help,
Gregg.