Dear gentlemen (any ladies around?)
I work in tech support at a specialty distributor of industrial gadgets. I'm trying to help an integrator customer who uses call-center headsets with a USB interface built around the PCM2912A. The headset is an off-the-shelf product by TIPRO. My contact at the customer is a programmer with quite some background in small-signal electronics; I'm just scratching the surface of Linux guts and basic electronics.
Long story short: under some circumstances, with a varying period of occurrence (about 1-2 days on some sites, maybe a week on others), the PCM2912A seems to send garbage instead of some samples in the data stream. The ADC sampling rate is set to 48 kHz. Upon closer inspection, it seems that every 1 ms (= every 48 samples in the resulting audio recording), a couple of samples are garbled. In one particular example (see URL below) it was 6 samples every 1 ms, but reportedly on another occasion it was 2 samples every 1 ms.
Interestingly, the garbage is not completely random. The data in the "garbage" sequences is actually valid sample data, taken from an offset of 256 samples (512 B) in the past. As 256 is not a multiple of 48 (5 x 48 = 240, 6 x 48 = 288), the stale data does not line up with the 1 ms grid, and the samples that belonged at the garbled positions are lost. Looking very closely at the first garbled sample of each burst, the MSB possibly comes from the correct sample, combined with the LSB of the stale/shifted sample (or the value is just plain garbage).
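For anyone who wants to look for the same pattern in their own recordings: the symptom described above (a short run of samples that equal the samples 256 positions earlier) can be searched for mechanically. Below is a minimal sketch in Python, not a polished tool. It assumes the raw file is headerless 16-bit signed little-endian mono, which matches "256 samples = 512 B". The function names `load_raw_s16le` and `find_stale_samples` are made up for this post. Note that silence or strictly periodic signals will produce false hits, and the first sample of a burst may be missed if its bytes really are mixed.

```python
import array
import sys

FRAME = 48    # samples per 1 ms USB frame at 48 kHz
OFFSET = 256  # observed lag of the stale data, in samples


def load_raw_s16le(path):
    """Load a headerless file as 16-bit signed little-endian mono (an assumption)."""
    with open(path, "rb") as f:
        raw = f.read()
    samples = array.array("h")
    samples.frombytes(raw[: len(raw) // 2 * 2])  # drop a trailing odd byte, if any
    if sys.byteorder == "big":
        samples.byteswap()
    return samples


def find_stale_samples(samples, offset=OFFSET, min_run=2):
    """Return (start, length) for each run where samples[i] == samples[i - offset].

    Runs shorter than min_run are ignored to suppress chance matches.
    """
    runs = []
    start = None
    for i in range(offset, len(samples)):
        if samples[i] == samples[i - offset]:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # Close a run that extends to the end of the data.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - start))
    return runs
```

Usage would be something like `find_stale_samples(load_raw_s16le("distorted.raw"))`, where the file name is hypothetical. If the run starts are spaced exactly `FRAME` (48) samples apart, that matches the once-per-1-ms pattern described above.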
I haven't found a way to attach files in this forum. Here is a link to a raw audio file and two annotated Audacity screenshots:
e-shop.fccps.cz/.../distorted.zip
The problem occurs "at random" (= we do not know the trigger, nor can we reproduce this in a lab yet) and doesn't vanish all by itself - the only way out is to re-start the transfer (close and reopen the data stream).
We've tried to gather some information about USB async isochronous transfers. As far as we can tell, 1 ms is the nominal "SOF clock rate" for full-speed USB, which should effectively correspond to the rate of IN transactions in isochronous transfers. At 48 kSps, each IN payload packet (URB) contains 48 samples (theoretically give or take 1 sample for a marginal clock rate mismatch). The shift of 256 samples seems fairly characteristic, but doesn't "fit in the nominal USB clockwork" - and, as it affects only samples within the payload (rather than whole USB packets), it would seem to come from the 2912A.
We're wondering if perhaps the host goes "out to lunch" past a certain margin, misses a couple of "bus turn-around periods" (N times 1 ms) = fails to send an IN request for a few periods, and the 2912A goofs up: it has nowhere to send the data, and possibly fails to move its "read pointer" forward (in the absence of IN requests for its endpoint, it fails to act on just the SOF packets, which the host controller hardware generates autonomously at precise 1 ms spacing). This potential explanation is just a theory and is possibly imprecise or just plain wrong. We haven't yet captured a "now it happened" moment, as the occurrence is relatively rare, on a production system, and difficult to reproduce.
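To test the "host out to lunch" theory against a software capture, one could scan the timestamps of the isochronous IN completions for gaps longer than the nominal period. The sketch below assumes the timestamps have already been extracted from the capture into a plain list of microsecond values, one per 1 ms frame; how to extract them is not shown, and the function name `find_polling_gaps` is made up for this post. If the driver batches several packets into one URB, `nominal_us` would have to be scaled accordingly. Software timestamps only show host-side delays, not what actually happened on the wire.

```python
def find_polling_gaps(timestamps_us, nominal_us=1000, tolerance=0.5):
    """Return (previous_timestamp, delta, missed_periods) for each oversized gap.

    A gap is reported when the spacing between two consecutive timestamps
    exceeds nominal_us * (1 + tolerance).
    """
    gaps = []
    for prev, cur in zip(timestamps_us, timestamps_us[1:]):
        delta = cur - prev
        if delta > nominal_us * (1 + tolerance):
            # Estimate how many whole nominal periods were skipped.
            missed = round(delta / nominal_us) - 1
            gaps.append((prev, delta, missed))
    return gaps


# Example with made-up timestamps: three periods are missing after t = 3000 us.
print(find_polling_gaps([0, 1000, 2000, 3000, 7000, 8000]))
```

If the onset of the garbled audio coincided with such a gap, that would support the theory; if no gap shows up anywhere near the onset, the cause is more likely elsewhere.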
We're considering some hacks to the Linux kernel (USB audio drivers) to artificially introduce a small "hiccup" in the stream of IN requests, to try and see what happens. If this turns out to be a way to reproduce the problem, we could focus on mitigating "out to lunch" latencies in the host PC (scheduling priorities, SMI suppression, RT patches, compile-time preemption settings and some such - people around RTAI know better).
The call-center application involves running the USB-attached microphones for long periods of time, 24/7. The USB transfer never gets turned off, except for system maintenance. The host PC is running Linux 3.10.17 with ALSA 1.0.27.1. This host PC is effectively an "audio grabbing front end", sending the data further over a network. My customer's software talks straight to the raw ALSA device in user space - there's no PulseAudio or Jack framework at work. The point is that the problem can already be observed at this grabbing PC, by watching the URBs coming from the low-level UHCI/EHCI drivers. Unfortunately we don't have a proper USB HW analyzer, and therefore we have to watch the URBs in software (Wireshark with a patched libpcap with USB support). Wireshark traces are available from us upon request.
Any comments or ideas are welcome...
While doing my homework (gathering knowledge about the USB async isochronous clockwork), I found a beautiful report (memoir) by Hitoshi Kondoh of Burr Brown (now TI), giving some context on how this USB audio product family got started. Very nice reading, and it explains some concepts... unfortunately it is not very relevant to the async transfer sub-mode in the "IN" direction and gives no hint about the buffering details, but it is still pretty dense material.
www.thewelltemperedcomputer.com/.../Hitoshi Kondoh story.pdf
I have also noticed the "UAA guidelines" by Microsoft (from 2006), claiming that the async mode should not work at all on w2k3 or older, which would include XP SP3. Yet we did use the TIPRO headset on XP SP3: not exactly "just fine", but it did work, with a marginal quirk (which we attributed to broken XP drivers for the EHCI/UHCI).
download.microsoft.com/.../uaa_guidelines.doc
Frank Rysanek