Some time ago, I read the statement that the sample rate of a DAC is set by the audio file you play. It then occurred to me that this is, sort of, wrong. Let me explain why, using a turntable for comparison.
The turntable speed is set by you, after you have looked at the label on the record (or at its size). You can put on a single and play that “file” at LP speed. With digital audio, it’s the same. Set the DAC to a 44.1 kHz sample rate and feed it a 96 kHz file. The DAC will process 44,100 samples per second, thus taking roughly two seconds to play one second of the 96 kHz file. There’s one big difference compared to the turntable: the 96 kHz file identifies itself as such, forcing the DAC to switch to the corresponding sample rate. It’s like an automatic turntable that detects a single and switches to 45 RPM all by itself.
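The slowdown is just the ratio of the two rates; a quick sketch of the arithmetic:

```python
# If a DAC clocked at 44,100 samples/s is fed a 96,000 samples/s file
# (and nothing resamples in between), each second of audio stretches
# by the ratio of the two rates.

file_rate = 96000   # Hz, the file's sample rate
dac_rate = 44100    # Hz, the rate the DAC is actually running at

stretch = file_rate / dac_rate
print(f"{stretch:.2f} s of wall time per second of audio")  # ~2.18
```

So “two seconds” above is an approximation; the exact figure is about 2.18 seconds.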
There might be a (very) small difference between the DAC speed and the file sample rate. With the turntable, you can correct this, remember the stroboscope? With digital audio, the DAC might run at 44.102 kHz while playing a 44.100 kHz file. It’s unlikely that you will hear this. Note that when using S/PDIF style audio streams, the DAC follows the clock embedded in the stream. But somewhere that S/PDIF stream is created, and there the DAC sample rate is set.
When Logitech Media Server is playing a file, Squeezelite receives a TCP data stream and sends it to the ALSA interface, which reads the samples, all clocked by the audio hardware. The timing of the TCP stream does not impact audio playback; only the ALSA driver and the audio hardware do.
I found it hard to find the actual sample rate of my USB audio interface. The command below shows the current audio interface setting. On the turntable, this would be the speed switch, set to 45 RPM for example. That does not mean the record is playing at exactly 45 RPM.
debian@beaglebone:/proc/asound/card1/pcm0p/sub0$ cat hw_params
access: MMAP_INTERLEAVED
format: S32_LE
subformat: STD
channels: 2
rate: 44100 (44100/1)
period_size: 441
buffer_size: 1764
The command below shows the actual sample rate that the USB audio interface reports back to the ALSA driver. More precisely: “Every few milliseconds the average sample rate over the last period is reported back as a 16.16 bit fixed point number”, as described in detail here. In our case, the interface runs 2 Hz slow, measured against the PC clock. Like the stroboscope on the turntable, which measures the platter speed against the 50 Hz mains frequency, slowly drifting left (or right?).
debian@beaglebone:/proc/asound/card1$ cat stream0
Singxer USB Audio 2.0(SU1) at usb-musb-hdrc.1-1, high speed : USB Audio

Playback:
  Status: Running
    Interface = 1
    Altset = 1
    Packet Size = 72
    Momentary freq = 44098 Hz (0x5.8320)
    Feedback Format = 16.16
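The hex value in parentheses and the Hz figure are the same number in two notations. On a high-speed USB link the feedback value counts samples per 125 µs microframe, so multiplying by 8000 microframes per second gives the rate in Hz. A small sketch of the decoding (the scaling to microframes is my reading of the USB audio feedback mechanism, not something stated in the stream0 output itself):

```python
# Decode a 16.16 fixed-point USB audio feedback value. At high speed,
# the value is samples per 125 us microframe; there are 8000 microframes
# per second. 0x5.8320 is the value from the stream0 output above.

def feedback_to_hz(integer_part: int, fraction_part: int) -> float:
    samples_per_microframe = integer_part + fraction_part / 0x10000
    return samples_per_microframe * 8000

rate = feedback_to_hz(0x5, 0x8320)
print(round(rate))  # -> 44098
```

That is exactly the “Momentary freq = 44098 Hz” the driver prints.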
USB audio data is transferred using USB isochronous transfer mode. Each frame contains up to 1024 bytes and a CRC checksum, but there is no resend mechanism. So: data corrupted, frame dropped. Yes, bits are bits and USB cable quality has no impact on audio. Sure… so then why do CRC and resend mechanisms exist at all? Because bits are “implemented” as analog voltages, susceptible to noise and bad contacts, just like any other analog signal. TCP has checksums and resends, and SATA has CRC and recovery mechanisms as well. But for real-time applications, a resend is considered a bad thing.
How about internet radio streams? Well, let’s take the measured sample rate of 44.098 kHz. The 2 Hz offset translates to about 4.5 seconds of error per 100,000 seconds, or roughly 4 seconds per day. Assuming a 3 second buffer for the stream data, the buffer drifts out of alignment in less than a day of continuous streaming. Then you will get a “tick” and the stream is aligned again. Not really an issue for most. The only way to avoid this is sample rate conversion, correcting the roughly 45 ppm error. Good that Squeezelite does not do SRC..
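A quick sanity check on those numbers, starting from the 44,098 Hz measured above:

```python
# Back-of-the-envelope drift calculation for a DAC measured at 44,098 Hz
# playing a stream encoded at 44,100 Hz.

nominal = 44100   # Hz, the stream's sample rate
actual = 44098    # Hz, the measured DAC rate

ppm = (nominal - actual) / nominal * 1e6
print(f"{ppm:.1f} ppm")                      # ~45.4 ppm

drift_per_day = (nominal - actual) / nominal * 86400
print(f"{drift_per_day:.1f} s/day")          # ~3.9 seconds per day

buffer_seconds = 3
days_until_glitch = buffer_seconds / drift_per_day
print(f"{days_until_glitch:.2f} days")       # under a day with a 3 s buffer
```

Note the direction also matters: a slow DAC consumes samples slower than the stream delivers them, so the buffer fills up rather than runs dry, but either way the slack is exhausted at the same rate.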