Ethernet AVB, USB Audio Class 2.0 aid audio quality

New technology aids audio transport

BY ALI DIXON
Director, Product Management
XMOS
www.xmos.com

Over the last decade the way we find, store, and listen to music has undergone a huge change. For most people, long gone are the Saturday mornings browsing the CD racks (or record stores, if we go further back). Of course, many refuse to change, for a variety of reasons. Some like to look at their music collection on their living room shelves, others love having the CD case for the art and information.

Online media has changed many aspects of the audio industry, but some things just don’t change. The debate over audio quality still goes on: some people prefer vinyl for its warmth, some like CDs, and many are happy to listen to compressed formats such as AAC (Advanced Audio Coding), as found in .m4a files. AAC was designed as the successor to the MP3 format (not to be confused with MPEG-3 or MPEG-4, which are video/audio formats).

Fig. 1: Ethernet AVB fits low-cost, small-form-factor products such as this microphone.

The overall trend is that music no longer lives on shelves or in CD racks, but on hard drives in home computers, and increasingly in the cloud. This brings its own problems, not in the encoding system or the storage technology, but in distributing the audio from the storage media to the speakers.

Some facets of audio quality

CDs are sampled at 44.1 kHz using 16-bit samples: 44.1 kHz x 16 bits = 705.6 kbits/s per channel, or about 1,411 kbits/s for the two stereo channels. CDs are often considered a benchmark for audio quality, but increasingly music producers and consumers are turning to new technologies to deliver higher performance. Many users are looking to higher sample frequencies to improve audio fidelity. Most pro audio equipment now supports sample rates of 96 kHz, and some even supports 192 kHz. Blu-ray, DVD, and HD DVD discs all have the capacity for audio to be distributed at these higher sample frequencies, offering higher quality audio than CDs.

An hour of CD-quality audio in uncompressed WAV format takes about 635 Mbytes. If you double or triple the sample rate to get audiophile quality, you end up with 1.3- and 1.9-Gbyte files. This is OK for audiophile applications, but not very workable for portable audio players and the like.
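
The arithmetic behind these figures can be checked with a short Python sketch. It assumes two-channel linear PCM, takes 1 Mbyte as 10^6 bytes, and ignores the small WAV file header; the function names are illustrative only.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=2):
    """Raw PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def size_mbytes(bit_rate_bps, seconds):
    """Storage in Mbytes (10**6 bytes) for a constant-bit-rate stream."""
    return bit_rate_bps * seconds / 8 / 1e6


cd_rate = pcm_bit_rate(44_100, 16)  # 1,411,200 bits/s for stereo

# One hour of audio at 1x, 2x, and 3x the CD sample rate.
for multiple in (1, 2, 3):
    rate = pcm_bit_rate(44_100 * multiple, 16)
    print(f"{multiple}x CD sample rate: {rate / 1000:.1f} kbits/s, "
          f"{size_mbytes(rate, 3600):.0f} Mbytes per hour")
```

The exact results sit slightly above the rounded values quoted in the text, because the stereo CD rate is 1,411.2 kbits/s rather than a round 1,400 kbits/s.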

The common way to transfer music via the Internet is using MP3 files. MP3 uses lossy compression, which usually degrades audio quality. You can select the amount of compression when the MP3 file is created. Depending on that selection, it may come out at 64, 128, 192, 256, or 320 kbits/s, compared with roughly 1,400 kbits/s for uncompressed WAV files. So, at 128 kbits/s, a one-hour file is only 57.6 Mbytes. That’s much quicker to download to my music player. At least 128-kbits/s MP3 is needed for any high-quality content, and a true audiophile will likely say no MP3 file, at any bit rate, provides high-quality audio.
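
As a rough illustration, the following Python sketch tabulates the size of a one-hour file at each of these bit rates and the resulting reduction relative to CD audio. It assumes constant-bit-rate encoding and 1 Mbyte = 10^6 bytes; the names are illustrative only.

```python
CD_KBITS_PER_S = 1411.2  # uncompressed stereo CD audio: 44.1 kHz x 16 bits x 2


def hour_mbytes(kbits_per_s):
    """Size in Mbytes of one hour of audio at a constant bit rate."""
    return kbits_per_s * 1000 * 3600 / 8 / 1e6


for rate in (64, 128, 192, 256, 320):
    print(f"{rate} kbits/s: {hour_mbytes(rate):.1f} Mbytes per hour, "
          f"about {CD_KBITS_PER_S / rate:.1f}:1 smaller than CD audio")
```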

AAC stands for advanced audio coding and is the logical successor to MP3 for audio coding at medium to high bit rates. It is a compressed audio format similar to MP3, but offers several performance improvements, including a higher coding efficiency for both stationary and transient signals, a simpler filter bank, and better handling of frequencies above 16 kHz. As a result, AAC generally maintains higher quality than MP3 at the same bit rate.

High-quality audio probably demands data rates of at least 128 kbits/s in AAC format. AAC at 256 kbits/s is the default encoding used by Apple iTunes and the iTunes Music Store. Increasing demand for higher quality audio formats is leading to common use of alternative audio encodings.

FLAC (Free Lossless Audio Codec) is a format increasingly used by audio enthusiasts. It compresses digital audio losslessly, typically reducing file size by 30% to 50% for most music, and by significantly more for a simple voice recording. A number of other audio codec “standards” are also in use.

Dynamic range

Dynamic range is another key measure of audio quality, referring to the range between the quietest and loudest signals. Each bit of sample resolution adds about 6 dB, so the 16-bit data used on CDs gives each audio sample 96 dB of dynamic range. The dynamic range of human hearing is about 140 dB, so the use of 24-bit (144 dB) data is increasing, especially in recording and mastering. These factors drive the requirement for external devices that enable recording and playback at higher sample frequencies and bit depths, and for a reliable mechanism for connecting those devices to the computer.
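
The 96-dB and 144-dB figures follow from the ratio between the largest and smallest values a linear PCM sample can represent. A minimal Python sketch, assuming ideal linear quantization and ignoring noise shaping and dither:

```python
import math


def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)


for bits in (16, 24):
    print(f"{bits}-bit samples: {dynamic_range_db(bits):.1f} dB")
```

Each extra bit doubles the number of levels, which adds 20 * log10(2), or about 6.02 dB.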

Built-in computer audio hardware, however, rarely takes advantage of higher sample frequencies, which has led to significant growth in the computer accessories market for products such as USB digital audio converters.

USB audio

USB is ubiquitous and well suited to audio. USB 2.0 High Speed and USB Audio Class 2.0 (a standard device class specification, so compliant devices can work with a generic class driver) include a number of features to ensure the audio is transferred reliably and with high quality:

High sample frequencies: USB Audio Class 2.0 enables 192-kHz, and even 384-kHz, 24-bit audio.
Low latency: Latency is particularly important when interaction is involved. If a singer sings into a digital microphone and hears himself or herself through monitor speakers, a delay of as little as 10 ms can be heard.
Asynchronous clocking: In some devices, the PC is the master of the audio clock used by the D/A converters. With asynchronous clocking, the audio clock stays local to the converters and the PC synchronizes to this local clock, which helps to minimize jitter.
Many audio channels: Ideal for surround sound and for professional applications where recording multiple audio channels is required.
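
To give a feel for the numbers behind these features, the Python sketch below estimates the payload rate of a few audio streams against the nominal 480-Mbits/s USB 2.0 High Speed signalling rate. It assumes 24-bit samples packed into 3 bytes and ignores all protocol overhead, so it is a rough sizing exercise rather than a model of a real USB Audio Class 2.0 endpoint; the function names are illustrative only.

```python
HIGH_SPEED_BPS = 480_000_000    # nominal USB 2.0 High Speed signalling rate
MICROFRAMES_PER_SECOND = 8_000  # one High Speed microframe every 125 microseconds


def stream_bps(sample_rate_hz, channels, bytes_per_sample=3):
    """Audio payload rate in bits per second (3-byte packing is an assumption)."""
    return sample_rate_hz * channels * bytes_per_sample * 8


def bytes_per_microframe(sample_rate_hz, channels, bytes_per_sample=3):
    """Average audio payload carried in each 125-microsecond microframe."""
    return sample_rate_hz / MICROFRAMES_PER_SECOND * channels * bytes_per_sample


def samples_in_delay(sample_rate_hz, delay_ms):
    """Number of sample periods that fit in a given delay."""
    return sample_rate_hz * delay_ms / 1000


for rate_hz, channels in ((192_000, 2), (384_000, 2), (96_000, 8)):
    bps = stream_bps(rate_hz, channels)
    print(f"{channels} ch at {rate_hz / 1000:.0f} kHz: {bps / 1e6:.3f} Mbits/s "
          f"({100 * bps / HIGH_SPEED_BPS:.1f}% of nominal link rate), "
          f"{bytes_per_microframe(rate_hz, channels):.0f} bytes per microframe")

print(f"A 10-ms delay spans {samples_in_delay(192_000, 10):.0f} samples at 192 kHz")
```

Even at the highest sample rates, the payload is a small fraction of the nominal link rate, which is what leaves room for many channels.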

Ethernet audio/video

Performance requirements for audio distribution can be more important in live sound and professional applications. A common example is standing in an airport listening to announcements, and not being able to hear a thing due to the delay between the various speakers. In a recording studio, the requirements are even more demanding. Recalling the example of the singer, a delay between singing words and hearing them sets a very stringent requirement on a network. Add to that the reliability requirements of a recording studio, where a single dropped sample can ruin a recording — the network must deliver performance guarantees.

Ethernet Audio Video Bridging (AVB) addresses this problem by providing reliable transport of audio across Ethernet networks. AVB is a collection of IEEE standards that augment Ethernet to provide the functionality needed for AV distribution in professional and consumer applications, and it is even finding its way into automotive systems.

Benefits include:

Reservation of bandwidth: Ensures the network has sufficient bandwidth to deliver audio from point A to point B. The network guarantees that the reserved bandwidth will be available, and prevents other traffic from using more bandwidth than it reserved.
Connection management: Detects AVB endpoints on the network, to facilitate AV network setup.
Clock synchronization: Allows endpoints to synchronize, ensuring multiple speakers play the same audio at precisely the same time.
Low latency: AVB guarantees a maximum latency of 2 ms between endpoints in the network. This performance level allows AV content to be distributed across large networks.
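
The idea behind bandwidth reservation can be sketched as a simple admission check on a single link. The Python below is a toy model only, not the AVB reservation protocol itself: the `Link` class, the stream names, and the 75% reservable fraction are assumptions made for illustration, and the stream rate counts audio payload only, without packet overhead.

```python
class Link:
    """Toy model of one network link with a reservable bandwidth budget."""

    def __init__(self, capacity_bps, reservable_fraction=0.75):
        # reservable_fraction is an assumption for this sketch; the rest of
        # the link stays available for ordinary, unreserved traffic.
        self.budget_bps = capacity_bps * reservable_fraction
        self.reserved = {}  # stream name -> reserved bits per second

    def reserve(self, stream, bps):
        """Admit the stream only if the remaining budget can carry it."""
        if stream in self.reserved:
            return False
        if sum(self.reserved.values()) + bps > self.budget_bps:
            return False
        self.reserved[stream] = bps
        return True

    def release(self, stream):
        """Free the bandwidth held by a stream, if it has a reservation."""
        self.reserved.pop(stream, None)


# Eight channels of 48-kHz, 24-bit audio per stream (payload only).
STREAM_BPS = 8 * 48_000 * 24

link = Link(capacity_bps=100_000_000)
for i in range(9):
    admitted = link.reserve(f"stream-{i}", STREAM_BPS)
    print(f"stream-{i}: {'admitted' if admitted else 'rejected'}")
```

Because a stream is refused up front when the budget is exhausted, the streams that were admitted keep their bandwidth no matter how much other traffic arrives later.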

Ethernet AVB endpoints can be extremely lightweight, using embedded microcontrollers to provide AVB connectivity. Low-cost devices allow its use in cost-sensitive, small-form-factor applications from mixing desks to microphones and stereos to speakers.

Audio transport flexibility

Due to the large variety of standards, interfaces, and formats used within audio applications, technology to support audio connectivity often requires flexibility. ICs provided for this task should be programmable to allow the user to define functionality. AVB, USB, I2S, S/PDIF, and other interfaces can be selected for easy customization.

The 500-MIPS eight-thread XS1-L1 processor from XMOS can be used for USB Audio 2.0. XMOS also has a reference design using the low-cost XMOS XS1-L2 processor as the basis for running a software-only implementation of an Ethernet AVB audio endpoint, capable of both talker and listener modes, running up to eight duplex audio channels. It integrates the full range of AVB protocols, digital audio interfaces, and control software in a single board with a range of analog and digital I/O. Since the XMOS devices are programmable, they allow the user to define the functionality of the device. ■
