Audio Codec

 

After a tiring and hectic week, when you sit down to binge-watch your favorite web series or movie, or listen to tracks that leave you feeling relieved, have you ever wondered about the tech behind the audio? Yes, the audio, not the visuals.




If you have ever given it a thought and wanted to explore it, this is the place to be. We cover not only the basics of audio codecs but also the various types that have evolved over the years.

Most conversations about streaming jump straight to the visual aspects. We tend to focus on crisper picture quality and talk only about pixels, and hardly anyone talks about the audio. Even though it often goes unnoticed, audio is just as important as video for a good experience.

Audio Codec:

The term codec is a portmanteau that combines the words “coder” and “decoder.” A codec is a standard or tool for encoding and decoding multimedia files.

“RAW” or uncompressed audio files are recorded using techniques that capture as much data as possible. This provides very high quality but results in very large file sizes that aren’t practical for live streaming.

To make audio files smaller and easier to distribute, we use a codec.

The first thing a codec does is encode an audio file. This encoding involves tossing out extra information to reduce file sizes while maintaining as much quality as possible. This process involves a sequence of complex mathematical functions.

The second role of a codec is decoding, which is essentially playing back an audio file that has previously been encoded. To make a complex process very simple, this means reversing the math done during the encoding step. For a lossy codec, the information discarded during encoding cannot be recovered, so the result is a close approximation of the original rather than an exact copy.

In short, an audio codec is a scheme for compressing digital audio to save space and for decompressing it again for playback, often alongside video.

An audio codec can be a hardware circuit or a software program that implements a particular algorithm.

Hardware Audio Codecs:

In hardware, an audio codec is a single device that encodes analog audio as digital signals and decodes digital signals back into analog. In other words, it contains both an analog-to-digital converter (ADC) and a digital-to-analog converter (DAC) running off the same clock signal. Sound cards that support both audio in and audio out use one, for instance. Hardware audio codecs send and receive digital data over buses such as AC-Link, I²S, SPI, and I²C. Most commonly the digital data is linear PCM (LPCM), a specific type of pulse-code modulation in which the quantization levels are linearly uniform. This contrasts with PCM encodings in which the quantization levels vary as a function of amplitude, such as the A-law and μ-law companding used by G.711. LPCM is the only format that most codecs support, but some legacy codecs also support other formats such as G.711 for telephony.
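To make the difference between uniform and amplitude-dependent quantization concrete, here is a minimal Python sketch. The function names are made up for this illustration, samples are assumed to be floats in the range -1.0 to 1.0, and the μ-law part uses the continuous companding curve with μ = 255 rather than the exact 8-bit encoding that G.711 specifies.

```python
import math

def lpcm_quantize(x, bits=4):
    """Map a sample in [-1.0, 1.0] to a signed integer code using equal-sized steps."""
    levels = 2 ** (bits - 1)                      # 8 for 4-bit audio
    code = round(x * levels)
    return max(-levels, min(levels - 1, code))    # clamp to the representable range

def lpcm_dequantize(code, bits=4):
    """Turn an integer code back into a sample value."""
    return code / 2 ** (bits - 1)

def mu_law_compress(x, mu=255):
    """Continuous mu-law curve: stretches quiet samples before quantization."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=255):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)

if __name__ == "__main__":
    # With 4-bit LPCM every step is 1/8 wide, so 0.3 lands on code 2 (0.25).
    print(lpcm_quantize(0.3), lpcm_dequantize(lpcm_quantize(0.3)))
    # A very quiet sample (0.01) rounds to 0 in 4-bit LPCM ...
    print(lpcm_quantize(0.01))
    # ... but mu-law first maps it to a much larger value, so it survives quantization.
    print(lpcm_quantize(mu_law_compress(0.01)))
```

The takeaway is that LPCM spends its levels evenly across the amplitude range, while a companded format spends more of them on quiet sounds, which is why it suited low-bitrate telephony.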


Figure: 4-bit LPCM





Figure: Sound card

Software Codecs:

In software, an audio codec is a computer program implementing an algorithm that compresses and decompresses digital audio data according to a given audio file or streaming media audio coding format. The objective of the algorithm is to represent the high-fidelity audio signal with a minimum number of bits while retaining quality. This can effectively reduce the storage space and the bandwidth required for transmission of the stored audio file. 
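A quick back-of-the-envelope calculation shows what "fewer bits" means for storage and bandwidth. The sketch below is only illustrative: the 128 kbps target bitrate and the CD-quality source (44.1 kHz, 16-bit, stereo, which works out to 1411.2 kbps) are assumptions chosen for the example, and the function names are hypothetical.

```python
CD_PCM_KBPS = 1411.2  # assumed source: 44.1 kHz * 16 bits * 2 channels, in kilobits per second

def encoded_size_bytes(bitrate_kbps, seconds):
    """Approximate payload size of a constant-bitrate stream, ignoring container overhead."""
    return bitrate_kbps * 1000 * seconds / 8

def compression_ratio(bitrate_kbps, source_kbps=CD_PCM_KBPS):
    """How many times smaller the encoded stream is than the uncompressed source."""
    return source_kbps / bitrate_kbps

if __name__ == "__main__":
    one_minute = encoded_size_bytes(128, 60)
    print(f"One minute at 128 kbps: {one_minute / 1e6:.2f} MB")
    print(f"Compression ratio vs. CD-quality PCM: {compression_ratio(128):.1f}x")
```

The same ratio applies to the bandwidth a stream needs, which is why the choice of codec and bitrate matters so much for live streaming.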

Most modern audio compression algorithms are based on:

1.     Modified discrete cosine transform (MDCT):

The modified discrete cosine transform (MDCT) is a transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where subsequent blocks are overlapped so that the last half of one block coincides with the first half of the next block. This overlapping, in addition to the energy-compaction qualities of the DCT, makes the MDCT especially attractive for signal compression applications, since it helps to avoid artifacts stemming from the block boundaries. As a result of these advantages, the MDCT is the most widely used lossy compression technique in audio data compression.
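The overlapping idea is easier to see in code. The following is a small, unoptimized sketch of the MDCT and its inverse written straight from the textbook definition, with a sine window and overlap-add. The helper names, the zero-padding at the edges, and the direct O(N²) summation are choices made for this example; real codecs use FFT-based implementations and quantize the coefficients between the two steps.

```python
import math

def sine_window(N):
    """Length-2N window with w[n]^2 + w[n+N]^2 = 1 (the Princen-Bradley condition)."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(block):
    """Transform 2N time samples into N coefficients."""
    N = len(block) // 2
    return [sum(block[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs):
    """Transform N coefficients back into 2N time-aliased samples (2/N scaling for windowed use)."""
    N = len(coeffs)
    return [(2.0 / N) * sum(coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                            for k in range(N))
            for n in range(2 * N)]

def mdct_roundtrip(signal, N):
    """Analyze and resynthesize a signal using 50%-overlapping blocks of 2N samples."""
    if len(signal) % N != 0:
        raise ValueError("signal length must be a multiple of N in this sketch")
    w = sine_window(N)
    padded = [0.0] * N + list(signal) + [0.0] * N   # so every real sample sits in two blocks
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):   # hop by N: half a block
        block = [padded[start + n] * w[n] for n in range(2 * N)]
        rec = imdct(mdct(block))
        for n in range(2 * N):
            out[start + n] += rec[n] * w[n]              # windowed overlap-add
    return out[N:N + len(signal)]

if __name__ == "__main__":
    sig = [math.sin(0.3 * n) for n in range(16)]
    rec = mdct_roundtrip(sig, 4)
    print(max(abs(a - b) for a, b in zip(sig, rec)))     # tiny: only rounding error remains
```

Each block on its own comes back with aliasing, because 2N samples were squeezed into N coefficients. The aliasing from neighbouring blocks has opposite sign and cancels in the overlap-add, so the transform adds no redundancy despite the overlap.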

2.     Linear predictive coding:

Linear predictive coding (LPC) is a method used mostly in audio and speech processing to represent the spectral envelope of a digital speech signal in compressed form, using the parameters of a linear predictive model: each sample is predicted as a weighted sum of the samples before it, and only the weights and the prediction error need to be coded. It is one of the most powerful speech analysis techniques and one of the most useful methods for encoding good-quality speech at a low bit rate, and it provides highly accurate estimates of speech parameters. LPC is the most widely used method in speech coding and speech synthesis.
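As a rough illustration of what a linear predictive model looks like in practice, the sketch below estimates predictor coefficients with the autocorrelation method and the Levinson-Durbin recursion, one common way to do it. The function names are invented for this example, and the test signal is a synthetic decaying oscillation rather than real speech.

```python
def autocorrelation(x, max_lag):
    """r[lag] = sum over n of x[n] * x[n - lag], for lag = 0..max_lag."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for a[1..order], where x[n] is predicted as sum(a[i] * x[n-i])."""
    a = [0.0] * (order + 1)
    err = r[0]                                   # prediction error energy so far
    if err == 0:
        return a[1:], 0.0                        # silent input: nothing to predict
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err   # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1 - k * k)
    return a[1:], err

def lpc(signal, order):
    """Return (coefficients, residual_energy) for the given prediction order."""
    return levinson_durbin(autocorrelation(signal, order), order)

if __name__ == "__main__":
    # Synthetic signal that follows x[n] = 1.5*x[n-1] - 0.7*x[n-2] after an initial impulse.
    x = [1.0, 1.5]
    for _ in range(198):
        x.append(1.5 * x[-1] - 0.7 * x[-2])
    coeffs, residual = lpc(x, 2)
    print(coeffs)   # close to [1.5, -0.7]
```

A speech codec would run this on short frames (typically tens of milliseconds) and transmit a compact form of the coefficients plus a description of the excitation, instead of the samples themselves.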

Common Software Audio Codecs:

There is a wide range of audio codecs available today. However, not all audio codecs are equally supported.

Some devices may support one audio codec, but not another. Some provide better quality, while others focus on compression above all else.

These are important considerations when it comes to deciding on the best audio codec for a given situation. Let’s go over a few of the most common and best audio codecs.

1. MP3:

The most well-known audio format is probably MP3, which is technically called MPEG-1 Audio Layer III (later extended as MPEG-2 Audio Layer III).

Originally introduced in the 1990s, MP3 revolutionized digital audio. Files were much smaller than the previous formats, allowing them to be streamed and downloaded over the internet.

MP3 also helped push the era of portable digital music past the CD era by enabling iPods and other early “MP3 players.” It is still widely used today.

2. AAC:

Developed a few years after MP3, AAC built on the success of that format but increased compression efficiency.

AAC generally provides better audio quality at the same bitrate as MP3 or comparable quality at lower bitrates.

AAC has been extended several times, and HE-AAC (High-Efficiency AAC) is a widely used extension aimed at low bitrates. AAC is not royalty-free, since implementations are subject to patent licensing, but it is probably the most widely used audio codec on the internet today. It is supported by most video streaming platforms.

3. WAV (LPCM)

WAV, which is short for “Waveform Audio File Format,” was introduced by Microsoft and IBM in 1991.

It is primarily used on Windows computers to store uncompressed audio in the LPCM format.

4. AIFF:

AIFF (Audio Interchange File Format) is a format developed by Apple that’s similar to WAV. It stores uncompressed audio as PCM (pulse-code modulation).

Like WAV, AIFF files are very large—around 10 MB for one minute of a standard audio recording.
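The 10 MB figure is easy to check with a little arithmetic. The sketch below assumes that a "standard audio recording" means CD quality (44.1 kHz sample rate, 16-bit samples, two channels), which is an assumption made for this example, and it ignores the small file header.

```python
def pcm_size_bytes(seconds, sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Size of the raw PCM payload: samples per second * bytes per sample * channels * duration."""
    return int(seconds * sample_rate_hz * (bit_depth // 8) * channels)

if __name__ == "__main__":
    size = pcm_size_bytes(60)
    print(size)                          # 10584000 bytes for one minute
    print(f"{size / 1e6:.2f} MB")        # about 10.58 MB
```

Doubling the sample rate or moving from 16-bit to 24-bit samples scales the size proportionally, which is why uncompressed formats are usually reserved for recording and editing rather than distribution.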

5. WMA:

Another codec on the market, albeit one that is becoming less common, is WMA—Windows Media Audio. This codec was developed as an alternative to MP3 but has become somewhat of a niche product. 

6. Opus:

The final audio codec we’ll take a look at is Opus. Opus is not yet as universally supported as MP3 or AAC, but it’s considered a next-generation codec. In listening tests it generally delivers equal or better quality than the other lossy codecs listed here at a given bitrate. Opus also has the added advantage of being royalty-free and open source.

Both iOS and Android now natively support Opus playback. We’ll likely see Opus getting wider use in the future.

   
Quick Comparison

Best Audio Codec:

We believe that AAC is the best audio codec for most situations. AAC is supported by a wide range of devices and software platforms, including iOS, Android, macOS, Windows, and Linux. Other devices such as Smart TVs and set-top boxes also support AAC.

Besides wide support, AAC also has the advantage of better audio quality compared to MP3. Blind listening tests generally show that AAC is the best codec available for general use. 

This may change in the future as Opus becomes more broadly supported. However, hardware and software changes move slowly. That day is likely still a few years away.

For internet video, AAC is the best audio codec for live streaming as well as video on demand. This is generally configured via settings in your hardware and software encoder.
