After a tiring, hectic week, when you sit down to binge-watch your
favorite web series or a movie, or listen to tracks that brighten
your day, have you ever wondered about the tech behind the audio?
Yes, the audio, not the visuals.
If you have ever given it a thought and wanted to explore it, this is the place to be. We cover not only the basics of audio codecs but also the various types that have evolved over the years.
Most conversations about streaming jump straight to the visual aspects. We tend to pay attention to crisper picture quality and talk only about pixels, and hardly anyone talks about the audio. Even though it often goes unnoticed, audio is just as important as video for a good experience.
Audio Codec:
The term codec is a portmanteau that combines the words “coder” and “decoder.” A codec is a standard or tool for encoding and decoding multimedia files.
“RAW” or uncompressed audio files are recorded using techniques
that capture as much data as possible. This provides very high quality but
results in very large file sizes that aren’t practical for live streaming.
To make audio files smaller and easier to distribute, we use a
codec.
The first thing a codec does is encode an audio file. This
encoding involves tossing out extra information to reduce file sizes while
maintaining as much quality as possible. This process involves a sequence of
complex mathematical functions.
The second role of a codec is decoding, which turns a previously
encoded audio file back into audio that can be played. To make a complex
process very simple, this means reversing the math done during the encoding
step. With a lossy codec the discarded information cannot be recovered, so the
result is a close approximation of the original rather than an exact copy.
In short, an audio codec is a method for compressing digital
audio to save space and bandwidth, and for decompressing it again for
playback, often alongside video.
An audio codec can be a hardware circuit or a software program
that implements a particular algorithm.
Hardware Audio Codecs:
In hardware, an audio codec refers to a single device that
encodes analog audio as digital signals and decodes digital back into analog.
In other words, it contains both an analog-to-digital converter (ADC)
and a digital-to-analog converter (DAC) running off the
same clock signal. Sound cards that support both audio input and
output use such a device, for instance. Hardware audio codecs send and receive
digital data using buses such as AC-Link, I²S, SPI, and I²C.
Most commonly the digital data is linear PCM, and this is the only format that
most codecs support, although some legacy codecs support other formats such
as G.711 for telephony. Linear pulse-code modulation is a specific type of PCM
in which the quantization levels are linearly uniform. This contrasts with PCM
encodings such as G.711, in which quantization levels vary as a function of
amplitude.
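To make the difference between uniform and amplitude-dependent quantization concrete, here is a minimal Python sketch. It is only an illustration: it uses the continuous mu-law companding formula with mu = 255 rather than the exact segmented encoding that G.711 specifies, and the function names are made up for this example.

```python
import math

MU = 255  # companding constant of the mu-law curve (assumed here for illustration)


def linear_quantize(x, bits=8):
    """Uniform quantization: snap x in [-1, 1] to one of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)
    index = round((x + 1.0) / step)
    return index * step - 1.0


def mu_law_compress(x):
    """Map x in [-1, 1] onto a logarithmic scale that stretches quiet samples."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def mu_law_quantize(x, bits=8):
    """Quantize uniformly in the compressed domain, so levels are denser near zero."""
    return mu_law_expand(linear_quantize(mu_law_compress(x), bits))


quiet_sample = 0.01
linear_error = abs(linear_quantize(quiet_sample) - quiet_sample)
mu_law_error = abs(mu_law_quantize(quiet_sample) - quiet_sample)
print(f"linear error: {linear_error:.6f}, mu-law error: {mu_law_error:.6f}")
```

With the same 8 bits, the companded version lands closer to a quiet sample than the uniform one, at the cost of coarser steps for loud samples. That trade-off suits speech, which is why telephony adopted it.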
Software Codecs:
In software, an audio codec is a computer program implementing
an algorithm that compresses and decompresses digital audio data
according to a given audio file or streaming media audio coding format.
The objective of the algorithm is to represent the high-fidelity audio signal
with a minimum number of bits while retaining quality. This can effectively
reduce the storage space and the bandwidth required for transmission
of the stored audio file.
Most modern audio compression algorithms are based on:
1. Modified discrete cosine transform (MDCT):
The modified discrete cosine transform (MDCT) is a lapped transform based on the
type-IV discrete cosine transform (DCT-IV): it is designed to be performed on
consecutive blocks of a larger dataset, where subsequent blocks are overlapped so
that the last half of one block coincides with the first half of the next block.
This overlapping, in addition to the energy-compaction qualities of the DCT, makes
the MDCT especially attractive for signal compression applications, since it helps
to avoid artifacts stemming from the block boundaries. As a result of these
advantages, the MDCT is the most widely used lossy compression technique in audio
data compression.
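The overlapping described above can be sketched in a few lines of Python. This is a direct, unoptimized form of the textbook MDCT definition with a sine window (real codecs use fast algorithms and quantize the coefficients before storing them). The function names and the toy signal are assumptions made for this example.

```python
import math


def sine_window(n_coeffs):
    """Sine window of length 2N; satisfies w[n]**2 + w[n + N]**2 == 1."""
    size = 2 * n_coeffs
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]


def mdct(block):
    """Forward MDCT: 2N input samples -> N coefficients."""
    n_coeffs = len(block) // 2
    shift = 0.5 + n_coeffs / 2
    return [
        sum(block[n] * math.cos(math.pi / n_coeffs * (n + shift) * (k + 0.5))
            for n in range(2 * n_coeffs))
        for k in range(n_coeffs)
    ]


def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N samples that still contain aliasing."""
    n_coeffs = len(coeffs)
    shift = 0.5 + n_coeffs / 2
    return [
        2.0 / n_coeffs * sum(
            coeffs[k] * math.cos(math.pi / n_coeffs * (n + shift) * (k + 0.5))
            for k in range(n_coeffs))
        for n in range(2 * n_coeffs)
    ]


def analyze_synthesize(signal, n_coeffs):
    """Window, transform, invert, and overlap-add blocks that overlap by 50%."""
    window = sine_window(n_coeffs)
    tail = (-len(signal)) % n_coeffs  # pad up to a whole number of half-blocks
    padded = [0.0] * n_coeffs + list(signal) + [0.0] * (tail + n_coeffs)
    output = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_coeffs + 1, n_coeffs):
        block = [padded[start + n] * window[n] for n in range(2 * n_coeffs)]
        rebuilt = imdct(mdct(block))
        for n in range(2 * n_coeffs):
            # The aliasing in each block cancels against its neighbours here.
            output[start + n] += rebuilt[n] * window[n]
    return output[n_coeffs:n_coeffs + len(signal)]


signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(32)]
rebuilt_signal = analyze_synthesize(signal, n_coeffs=8)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt_signal))
print(f"largest reconstruction error: {max_error:.2e}")
```

Each block of 2N samples yields only N coefficients, yet the overlap-add step recovers the signal up to floating-point rounding. The compression in a real codec comes from quantizing those coefficients, which this sketch deliberately leaves out.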
2. Linear predictive coding:
Linear predictive coding (LPC) is a method used mostly in audio signal
processing and speech processing to represent the spectral envelope of a
digital speech signal in compressed form, using the information of a linear
predictive model. Each sample is predicted as a weighted sum of the samples
before it, so only the weights and the prediction error need to be encoded.
LPC is one of the most powerful speech analysis techniques and one of the most
useful methods for encoding good-quality speech at a low bit rate, and it
provides highly accurate estimates of speech parameters. It is the most widely
used method in speech coding and speech synthesis.
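As a rough illustration of the linear predictive model, the Python sketch below estimates predictor weights with the autocorrelation method and the Levinson-Durbin recursion. This is one common way to do it, not necessarily what any particular codec uses. The function names and the toy signal (built from a known two-tap recurrence so the answer can be checked) are assumptions for this example.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation values r[0..max_lag] of a finite signal."""
    return [
        sum(signal[n] * signal[n - lag] for n in range(lag, len(signal)))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Find weights a[1..order] so that x[n] is approximated by sum(a[i] * x[n - i])."""
    a = [0.0] * (order + 1)
    error = r[0]  # prediction error energy before any prediction
    for i in range(1, order + 1):
        reflection = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = reflection
        for j in range(1, i):
            updated[j] = a[j] - reflection * a[i - j]
        a = updated
        error *= 1.0 - reflection * reflection
    return a[1:], error


def lpc(signal, order):
    """Estimate linear prediction weights and the remaining error energy."""
    return levinson_durbin(autocorrelation(signal, order), order)


# Toy signal: every sample is 1.3 * previous - 0.6 * the one before that.
signal = [1.0, 1.3]
for _ in range(198):
    signal.append(1.3 * signal[-1] - 0.6 * signal[-2])

coefficients, residual_energy = lpc(signal, order=2)
print("estimated weights:", [round(c, 4) for c in coefficients])
```

Because the toy signal really does follow a two-tap rule, the estimated weights come out very close to 1.3 and -0.6. Speech only follows such a rule approximately, so a speech codec also has to encode the prediction error in some compact form.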
Common Software Audio Codecs:
There is a wide range of audio codecs available today. However,
not all audio codecs are equally supported.
Some devices may support one audio codec, but not another. Some
provide better quality, while others focus on compression above all else.
These are important considerations when it comes to deciding on
the best audio codec for a given situation. Let’s go over a few of the most
common and best audio codecs.
1. MP3:
The best-known audio format is probably MP3, which is
technically called MPEG-1 Audio Layer III (later extended as MPEG-2 Audio Layer III).
Originally introduced in the 1990s, MP3 revolutionized digital
audio. Files were much smaller than the previous formats, allowing them to be
streamed and downloaded over the internet.
MP3 also helped push the era of portable digital music past the
CD era by enabling iPods and other early “MP3 players.” It is still widely used
today.
2. AAC:
Developed a few years after MP3, AAC built on the success of
that format but increased compression efficiency.
AAC generally provides better audio quality at the same bitrate
as MP3 or comparable quality at lower bitrates.
AAC has been extended several times, for example with the HE-AAC
profile, which targets low bitrates. It is a patent-licensed format rather than
an open one, but it is probably the most widely used audio codec on the internet
today. It is supported by most video streaming platforms.
3. WAV (LPCM):
WAV, which is short for “Waveform Audio File Format,” was
originally released by Microsoft and IBM in 1991.
It is used primarily on Windows computers to store
uncompressed audio in the LPCM format.
4. AIFF:
AIFF is an Apple format that’s similar to WAV. It stores
uncompressed audio using PCM (pulse-code modulation).
Like WAV files, AIFF files are very large: around 10 MB for one minute
of CD-quality stereo audio.
5. WMA:
Another codec on the market, albeit one that is becoming less
common, is WMA—Windows Media Audio. This codec was developed as an alternative
to MP3 but has become somewhat of a niche product.
6. Opus:
The final audio codec we’ll take a look at is Opus. Opus isn’t
as widely supported as MP3 or AAC yet, but it’s considered a next-generation
codec. It generally provides higher audio quality at a given bit rate than the
other lossy codecs listed here. Opus also has the added advantage of being
royalty-free and open source.
Both iOS and Android now natively support Opus playback.
We’ll likely see Opus getting wider use in the future.
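The figure of roughly 10 MB per minute quoted above for uncompressed audio follows directly from the sample rate, bit depth, and channel count. The short Python sketch below works it out and compares it with a constant-bitrate compressed stream. The CD-quality defaults (44.1 kHz, 16-bit, stereo) and the 128 kbps bitrate are assumptions chosen for illustration, and the function names are hypothetical.

```python
def pcm_size_bytes(seconds, sample_rate=44_100, bit_depth=16, channels=2):
    """Size of the uncompressed PCM audio data, excluding any file header."""
    return seconds * sample_rate * (bit_depth // 8) * channels


def compressed_size_bytes(seconds, bitrate_kbps):
    """Approximate size of a constant-bitrate stream (1 kbps = 1000 bits per second)."""
    return seconds * bitrate_kbps * 1000 // 8


one_minute_pcm = pcm_size_bytes(60)
one_minute_compressed = compressed_size_bytes(60, bitrate_kbps=128)

print(f"uncompressed PCM: {one_minute_pcm / 1e6:.1f} MB per minute")
print(f"128 kbps stream:  {one_minute_compressed / 1e6:.2f} MB per minute")
print(f"ratio: about {one_minute_pcm / one_minute_compressed:.0f}x smaller")
```

Under these assumptions, one minute takes about 10.6 MB uncompressed and just under 1 MB at 128 kbps, which is the kind of saving that makes lossy codecs practical for streaming.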
Best Audio Codec:
We believe that AAC is the best audio codec for most situations. AAC is supported by a wide range of devices and software platforms, including iOS, Android, macOS, Windows, and Linux. Other devices such as Smart TVs and set-top boxes also support AAC.
Besides wide support, AAC also has the advantage of better audio quality compared to MP3. Blind listening tests generally favor AAC over MP3 at comparable bitrates, which makes it a strong choice for general use.
This may change in the future as Opus becomes more broadly
supported. However, hardware and software changes move slowly. That day is
likely still a few years away.
For internet video, AAC is the best audio codec for live
streaming as well as video on demand. The codec is generally chosen in the
settings of your hardware or software encoder.



