A technical discussion on the mp3 format (not the illegal download debate)

skunky_funk · October 12, 2005

So can anyone tell me how I can explain this to a bunch of Sound Design students of mine in layman's terms?

To start my proposed lecture, I'll tell them that we commonly use PCM WAV as our recording format, which lets me explain how a digital waveform is formed. A CD has a 44.1 kHz sampling rate and a bit resolution of 16 bits. That means the digital signal is composed of discrete samples, 44,100 of them in one second. As far as bit depth is concerned, each digital sample can take one of 2^16 (65,536) different positions (or volume levels), from -32768 through 0 to 32767.
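For anyone who wants to show those numbers to a class, here is a minimal Python sketch of sampling and 16-bit quantization. The function names (`quantize`, `sine_samples`) and the 440 Hz test tone are just illustrative choices, not part of any audio library. Note that a signed 16-bit sample runs from -32768 to 32767, which is 65,536 values once zero is counted.

```python
import math

SAMPLE_RATE = 44_100   # samples per second (CD audio)
BIT_DEPTH = 16         # bits per sample

def quantize(value, bit_depth=BIT_DEPTH):
    """Map a float in [-1.0, 1.0] to a signed integer sample."""
    max_level = 2 ** (bit_depth - 1) - 1      # 32767 for 16-bit
    min_level = -(2 ** (bit_depth - 1))       # -32768 for 16-bit
    level = round(value * max_level)
    # Clamp so out-of-range input cannot overflow the sample format.
    return max(min_level, min(max_level, level))

def sine_samples(freq_hz, seconds, sample_rate=SAMPLE_RATE):
    """Return the quantized samples of a full-scale sine tone."""
    count = int(sample_rate * seconds)
    return [quantize(math.sin(2 * math.pi * freq_hz * n / sample_rate))
            for n in range(count)]

one_second = sine_samples(440, 1.0)
print(len(one_second))                   # how many samples make up one second
print(2 ** BIT_DEPTH)                    # how many distinct levels a sample can take
print(min(one_second), max(one_second))  # the extremes actually reached
```

Plotting the first hundred or so values of `one_second` as dots rather than a line is a quick way to show students that the waveform really is a series of discrete points.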

I have already lectured on amplitude, wavelength, and all that other physics stuff.

But now, they want me to explain the mp3 format.

I am no computer programmer, but the way I understand it, mp3s have two basic parameters: bitrate and sampling frequency. Why does an mp3 sound very close to CD quality despite its small size? As I understand it, the idea of mp3 is to take all the digital samples and keep the ones that represent the loudest frequency among the frequencies co-existing at a particular time? I checked out www.howstuffworks.com but found it inadequate, and I still have trouble understanding it myself.

Can anyone help?

Will Chen · October 12, 2005

I'm not sure exactly how it works, but my understanding is similar to the pic you posted. Think of it kinda as a frequency gate. If there is a bunch of activity (volume) around 1 kHz, the highs and lows are attenuated. Part of the loss is hidden by the masking of the louder frequency response of the mids, but as you compress more and more you will begin to hear bits drop as more of the masked frequencies are attenuated.

CrazyEdo · October 13, 2005

From articles I've read in the past, my understanding is that mp3 encoding performs 2 main operations:

1-It will compress data

2-It will remove data that seems to be less useful

Operation 1: Compress Data

--------------------------

Compressing data is the easiest operation to do. It can be compared to zipping a file so it gets smaller. There is no loss of data in this process.

Here is a naive way to look at compression. Let's say you have a text file you want to compress. In the text file, the word "recording" appears 20 times. The encoder will replace "recording" with the character % and write a note in the encoded file that % means "recording". Since the word is repeated often, the size of the file is going to be reduced. It is much more complex than that in reality, but the analogy helps in understanding it.

And since audio wave files share certain characteristics, there are some optimal ways to compress their data. But that alone is not enough to get a 12:1 ratio on a wave file.
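The % analogy can be tried out directly. The sketch below is only that toy substitution scheme in Python, not what zip or an mp3 encoder actually does; `compress` and `decompress` are made-up names, and the scheme assumes the token never appears in the original text.

```python
def compress(text, word, token="%"):
    """Replace every occurrence of `word` with `token` and record the mapping."""
    if token in text:
        raise ValueError("token already appears in the text")
    header = f"{token}={word}\n"      # the "note" saying what the token means
    return header + text.replace(word, token)

def decompress(packed):
    """Read the note back and undo the substitution."""
    header, body = packed.split("\n", 1)
    token, word = header.split("=", 1)
    return body.replace(token, word)

original = " ".join(["recording"] * 20)
packed = compress(original, "recording")

print(len(original), len(packed))      # size before and after
print(decompress(packed) == original)  # nothing was lost
```

The round trip returning exactly the original text is the point: this step is lossless, unlike the second operation.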

Operation 2: Remove useless data

--------------------------------

In audio data, there is a lot of stuff going on, across a lot of different frequencies. But due to the limitations of human hearing, not all of it can actually be heard, although all this information is kept in the wave file. Unlike the compression process previously discussed, there is some loss of audio data in this operation. From here, it gets a little more complex.

The mp3 encoder splits the audio track into a number of frequency bands, often 32 bands. At each small interval of time in the song, it checks the loudness of these individual bands and also their relevance relative to the other ones. If there is something loud happening in some frequency bands, the encoder is going to keep more bits of data for these. The other bands are going to be allowed far fewer bits, since their information is less relevant.

So here is another cheap analogy. If you encode a wave recording of footsteps during a thunderstorm, the footsteps will sound worse whenever loud thunder strikes, since the footsteps are a much less relevant sound at that moment. But the footsteps are going to sound just fine when there is no thunder.
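As a rough illustration of that "more bits for the loud bands" idea, here is a toy Python sketch. It is not the real mp3 algorithm: an actual encoder uses a psychoacoustic model to decide what is masked, and the proportional split, the four bands, and the names `allocate_bits` and `quantize_band` are all assumptions made for the example.

```python
def allocate_bits(band_loudness, total_bits):
    """Split a bit budget across bands in proportion to their loudness."""
    total = sum(band_loudness)
    if total == 0:
        return [0] * len(band_loudness)
    return [int(total_bits * loud / total) for loud in band_loudness]

def quantize_band(value, bits):
    """Round a value in [-1.0, 1.0] to a grid whose step halves per extra bit."""
    if bits == 0:
        return 0.0                    # no bits: the band is dropped entirely
    step = 2.0 / (2 ** bits)
    return round(value / step) * step

# Four hypothetical bands; band 1 is much louder than the rest.
loudness = [2, 12, 1, 1]
bits = allocate_bits(loudness, total_bits=32)
print(bits)

# The same quiet detail stored with many bits versus few bits.
detail = 0.3
print(quantize_band(detail, 8), quantize_band(detail, 2))
```

The value stored with 8 bits lands much closer to 0.3 than the one stored with 2 bits, which is the kind of coarsening the quieter bands are subjected to when something loud dominates.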

This is a complete oversimplification, but I thought it could help you or anybody else.

CrazyEdo

skunky_funk · October 17, 2005

Originally posted by wbcsound

I'm not sure exactly how it works, but my understanding is similar to the pic you posted. Think of it kinda as a frequency gate. If there is a bunch of activity (volume) around 1 kHz, the highs and lows are attenuated. Part of the loss is hidden by the masking of the louder frequency response of the mids, but as you compress more and more you will begin to hear bits drop as more of the masked frequencies are attenuated.

OK, so where does the difference in bitrates come in and why do higher bitrates sound more CD-like than the lower bitrates?

Will Chen · October 17, 2005

Originally posted by skunky_funk

OK, so where does the difference in bitrates come in and why do higher bitrates sound more CD-like than the lower bitrates?

sample frequency (44,100 Hz) * bit depth (16) * 2 (stereo) = bits per second (1,411,200 for CD quality)

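Reducing the bitrate means the encoder gets fewer bits per second than that to describe the same audio, so it has to compress harder; in effect you end up with less than the full sample frequency and bit depth of the original.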
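To put numbers on that, here is a small Python sketch of the formula above, compared against a few mp3 bitrates. The 128, 192 and 320 kbps values are just commonly seen settings picked for illustration, and `pcm_bitrate` is a made-up helper name.

```python
def pcm_bitrate(sample_rate_hz, bit_depth, channels):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bit_depth * channels

cd = pcm_bitrate(44_100, 16, 2)
print(cd)

for mp3_kbps in (128, 192, 320):
    mp3_bps = mp3_kbps * 1000
    ratio = cd / mp3_bps
    # Average bit budget per sample per channel at this bitrate.
    bits_per_sample = mp3_bps / (44_100 * 2)
    print(f"{mp3_kbps} kbps: about {ratio:.1f}:1, "
          f"{bits_per_sample:.2f} bits per sample on average")
```

Seeing that a low bitrate leaves well under 2 bits per sample on average, against the 16 that PCM uses, makes it clear why the encoder has to be selective about where it spends them, and why a higher bitrate sounds closer to the CD.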

For an in depth definition, check out this site
