Audio Processing and Filtering Concepts
Normalization and Amplification
These are two versions of the same basic function: both change the volume of the selection uniformly. For amplification, you pick how much the volume is increased. Some programs also allow negative values, which lower the volume; in either case, the same change is applied uniformly over the selected portion of the audio.
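As a rough illustration of uniform amplification, the sketch below scales every sample by the same factor. It assumes the audio is a list of floating-point samples between -1.0 and 1.0 and that the gain is given in decibels; the function name amplify and its parameters are made up for this example and do not come from any particular audio editor.

```python
def amplify(samples, gain_db):
    """Apply the same gain to every sample; a negative gain_db lowers the volume."""
    # Convert decibels to a linear amplitude factor (+20 dB is a factor of 10).
    factor = 10 ** (gain_db / 20.0)
    return [s * factor for s in samples]


if __name__ == "__main__":
    quiet = [0.05, -0.1, 0.02]
    print(amplify(quiet, 6.0))   # every sample roughly doubled
    print(amplify(quiet, -6.0))  # every sample roughly halved
```

Because the factor is the same for every sample, the shape of the waveform is unchanged; only its overall height differs.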
Normalization is a special type of amplification. The idea here is to uniformly alter the entire audio file such that the loudest peak in the file is at maximum volume (or else something slightly below that). That is, the entire file is amplified by the same amount to get the highest peak at maximum volume. (One can "normalize" a portion of a file, but that would be for some special effect.)
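Peak normalization can be sketched in a few lines: find the loudest peak, work out the single factor that brings it to the target, and apply that factor everywhere. This is only an illustration under the assumption that samples are floats between -1.0 and 1.0; the name normalize and the target_peak parameter are hypothetical, not taken from any editor.

```python
def normalize(samples, target_peak=1.0):
    """Scale the whole signal so that its loudest peak equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Pure silence: there is nothing to scale.
        return list(samples)
    factor = target_peak / peak
    return [s * factor for s in samples]


if __name__ == "__main__":
    print(normalize([0.25, -0.5, 0.1]))        # loudest peak becomes -1.0
    print(normalize([0.25, -0.5, 0.1], 0.9))   # peak slightly below maximum
```

Setting target_peak a little under 1.0 corresponds to normalizing to "something slightly below" maximum volume.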
Compressors and Limiters
N.B.: This type of compression is completely different from the compression applied to files to make them smaller, such as in the MP3 file format or in zipped archive files. Here, we are dealing with a method of processing the audio content.
Compressors
Whereas amplification, and the related normalization, act uniformly by changing the volume level of the entire selected portion of the audio by the same amount, a compressor acts differentially on various parts of the audio. The purpose of a compressor is to reduce the dynamic range (the difference between the highs and lows) by reducing the volume level of those parts of the audio which exceed some threshold value. A good compressor lets you control both the threshold and the degree of reduction (referred to as the "compression ratio", with a higher ratio applying stronger compression). Below the threshold, the amplitude (i.e., volume level) is not changed. But above it, the amplitude is reduced according to the ratio that's set.
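The threshold and ratio behaviour can be sketched as follows. This is a deliberately simplified, sample-by-sample model: it assumes float samples between -1.0 and 1.0, measures level in decibels relative to full scale, and ignores the attack and release smoothing that real compressors use. The function name compress and the default values are assumptions chosen for illustration only.

```python
import math


def compress(samples, threshold_db=-12.0, ratio=4.0):
    """Reduce levels above threshold_db by `ratio`; leave quieter samples alone."""
    out = []
    for s in samples:
        if s == 0.0:
            out.append(s)
            continue
        level_db = 20 * math.log10(abs(s))
        if level_db > threshold_db:
            # Only the part of the level above the threshold is divided by the ratio.
            level_db = threshold_db + (level_db - threshold_db) / ratio
            s = math.copysign(10 ** (level_db / 20), s)
        out.append(s)
    return out


if __name__ == "__main__":
    # 0.1 is about -20 dB (below threshold), 1.0 is 0 dB (above threshold).
    print(compress([0.1, 1.0, -1.0]))
```

With these settings a full-scale sample, 12 dB above the threshold, ends up only 3 dB above it, while the quiet sample passes through untouched.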
See the Examples section below for screen shots of two sets of before and after wave forms.
Limiters
A limiter is a special type of compressor. The goal of a limiter is to prevent the amplitude from exceeding the threshold; it does this by having a very high compression ratio.
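To see why a very high ratio acts as a limit, consider only the level calculation, with levels expressed in decibels. The helper below is a hypothetical sketch, not any program's actual implementation; it shows the output level creeping toward the threshold as the ratio grows.

```python
def compressed_level_db(level_db, threshold_db, ratio):
    """Return the output level for a given input level, threshold, and ratio."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


if __name__ == "__main__":
    # An input peak 6 dB above a -6 dB threshold, at increasing ratios.
    for ratio in (2, 4, 20, 100):
        print(ratio, compressed_level_db(0.0, -6.0, ratio))
```

At a ratio of 100, a peak that was 6 dB over the threshold comes out only 0.06 dB over it, which for practical purposes means the threshold is not exceeded.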
Silence Detection
One of the things proof-listeners look out for is long pauses. Since what qualifies as "too long" for a pause depends on the context, correcting pauses which are too long may not be amenable to automation.
Cool Edit, and likely Audition, has a Delete Silence tool: it can be set to look for gaps exceeding a certain amount and then trim those gaps to a specified length. For example, look for gaps longer than 5.4 seconds and shorten them to 2.3 seconds.
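The general idea of such a tool can be sketched as below. This is not Cool Edit's algorithm; it is a minimal illustration that assumes float samples, treats any sample quieter than an arbitrary silence_level as silence, and uses the 5.4 and 2.3 second figures from the example above as defaults. All names are hypothetical.

```python
def shorten_silences(samples, sample_rate, max_gap_s=5.4, new_gap_s=2.3,
                     silence_level=0.01):
    """Trim every silent gap longer than max_gap_s down to new_gap_s."""
    max_gap = round(max_gap_s * sample_rate)
    new_gap = round(new_gap_s * sample_rate)
    out = []
    run = []  # the current run of consecutive quiet samples

    def flush():
        # Keep short gaps whole; cut over-long gaps to the new length.
        out.extend(run[:new_gap] if len(run) > max_gap else run)
        run.clear()

    for s in samples:
        if abs(s) < silence_level:
            run.append(s)
        else:
            flush()
            out.append(s)
    flush()  # handle silence at the very end of the audio
    return out


if __name__ == "__main__":
    rate = 10  # an unrealistically low sample rate, to keep the demo small
    audio = [0.5] + [0.0] * 60 + [0.5] + [0.0] * 30 + [0.5]
    print(len(audio), len(shorten_silences(audio, rate)))
```

A fixed rule like this cannot judge context, which is the limitation noted above: a pause that is too long in one passage may be appropriate in another.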
It appears that Audacity does not have this functionality.
Examples of Applying a Compressor
Cool Edit 2000
A rather good tutorial of the application of concepts covered here is "Adobe Audition Tutorial for speech recordings", which also applies to Cool Edit: http://www.tagnet.org/aim/adobe-audition-tutorial.html
It is used by one LibriVox volunteer (the original author of this document) to post-process his recordings with Cool Edit 2000.
Example 1: Voice Characterizations
One benefit of the speech compressor is to compensate for the fact that this volunteer's male voice characterizations tend to be rather louder than his female voices (comments may be posted to the LibriVox knitting forum). The first step is to normalize the audio. It is believed this is necessary in order to bring the audio peaks within threshold range of the specific settings of the compressor. Then the speech compressor is applied. Since this reduces the dynamic range, the maximum volume level in the audio will now be noticeably below that which existed before compression. (The tutorial advises re-normalizing the file after compression, but the volunteer leaves the final volume adjustment to MP3Gain.)
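The normalize-then-compress sequence can be illustrated with made-up numbers. The sketch below is not the Cool Edit speech compressor and does not use the tutorial's settings; the sample values, the -12 dB threshold, and the 4:1 ratio are all assumptions, and the compressor is a simplified sample-by-sample model.

```python
import math


def normalize(samples, target_peak=1.0):
    """Scale the signal so its loudest peak equals target_peak."""
    peak = max(abs(s) for s in samples)
    return [s * target_peak / peak for s in samples]


def compress(samples, threshold_db=-12.0, ratio=4.0):
    """Divide the portion of each level above threshold_db by `ratio`."""
    out = []
    for s in samples:
        level_db = 20 * math.log10(abs(s)) if s else -math.inf
        if level_db > threshold_db:
            level_db = threshold_db + (level_db - threshold_db) / ratio
            s = math.copysign(10 ** (level_db / 20), s)
        out.append(s)
    return out


def peak_db(samples):
    """Loudest peak of the samples, in decibels relative to full scale."""
    return 20 * math.log10(max(abs(s) for s in samples))


# Hypothetical peaks: a loud characterization followed by a quiet one.
loud_voice = [0.8, -0.6, 0.7]
quiet_voice = [0.2, -0.24, 0.15]

normalized = normalize(loud_voice + quiet_voice)
compressed = compress(normalized)

if __name__ == "__main__":
    for name, audio in (("normalized", normalized), ("compressed", compressed)):
        gap = peak_db(audio[:3]) - peak_db(audio[3:])
        print(name, "peak:", round(peak_db(audio), 2), "dB,",
              "loud/quiet gap:", round(gap, 2), "dB")
```

In this toy case the gap between the two voices shrinks from roughly 10 dB to under 3 dB, and the overall peak falls from 0 dB to about -9 dB, which is why a final volume adjustment is made afterwards.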
Screen shot showing wide variation between voice characterizations:
The same audio sample after normalization and speech compression:
Example 2: Normal Voice Variation
This example is the first 32 seconds of a LibriVox recording by another volunteer (and used with her permission). For example, in the screen shots below, look at the portion of the wave just to the right of the vertical yellow dotted line. Here the reader says "The Adventures of Pinocchio". The syllables "Ad-Ven" are loud, then she drops her voice for "Pinocchio". As can be seen in the "after" image, the compressor evens that out.
Screen shot showing variations in normal reading:
The same audio sample after normalization and speech compression:
Another benefit of this speech compressor is that it helps tame pronounced plosives.
Audacity
An experiment with Audacity (March 2006) attempting to reproduce the speech compression process described above failed. It used a single 32-second-long audio sample. Audacity's normalization function did not seem to have much effect. Its compressor is adjustable for ratio but not threshold, and it didn't seem to do much to the sample, either.
However, a thread in the forums on this topic recommends a "great article that explains how to use compression in Audacity".
