AUDIO (WAV) FILES

 

A. Various WAV files used in class

Sampling rate: 11025Hz, 8-bits/sample.
"Your work is ingenious..."

Sampling rate: 11025Hz, 8-bits/sample.
"How soon can you land?..."

Sampling rate: 16000Hz, 16-bits/sample.
"Guitar note: Standard A above middle C (440Hz pitch) played at 5th fret, high E string"

Sampling rate: 16000Hz, 16-bits/sample.
"Short blues phrase"

Sampling rate: 16000Hz, 16-bits/sample.
"1999"

Sampling rate: 16000Hz, 16-bits/sample.
"Sur Vesdre"

Sampling rate: 44.1kHz, 16-bits/sample.
"Purify"

Sampling rate: 44.1kHz, 16-bits/sample.
"Lick"

DATA FOR MATLAB PROJECT II
Sampling rate: 44.1kHz, 16-bits/sample.

"allplusaaa" "aaanote"

B. Digital audio effects

The allpass reverb filter (Lesson 24) was implemented in MATLAB as follows:
[x,Fs,bits]=wavread('duet_dry'); % Input file: duet_dry
b=[-0.02 zeros(1,3000) 1]; a=[1 zeros(1,3000) -0.02]; % Allpass coefficients: 3001-sample delay, gain 0.02
y=filter(b,a,x);
sound(y,Fs)
wavwrite(y,Fs,bits,'duet_rvb'); % Output file: duet_rvb
Filter h(n) plot
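
As a quick check on the filter above, the following sketch confirms that these coefficients give an allpass response (unit magnitude at every frequency) and lists the first few echoes of h(n). It uses only base MATLAB functions. The names g, D, N and echoes are introduced here for illustration and do not appear in the course code.

```matlab
g = 0.02;                          % gain taken from the coefficient vectors above
D = 3001;                          % delay in samples (3000 zeros between the end taps)
b = [-g zeros(1,D-1) 1];           % numerator:   -g + z^(-D)
a = [1 zeros(1,D-1) -g];           % denominator:  1 - g*z^(-D)

% Magnitude response on an FFT grid (should be 1 everywhere for an allpass)
N = 2^14;
H = fft(b,N) ./ fft(a,N);
maxdev = max(abs(abs(H) - 1));     % largest deviation from unit magnitude

% Impulse response: echoes appear every D samples
h = filter(b, a, [1 zeros(1,3*D)]);
echoes = h([1, D+1, 2*D+1]);       % -g, 1-g^2, g*(1-g^2)
```

With g this small, nearly all of the energy sits in the single echo at n = D; larger values of g give a longer train of decaying echoes.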

These were produced by playing a guitar through a Korg G3 guitar effects pedal.
Chorus1 Chorus2
Flange Flange22
Reverb Reverb2
Here is another example of the chorusing effect: Snowmen Chorus

These reverb effects were produced using a Digitech RP350 guitar effects pedal. Use them as a reference for Matlab Project I.
No Reverb
Room Reverb
Concert Hall Reverb


C. Speech synthesis examples

This is the original speech from the movie Amadeus. Sampling rate: 11025Hz, 8-bits/sample. Hence data rate = 88kbits/s.
Original 8-bit PCM data: "Your work is ingenious..."

Next, each data frame is synthesized using an IIR filter whose coefficients were computed from the speech data. The filter is excited using an impulse train with constant spacing between impulses, simulating speech at a constant pitch. (Robotic sounding.)
Result of linear predictive coding with impulses giving completely voiced speech at a constant pitch period

Here, white noise is used as the excitation. The whisper-like synthetic speech is perfectly intelligible.
Result of linear predictive coding with white noise giving completely unvoiced speech
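
The two excitation choices above can be sketched in a few lines. This is a minimal illustration, not the course's LPC code: it estimates an all-pole filter for one frame with the autocorrelation method (solved directly rather than with the toolbox function LPC), then drives that filter with either an impulse train or white noise. The frame is a synthetic stand-in generated from a made-up filter a_true, and the order p, frame length N and pitch period P are assumed values.

```matlab
Fs = 11025;                         % sampling rate of the speech examples
p  = 2;                             % predictor order (assumed; typically higher for real speech)
N  = 400;                           % frame length in samples (assumed)
a_true = [1 -1.3 0.8];              % made-up stable all-pole filter, stand-in for a vocal tract
frame  = filter(1, a_true, [1 zeros(1,N-1)]);   % synthetic "speech" frame

% Autocorrelation lags 0..p of the frame
r = zeros(p+1,1);
for k = 0:p
    r(k+1) = sum(frame(1:end-k) .* frame(1+k:end));
end

% Solve the normal (Yule-Walker) equations for the predictor coefficients
R = toeplitz(r(1:p));
a_est = [1; -(R \ r(2:p+1))].';     % IIR denominator coefficients

% Voiced synthesis: impulse train with a constant pitch period
P = round(Fs/100);                  % about 100 Hz pitch (assumed)
e_voiced = zeros(1,N);
e_voiced(1:P:end) = 1;
voiced = filter(1, a_est, e_voiced);

% Unvoiced synthesis: white-noise excitation through the same filter
unvoiced = filter(1, a_est, randn(1,N));
```

Because the synthetic frame really is the output of an all-pole filter, a_est should match a_true closely; with real speech the fit is only approximate.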

Here we have LPC as it was intended, with mixed excitation and pitch resulting from software that makes voicing decisions and detects the pitch period of voiced frames. Compressed data rate ~4kbits/s.
Result of linear predictive coding with voicing decisions and variable pitch

Here the LPC code that was used to synthesize the Amadeus speech is applied to speech from Sean Connery in The Hunt for Red October. The quality is poor, even though both speech signals use the same sampling rate of 11025Hz. The two actors' voices differ considerably in pitch, which likely affects the results. The challenge is to write LPC code that is less data dependent.
Second input signal
Result of linear predictive coding with voicing decisions and variable pitch, applied to the second input signal

D. The sound of aliasing

This is the original speech from the movie Amadeus. Sampling rate: 11025Hz, 8-bits/sample. Hence data rate = 88kbits/s.
Original 8-bit PCM data: "Your work is ingenious..."

After downsampling by a factor of 2. (Shown in Lesson 19 as a box with a downward arrow followed by the number 2.) After discarding every other sample, the new sampling rate is 11025/2 = 5512.5Hz. Notice how aliasing distorts the high frequencies, notably the "s" sounds. The MATLAB code was:
[x,fs,bits]=wavread('cutafew');
y=upfirdn(x,[1],1,2);
wavwrite(y,fs/2,bits,'downsample_by_2');

Aliased after X2 downsampling
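
To see where the aliased energy lands, the sketch below downsamples a single test tone by 2 without any anti-aliasing filter and locates the spectral peak of the result. The 4000Hz tone is an assumed test signal, not taken from the speech file, and plain indexing is used in place of UPFIRDN; with a filter of [1] the two should keep the same samples.

```matlab
fs = 11025;                    % original sampling rate
f0 = 4000;                     % test tone above the new folding frequency fs/4 (assumed)
n  = 0:fs-1;                   % one second of samples
x  = cos(2*pi*f0*n/fs);

y   = x(1:2:end);              % discard every other sample, no lowpass filter first
fs2 = fs/2;                    % new sampling rate, 5512.5 Hz

% Locate the strongest component between 0 and fs2/2
N = numel(y);
Y = abs(fft(y));
[~,k]  = max(Y(1:floor(N/2)));
fpeak  = (k-1)*fs2/N;          % measured frequency of the peak
falias = fs2 - f0;             % predicted alias: 4000 Hz folds to 1512.5 Hz
```

The measured fpeak should agree with falias to within one FFT bin (about 1Hz here), which is why the high-frequency "s" sounds reappear at lower frequencies.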

After downsampling by a factor of 4. (Shown in Lesson 19 as a box with a downward arrow followed by the number 4.) After discarding 3 out of every 4 samples, the new sampling rate is 11025/4 = 2756.25Hz. The aliasing distortion is more pronounced.
[x,fs,bits]=wavread('cutafew');
y=upfirdn(x,[1],1,4);
wavwrite(y,fs/4,bits,'downsample_by_4');

Aliased after X4 downsampling

After downsampling by a factor of 8. (Shown in Lesson 19 as a box with a downward arrow followed by the number 8.) After discarding 7 out of every 8 samples, the new sampling rate is 11025/8 = 1378.125Hz. The intelligibility of the entire phrase is now severely compromised.
[x,fs,bits]=wavread('cutafew');
y=upfirdn(x,[1],1,8);
wavwrite(y,fs/8,bits,'downsample_by_8');

Aliased after X8 downsampling

Finally, it's fun to listen to the speech after downsampling, but played back at the original rate of 11025Hz. You will hear three consecutive renditions (with short pauses separating them) of data downsampled by 2, 4 and 8. Only the first is intelligible, in a chipmunk-like way. Perhaps hummingbirds can understand the last two!
The Chipmunks do Mozart

This example illustrates Lesson 3, page 4. Using a fixed sampling rate of 6kHz, the input CT sinusoid is swept in frequency from 0-6kHz (a "chirp" signal) during a 10s time window. No aliasing occurs from 0-3kHz. From 3-6kHz, we hear an alias frequency that is the reflection (or folding) of the input frequency about the "wall" that occurs at 3kHz (the folding frequency). As seen in the notes, this alias frequency swoops from 3kHz down to 0. The code that generated these WAV examples is:
function aliasplay(Fs,tmax,Fmax)
% function aliasplay(Fs,tmax,Fmax)
% Listen to aliasing as sinusoid freq is swept as a ramp function
% Fs   - sampling freq (Hz)
% tmax - length of sweep (secs)
% Fmax - height of ramp, i.e. final input freq (Hz)
T=1/Fs;
t=0:T:tmax;
freqin=(Fmax/tmax)*t; % Input freq ramp, used for the plot only
x=chirp(t,0,tmax,Fmax); % Chirp swept from 0 Hz at t=0 to Fmax at t=tmax
plot(t,freqin);xlabel('secs');ylabel('Hz');title('Input freq vs time');grid
wavwrite(x,Fs,16,'alias');
sound(x,Fs)

0-6kHz chirp signal sampled and played back at a sampling rate of 6kHz
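
The folding described above can be written as a one-line formula: the frequency heard is the distance from the input frequency to the nearest integer multiple of Fs. The sketch below evaluates it at a few points along a sweep; the variable names fin, fheard and spot are illustrative, not from the course notes.

```matlab
Fs  = 6000;                              % sampling rate (Hz)
fin = 0:500:12000;                       % input frequencies along the sweep (Hz)
fheard = abs(fin - Fs*round(fin/Fs));    % distance to the nearest multiple of Fs

% Spot values at 0, 3, 4.5, 6, 9 and 12 kHz: rows are [input heard] in Hz
idx  = [1 7 10 13 19 25];
spot = [fin(idx); fheard(idx)].';
```

Because fin runs to 12kHz here, fheard rises to 3kHz and falls back to 0 twice.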

This is the same as the previous example except that the input frequency is swept from 0-12kHz in 10s, so we hear the output frequency go up and down twice.
0-12kHz chirp signal sampled and played back at a sampling rate of 6kHz

E. The sound of quantization noise (few bits/sample)

This is the original speech from the movie Amadeus. Sampling rate: 11025Hz, 8-bits/sample. Hence data rate = 88kbits/s.
Original 8-bit PCM data: "Your work is ingenious..."

MATLAB writes WAV files with a minimum of 8-bits/sample, so I concocted the following code to quantize my data into 4 bits/sample (that's only 16 quantization levels).

[x,fs,bits]=wavread('cutafew');
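y=double(uencode(x,4)); % Quantize to 4 bits: integer codes 0..15
y=y/max(y); % Scale codes to 0..1 (assumes the top code occurs in the data)
y=2*(y-0.5); % Map back to the range -1..1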
d=x-y; % Quantization noise
plot(d)
sound(y,fs) % Listen to 4-bit speech
sound(d,fs) % Listen to quantization noise
wavwrite(y,fs,8,'4bit_speech');
wavwrite(d,fs,8,'4bit_quantization_noise');

The quantization noise is by no means totally random. Parts sound like wideband (~white) noise in the background of the signal. Between words, the noise has a structured (buzzing) quality as the quantizer cycles between levels adjacent to 0.
Speech quantized to 4 bits/sample

Listen to just the quantization noise. Clearly, using a white noise model for quantization noise is a fairly crude approximation.
4-bit quantization noise
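
For readers without UENCODE, a plain uniform quantizer can be sketched with base MATLAB functions. This is an assumed mid-rise design whose rounding convention may differ from UENCODE's, and the 200Hz sine is a synthetic stand-in for the speech data. It also estimates the signal-to-quantization-noise ratio, which the white-noise model predicts to improve by roughly 6dB per bit.

```matlab
fs   = 11025;
bits = 4;
t = (0:fs-1)/fs;                       % one second
x = 0.9*sin(2*pi*200*t);               % stand-in signal in the range -1..1

delta = 2/2^bits;                      % step size: 16 levels across -1..1
y = delta*(floor(x/delta) + 0.5);      % mid-rise quantizer: outputs at odd multiples of delta/2
y = max(min(y, 1 - delta/2), -1 + delta/2);   % clip to the 16 valid levels

d = x - y;                             % quantization noise
nlevels = numel(unique(y));            % number of levels actually used (at most 2^bits)
sqnr_dB = 10*log10(sum(x.^2)/sum(d.^2));      % signal-to-quantization-noise ratio
```

Since the test tone is periodic, d is periodic too, which is one simple way to hear that quantization noise is not truly white.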

F. Sounds of filtered white noise

All filters were created using the MATLAB function FIR1. Each filter contained 1001 coefficients, yielding a sharp cutoff at the band edge(s).

This is 0-4kHz white (flat power spectrum) noise, played at 8kHz sample rate, and 8-bits/sample. (No filtering.)
White noise

White noise filtered using a 0-2kHz bandwidth lowpass filter. Notice the perceptible 2kHz whistle. Sharp-cutoff audio filters are known to generate such "ringing" at the band-edge frequencies.
0-2kHz lowpass noise
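
A filter of this kind can be sketched without the toolbox as a Hamming-windowed sinc, which is in the same family as the default FIR1 design but is not guaranteed to match its coefficients exactly. The code below builds a 1001-tap 0-2kHz lowpass filter for an 8kHz sample rate, applies it to white noise, and samples the magnitude response on either side of the band edge. All variable names are illustrative.

```matlab
Fs = 8000;                             % sample rate (Hz)
fc = 2000;                             % cutoff frequency (Hz)
L  = 1001;                             % number of coefficients
M  = (L-1)/2;
n  = -M:M;

wc = 2*fc/Fs;                          % cutoff as a fraction of the Nyquist frequency
h  = sin(pi*wc*n) ./ (pi*n);           % ideal lowpass impulse response (sinc)
h(M+1) = wc;                           % fix the 0/0 sample at n = 0
w  = 0.54 - 0.46*cos(2*pi*(0:L-1)/(L-1));   % Hamming window
h  = h .* w;
h  = h / sum(h);                       % normalize to unity gain at DC

noise = randn(1, 2*Fs);                % two seconds of white noise
y = filter(h, 1, noise);               % lowpass-filtered noise

Hmag = abs(fft(h, Fs));                % 1 Hz bin spacing: bin k+1 is k Hz
gain_1k = Hmag(1001);                  % passband, expect about 1
gain_3k = Hmag(3001);                  % stopband, expect close to 0
```

Changing fc to 400 gives the narrower lowpass case; the bandpass and highpass filters need a different ideal response and are not covered by this sketch.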

White noise filtered using a 0-400Hz bandwidth lowpass filter. Good wind-sound synthesizer?
0-400Hz lowpass noise

White noise filtered using a 400Hz bandwidth bandpass filter centered on 2kHz (i.e. 1800-2200Hz). Notice the rather complex whistling with potential components at 2kHz as well as the band edge frequencies. As the noise bandwidth increases, this tonality becomes less obvious and the crackle/hiss quality grows.
1800-2200Hz narrowband noise

White noise filtered using a 2-4kHz highpass filter.
2-4kHz highpass noise

G. Filtering a fetal heartbeat signal

This original fetal doppler signal contains significant noise. (Fs=22050Hz, Nbits=16).
Fetal doppler
A length-1001 FIR filter was designed after examining the power spectrum of the original signal, using the code: P=psd(x); semilogy(P). Rather than telling you the type and bandwidth of the filter used, I suggest you download the original signal and try designing your own filter. (Hint: I used the MATLAB commands FIR1 and CONV.)

Filtered doppler Plot of spectra and filter used

H. Measuring vocal pitch

Excerpt from Mariah Carey singing the national anthem at the 2002 Super Bowl. (Fs=44100Hz, Nbits=16.)
Land of the free
Very short excerpt from her high note "...land of the FREE." Vocal pitch measured at 1838Hz, corresponding to a note of ~A#6 (1865Hz). The pitch measurement technique will be discussed in Lesson 9. See Autocorrelation plot.
FREEEEEE
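
One common way to measure pitch is to find the lag at which the autocorrelation peaks. The sketch below is a generic version of that idea, not necessarily the method from Lesson 9, applied to a synthetic 1838Hz tone rather than the recording. The lag search range and the 440Hz reference for A4 are assumptions.

```matlab
Fs = 44100;
f0 = 1838;                               % synthetic test tone (Hz)
n  = 0:4409;                             % 0.1 s of samples
x  = sin(2*pi*f0*n/Fs);

% Autocorrelation over candidate pitch periods (lags in samples)
lags = 12:40;                            % roughly 1100-3700 Hz (assumed search range)
r = zeros(size(lags));
for i = 1:numel(lags)
    k = lags(i);
    r(i) = sum(x(1:end-k) .* x(1+k:end));
end
[~,imax] = max(r);
period = lags(imax);                     % pitch period in samples
f_est  = Fs/period;                      % pitch estimate in Hz

% Nearest equal-tempered note, counted in semitones above A4 = 440 Hz
semis  = round(12*log2(f_est/440));      % 25 semitones = two octaves + 1 = A#6
f_note = 440*2^(semis/12);               % about 1865 Hz
```

On this tone the peak falls at a lag of 24 samples, i.e. 44100/24 = 1837.5Hz. The estimate is limited to integer lags, so its resolution is coarse at such high pitches.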

I. Sampler, compressor, expander (Lesson 19)

This is the original signal. Sampling rate: 11025Hz, 8-bits/sample.
"How soon can you land?..."
After "sampling" i.e. setting all odd numbered samples to zero. Playback rate: 11025Hz, 8-bits/sample.
Sampled
After "compression" i.e. discarding every other sample with no zero-insertion. Playback rate: 5512.5Hz, 8-bits/sample.
Compressed
After "expansion" i.e. inserting a zero between all samples. Playback rate: 22050Hz, 8-bits/sample.
Expanded
Power spectra computed from the above files (Compare the shape of these with the plots in the notes for Lesson 19.)
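
The three operations can be sketched on a short vector. The indexing assumes samples are numbered from n = 0, so the "odd numbered" samples are MATLAB indices 2, 4, 6, ...; the variable names are illustrative.

```matlab
x = 1:8;                       % toy signal, x(n) for n = 0..7

% "Sampling": zero the odd-numbered samples, keep the length and rate
xs = x;
xs(2:2:end) = 0;               % [1 0 3 0 5 0 7 0]

% "Compression": discard every other sample (rate halves)
xc = x(1:2:end);               % [1 3 5 7]

% "Expansion": insert a zero after every sample (rate doubles)
xe = zeros(1, 2*numel(x));
xe(1:2:end) = x;               % [1 0 2 0 3 0 4 0 5 0 6 0 7 0 8 0]
```

Note that compressing xs gives the same result as compressing x, since the samples that were zeroed are exactly the ones discarded.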

J. Simple sample-differencing filter & variable delay echo filter (Lesson 8)

See Lesson 8, page 5

Original audio signal. Sampling rate: 16000Hz, 16-bits/sample.
"1999"

LDE of adjacent-sample-differencing FIR filter: y(n) = 0.5x(n) - 0.5x(n-1). Listen to how the low frequencies have been reduced, and the high frequencies emphasized.
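
Why this filter favors high frequencies can be checked numerically: its magnitude response works out to |sin(w/2)|, which is 0 at DC and 1 at half the sampling rate. The sketch below compares the FFT of the coefficients with that closed form; the grid size N is arbitrary and the variable names are illustrative.

```matlab
b = [0.5 -0.5];                    % y(n) = 0.5x(n) - 0.5x(n-1)
N = 512;                           % number of frequency points (arbitrary)
w = 2*pi*(0:N-1)/N;                % digital frequency in rad/sample
H = fft(b, N);                     % frequency response on the grid
err = max(abs(abs(H) - abs(sin(w/2))));   % should be near machine precision

gain_dc      = abs(H(1));          % 0: constant inputs are removed
gain_nyquist = abs(H(N/2+1));      % 1: fastest alternation passes unchanged
```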
"Sample-difference filter output"

LDE of long-echo-FIR filter: y(n) = x(n) + 0.5x(n-DELAY). The DELAY (in samples) is equivalent to 250ms or 500ms. The 250ms echo could perhaps be used as a cheap digital audio effect? The 500ms echo is too long for musical applications.
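
Converting the echo time to a DELAY in samples is just a multiplication by the sampling rate: 250ms at 16000Hz is 4000 samples. A minimal sketch follows, applied to a unit impulse rather than the audio file so that the echo is easy to see. The variable names are illustrative.

```matlab
Fs = 16000;                          % sampling rate of the audio example
DELAY = round(0.250*Fs);             % 250 ms -> 4000 samples (0.500 gives 8000)
b = [1 zeros(1,DELAY-1) 0.5];        % y(n) = x(n) + 0.5x(n-DELAY)

x = [1 zeros(1,2*DELAY)];            % unit impulse as a test input
y = filter(b, 1, x);                 % direct sound at n = 0, echo at n = DELAY
```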
"250ms Echo Filter output"
"500ms Echo Filter output"

K. Aliasing of sinusoidal chirp signal (Lesson 39, p. 3)

Sampling rate: 8000Hz, 16-bits/sample. The frequency heard goes up, down, up and part-way down, as predicted by the graph in Lesson 39, p. 3. The Matlab code is:
Fs=8e3; T=1/Fs; t= 0:T:10;
x=chirp(t,0,10,14e3);
sound(x,Fs)

"Chirping"

Last Modified: April 24, 2009
