Somehow this didn’t make it to the blog after the lecture…
Classic “concrete” techniques
- With classic tape techniques, the only way to change the duration of a recorded sound is to change the speed of the tape, which also changes the pitch.
- The reverse also holds: if all you want is a pitch change, the duration changes with it.
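The coupling between tape speed, duration, and pitch can be put into numbers. The following is a minimal sketch that assumes an idealized tape machine; `tape_speed_change` is a hypothetical helper name, not part of any library.

```python
import math

def tape_speed_change(duration_s, freq_hz, speed_ratio):
    """Idealized tape playback: one speed ratio scales duration and pitch together."""
    new_duration = duration_s / speed_ratio      # faster tape -> shorter sound
    new_freq = freq_hz * speed_ratio             # faster tape -> higher pitch
    semitones = 12 * math.log2(speed_ratio)      # pitch change as a musical interval
    return new_duration, new_freq, semitones

# Double speed: a 10 s recording of A440 lasts 5 s and sounds one octave higher.
print(tape_speed_change(10.0, 440.0, 2.0))   # (5.0, 880.0, 12.0)
```

There is no speed ratio that changes only one of the two results, which is the limitation the computer techniques below address.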
Computer processing techniques
Software offers two different options for changing duration independent of pitch:
- Granular Synthesis, a process that slices (windows) time domain audio into very small (1 – 100 ms) segments, and
- Phase Vocoding, a process that converts time domain audio into frequency domain representations.
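As a rough illustration of the granular approach, the sketch below stretches a sound by reading short, enveloped grains from the input at one spacing and writing them to the output at another. The function name `granular_stretch`, the 50 ms grain length, the Hann envelope, and the half-grain overlap are all assumptions chosen for the example, not settings from the lecture.

```python
import numpy as np

def granular_stretch(x, sr, stretch, grain_ms=50.0):
    """Naive granular time stretch: stretch=2.0 gives roughly twice the duration."""
    grain = int(sr * grain_ms / 1000)      # grain length in samples
    hop_out = grain // 2                   # grains overlap by half in the output
    hop_in = hop_out / stretch             # input read position advances more slowly
    envelope = np.hanning(grain)           # fade each grain in and out
    n_grains = int((len(x) - grain) / hop_in) + 1
    out = np.zeros((n_grains - 1) * hop_out + grain)
    for i in range(n_grains):
        start = int(i * hop_in)
        # Each grain is copied unchanged, so its pitch is preserved.
        out[i * hop_out : i * hop_out + grain] += envelope * x[start : start + grain]
    return out

sr = 8000                                  # assumed sampling rate in Hz
tone = np.sin(2 * np.pi * 220 * np.arange(sr) / sr)   # one second of 220 Hz
stretched = granular_stretch(tone, sr, stretch=2.0)
print(len(tone), len(stretched))
```

Because grains are simply overlapped without aligning their phases, a version this simple can sound rough on sustained tones; it is only meant to show the slicing idea.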
Converting Domains
Any periodic signal can be represented as a sum of many simultaneous sine waves, each with its own frequency, amplitude, and phase.
Fourier Transform
- Converts a time-domain representation into a frequency domain representation
Inverse Fourier Transform
- Converts a frequency-domain representation into a time domain representation
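The round trip between the two domains can be tried out with NumPy's `np.fft.rfft` and `np.fft.irfft`. This is a small sketch; the sampling rate, signal length, and the two test frequencies are assumptions picked so that each frequency band is exactly 1 Hz wide.

```python
import numpy as np

sr = 1024                      # sampling rate in Hz (assumed for the example)
n = 1024                       # one second of audio, so each band is 1 Hz wide
t = np.arange(n) / sr

# Time domain: two simultaneous sine waves.
x = 1.0 * np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

# Fourier transform: time domain -> frequency domain.
spectrum = np.fft.rfft(x)
amplitudes = 2 * np.abs(spectrum) / n

# The two strongest bands, in ascending order of frequency.
peaks = np.sort(np.argsort(amplitudes)[-2:])
print(peaks, amplitudes[peaks])

# Inverse Fourier transform: frequency domain -> time domain.
rebuilt = np.fft.irfft(spectrum, n)
print(np.allclose(rebuilt, x))
```

The strongest bands sit at the two frequencies that were mixed, with their original amplitudes, and the inverse transform returns the original samples.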
Fast Fourier Transform (FFT)
- The FFT takes a slice of time (a window) that is n samples in length, where n = some-power-of-2.
- The number of samples in an FFT window = the number of frequency bands between 0 Hz and the Sampling Rate.
- Only half the bands are usable. (why?)
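One way to explore the question above is to look at the bands an FFT actually returns for a real-valued signal. The sketch below assumes a 44.1 kHz sampling rate and a 1024-sample window; the test signal is just seeded noise.

```python
import numpy as np

sr = 44100                     # sampling rate in Hz (assumed)
n = 1024                       # FFT window size, a power of 2

band_width = sr / n                        # spacing between bands in Hz
band_freqs = np.arange(n) * band_width     # n bands from 0 Hz up to just below sr

# Analyse one window of a real-valued test signal (noise, seeded for repeatability).
x = np.random.default_rng(0).standard_normal(n)
X = np.fft.fft(x)

# Compare bands 1 .. n/2-1 with bands n-1 .. n/2+1 (read in reverse order).
lower = X[1 : n // 2]
upper_reversed = X[-1 : n // 2 : -1]
is_mirror = np.allclose(lower, np.conj(upper_reversed))

print(band_width, band_freqs[n // 2], is_mirror)
```

For real input, every band above half the sampling rate is the complex conjugate of a band below it, so the upper half carries no new information.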
How Phase Vocoding Works
- Each FFT window represents a frame, or still picture, of analysis information (frequency domain content)
- Time compression or expansion changes the rate at which the frames are played back, that is, converted from the frequency domain back to the time domain by an inverse Fast Fourier Transform (iFFT).
  - Like changing the playback rate of film or video.
- Pitch Shifting is an independent process.
  - Multiply all frequency bands by a factor X (2 = octave up; 0.5 = octave down).
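The frame-rate idea above can be sketched as a bare-bones time stretcher. This is an illustrative outline under stated assumptions, not a production phase vocoder: the function name `pv_time_stretch`, the default FFT size and overlap count, and the linear interpolation of magnitudes between frames are all choices made for the example. Pitch shifting is left out.

```python
import numpy as np

def pv_time_stretch(x, time_scale, fft_size=1024, overlaps=4):
    """Stretch x in time by time_scale (2.0 = twice as long) without changing pitch."""
    hop = fft_size // overlaps
    window = np.hamming(fft_size)

    # Analysis: one FFT frame per hop, like the still pictures of a film.
    n_frames = 1 + (len(x) - fft_size) // hop
    frames = np.array([np.fft.rfft(window * x[i * hop : i * hop + fft_size])
                       for i in range(n_frames)])

    # Phase advance each band would make over one hop if it sat exactly on its centre.
    omega = 2 * np.pi * hop * np.arange(fft_size // 2 + 1) / fft_size

    # Fractional analysis-frame position to "play back" for each output frame.
    positions = np.arange(0, n_frames - 1, 1.0 / time_scale)
    out = np.zeros((len(positions) - 1) * hop + fft_size)
    phase = np.angle(frames[0])

    for k, pos in enumerate(positions):
        i = int(pos)
        frac = pos - i
        # Magnitudes: blend the two neighbouring analysis frames.
        mag = (1 - frac) * np.abs(frames[i]) + frac * np.abs(frames[i + 1])
        # Synthesis: iFFT of magnitude plus accumulated phase, then overlap-add.
        frame = np.fft.irfft(mag * np.exp(1j * phase), fft_size)
        out[k * hop : k * hop + fft_size] += window * frame
        # Measured phase advance between the two frames, wrapped to [-pi, pi].
        dphi = np.angle(frames[i + 1]) - np.angle(frames[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        phase += omega + dphi

    # Compensate for the gain of the overlapping analysis and synthesis windows.
    return out / (np.sum(window ** 2) / hop)

sr = 8000                                   # assumed sampling rate in Hz
tone = np.sin(2 * np.pi * 440 * np.arange(2 * sr) / sr)
slow = pv_time_stretch(tone, 2.0)
print(len(tone) / sr, len(slow) / sr)       # durations in seconds
```

The output frames are always written one hop apart, while the position read from the analysis frames advances by `1 / time_scale` frames at a time; accumulating the measured phase advance per band is what keeps each band at its original frequency.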
Phase Vocoding parameters
- FFT size (window size)
  - Determines number of frequency bands
  - Determines length of time per analysis window (FFT_Size / SR = length in seconds)
- Number of Overlaps
  - Determines onset of windows
  - Helps with time resolution
- Window type
  - Describes the amplitude envelope applied to each time window
  - Can affect accuracy of measurements
  - For now, you can stick to a Hamming window
- Time Scale (constant or function)
- Pitch Scale (constant or function)
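The arithmetic behind the first two parameters is easy to tabulate. In this sketch, `pv_window_stats` is a hypothetical helper, and it assumes that "number of overlaps" means evenly spaced windows, so that a new window starts every FFT_Size / Overlaps samples.

```python
def pv_window_stats(fft_size, sr, overlaps):
    """Derived quantities for one choice of FFT size, sampling rate, and overlap count."""
    hop = fft_size // overlaps                 # samples between window onsets
    return {
        "bands": fft_size,                     # bands between 0 Hz and the sampling rate
        "band_width_hz": sr / fft_size,        # spacing between neighbouring bands
        "window_seconds": fft_size / sr,       # length of one analysis window
        "hop_samples": hop,
        "hop_seconds": hop / sr,               # time between window onsets
    }

# Compare a few FFT sizes at 44.1 kHz with 4 overlaps.
for size in (256, 1024, 4096):
    print(size, pv_window_stats(size, sr=44100, overlaps=4))
```

Note that band width in Hz multiplied by window length in seconds always comes out to 1: making the bands narrower necessarily makes each window longer.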
Problems with the Phase Vocoder
- Frequency/Time trade-off – the Uncertainty principle
  - The more accurately you measure one parameter, the less accurately you can measure the other.
  - Larger FFT size provides more frequency bands, but less information about start time of events, and vice versa.
- Frequency bands are linearly spaced, but our perception of pitch is logarithmic.
- Fourier Transform theory assumes a periodic signal.
  - Periodic signals have no beginning or end (they extend infinitely in both directions).
  - Implied in this assumption (as it relates to the FFT) is that a signal begins its period at the beginning of an analysis window, and that the end of the analysis window is a period end point of the signal. Since this is unlikely to happen, windowing tapers the ends of each analysis window to reduce the resulting errors.
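The effect of a period not lining up with the analysis window, and what a window envelope does about it, can be measured directly. The sketch below is an illustration under assumed settings: `leakage_db` is a hypothetical helper that reports the strongest level found more than 20 bands away from the peak, relative to the peak.

```python
import numpy as np

n = 1024
samples = np.arange(n)
aligned = np.sin(2 * np.pi * 100.0 * samples / n)     # exactly 100 periods per window
misaligned = np.sin(2 * np.pi * 100.5 * samples / n)  # window cuts the last period in half

def leakage_db(frame):
    """Strongest level (dB relative to the peak) more than 20 bands from the peak."""
    mag = np.abs(np.fft.rfft(frame))
    mag /= mag.max()
    peak = mag.argmax()
    far = np.concatenate([mag[: peak - 20], mag[peak + 21 :]])
    return 20 * np.log10(far.max() + 1e-12)    # small offset avoids log of zero

print("aligned, no envelope:     ", leakage_db(aligned))
print("misaligned, no envelope:  ", leakage_db(misaligned))
print("misaligned, Hamming:      ", leakage_db(misaligned * np.hamming(n)))
```

When the window holds a whole number of periods, almost no energy shows up in distant bands. When it does not, energy spreads across the spectrum, and applying a Hamming envelope before the FFT pulls that spread down considerably without removing it entirely.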