Category Archives: computerMusic2

composition1 computerMusic1 computerMusic2 computerMusic3 max musth625 sonicArts

(sonicArts) online storage

I’ve been pushing iLocker in class, an online storage solution offered to all of you from Ball State. (I won’t call it free, given what you pay in technology and student services fees, not to mention tuition.)


If you don’t have a good FTP program, or don’t know how to set one up on a computer that isn’t your own, iLocker is ugly to use. UGLY.

So I would recommend Dropbox, or Box, or some other free online storage service. Make sure you put the file in your public folder, and copy the link to give to me.


computerMusic1 computerMusic2 musth625 sonicArts

digital performer intro, part 2

part one of the digital performer intro is here

Importing Audio and the Soundbites Window

You can drag and drop audio from a Finder window into the Soundbites pane in Digital Performer. Audio imported into your project gets converted to the project audio format, and copied into the Audio Files folder within your project folder. Soundfiles are turned into soundbites by Digital Performer, and listed in the pane. The first column is the Move handle (MVE), the second displays the soundbite name (the filename is not changed), and the third column lists the soundbite duration. You can distinguish between mono and stereo files by the soundbite move handle: one tilde (~) is mono, and two tildes is stereo (indicated in the graphic below).



computerMusic1 computerMusic2 musth625 sonicArts

digital performer intro

Digital Performer and DAWs

Unlike stereo audio editors (Audacity, Peak, Audition), Digital Performer is an example of a Digital Audio Workstation (DAW). Digital Performer relies (mostly) on non-destructive processing and mixing. The program allows for multiple sounds to be used at once by reading from the multiple sound files, applying gain changes as indicated by mix commands, and applying processing through plugins. When you have completed a mix (or at any stage along the way), you “bounce” your project to stereo, which mixes and applies all processing to the individual tracks.

Since a DAW project has a more complex organization of files than a stereo audio editor, and a more complex set of preferences and setup, it is important to understand setup and file organization for a DAW project.

First Step – Starting a New Project

Launch Digital Performer. The default setting is to open a project or create a new project. Create a new project, and name it according to the assignment instructions (or whatever you want to call it). Creating a Digital Performer project creates a new folder, with a data file and audio files folder within. Other folders will get created as needed. For simplicity, you should plan on creating other folders to store your original and edited audio source files.

Studio Setup and Program Preferences

Once you have a project open, you should check your program preferences. There are a lot of sub-menus in the preferences window. Choose General | Audio Files. Here you set your file format for new projects, and the file format for the current project. My default choices are shown below. I recommend Broadcast WAVE, Interleaved (check box), and 16 Bit Integer sample format. The other important setting is in the Audio File Locations section. You should choose to “Always copy imported audio to project audio folder.” The other choice for processed files will be made for you.


assignments_cm2 computerMusic2 lectureNotes_cm2

(compMus2) The Final Project BOX

The box for turning in your audio CD of your final project is now on the table outside studio 9.



computerMusic2 lectureNotes_cm2

(compMus2) Spectral Quiz Review

For Wednesday’s (12/10) quiz over spectral processing, review the previous posts on intro to spectral processing, the Fourier transform, phase vocoding, and convolution.

Important Concepts

The quiz is not limited to the listed items below, but these concepts will go a long way towards helping you master the important material.

Intro to Spectral Processing

  • Audio domains (time, frequency)
  • Converting between time and frequency domain
  • Missing, or unspecified elements in each domain
  • The difference between time domain processes and frequency domain processes, with the ability to name some processes in each domain. (This last concept is drawn from all the posts, along with previous work we’ve done in class.)

The Fourier Transform

  • What the theorem states (i.e., the part about any periodic signal could be represented….)
  • The different implementations of the FT (FT, DFT, STFT, FFT), and how these implementations relate to each other
  • The FFT, specifically relating to the computational benefits of using an FFT size that is a power of 2
  • The Uncertainty Principle as it applies to the FT
  • FFT parameters (size, window type, bin, frame, overlaps, hop size)
  • The relationship of the FFT size to the number of frequency bands being analyzed
  • Problems with the FFT (FT): periodic, spacing of bands, time/frequency trade-off

Phase Vocoding

  • What audio manipulations/processes can be accomplished with phase vocoding
  • How time compression/expansion works with phase vocoding (also, be able to compare this to granular synthesis)
  • How pitch shifting works with PV.
  • How you overcome (to some degree) the time/frequency trade-off


Convolution

  • Convolution as a fundamental process in digital audio processing
  • The musical uses of convolution
  • Be able to describe in words the basic process of convolution
  • Implementation of convolution (using spectral processing)
  • The Law of Convolution, and its usefulness for implementing convolution
  • Understanding how convolution works to filter signals and to apply reverberation.
computerMusic2 lectureNotes_cm2

(compMus2) Convolution

Convolution is a fundamental process in digital audio processing. Even if you do not specifically know that the process is happening, you know the effects of the process. Filtering, reverberation, and cross synthesis all illustrate convolution. For example, a filter convolves its impulse response (IR) with the input signal to produce filtered output. Sampling reverbs convolve impulse responses of physical spaces with input signals to produce the effect of playing a sound in the physical space.

Musical Uses of Convolution: Reverberation and Filtering/Cross Synthesis

Convolution can be used to simulate an arbitrary signal being played back in a specific physical space by sampling the impulse response of the space and convolving that IR with the signal. Sampling a room requires a source containing (preferably) all frequencies. Often a balloon pop or a starter pistol is used; the best method is a quick sine tone sweep across the audible range. Convolution can also be used to filter signals, either for cross synthesis purposes, or to simulate the characteristics of an audio system, such as a microphone or guitar amp.

The Math of Convolution

For fun, the equation:


y(n) = Σ_m a(m) · b(n − m)

For every sample in one signal a (the arbitrary signal), multiply it by every sample in the IR b, and sum the results, offset by the position of each sample in a. The length of the output will equal the combined length of a and b, minus one sample. Convolution is not multiplication. Multiplication in the time domain is amplitude modulation (for each sample, multiply one sample from a by one sample from b). Convolution of two audio signals is a series of multiplications and a summation of those results: each sample in one signal is multiplied by the entire set of samples in the second signal, offset by the location of the sample doing the multiplying.
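The multiply-offset-and-sum process described above can be sketched directly in plain Python. This is an illustrative sketch, not production DSP code (the nested loop is exactly the expensive time-domain form discussed below):

```python
def convolve(a, b):
    """Direct time-domain convolution of two lists of samples.

    Each sample of a scales a full copy of b, shifted to that sample's
    position; all the scaled copies are summed into the output.
    """
    out = [0.0] * (len(a) + len(b) - 1)  # output length: len(a) + len(b) - 1
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# A unit impulse leaves the other signal unchanged (the identity of convolution):
print(convolve([1.0, 0.0], [0.5, 0.25, 0.125]))  # → [0.5, 0.25, 0.125, 0.0]
```

Note that the output really is one sample shorter than the two input lengths added together.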

Implementation of Convolution

Implementing convolution in the time domain is very computationally expensive, and not practical as a process. To implement convolution as a digital signal process we rely on the Law of Convolution. The Law of Convolution states that convolution in the time domain is equal to multiplication in the frequency domain, and vice versa. Both signals are converted to the frequency domain via an FFT, and their resulting frequency spectra are multiplied.

Understanding convolution as multiplication of the frequency spectra is the easiest way to understand how convolution can be used to filter a signal. Shared frequency content between two signals will be resynthesized, but any frequency not found in both signals will be silenced (multiplying any number by 0 equals 0). Understanding convolution in the time domain is the easiest way to understand how convolution works as reverberation, since each sample in one signal will be scaled and repeated for every sample in the other signal. The result of this operation is time smearing. It should be noted that however you understand convolution, the process is acting as both a filter and a time smearing operation. Therefore, if your purpose is weighted towards filtering, your impulse should be very short. If you wish to simulate reverb, your impulse should be of a duration that matches typical reverb times (0.8 seconds and above, with a decaying amplitude envelope).
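The frequency-domain implementation can be sketched in a few lines with NumPy (an assumption of this example; the signals here are toy arrays, not real audio). Both signals are zero-padded to the full output length so the FFT product matches ordinary linear convolution:

```python
import numpy as np

def fft_convolve(a, b):
    """Convolution via the Law of Convolution: multiply the two spectra.

    Zero-padding both signals to len(a) + len(b) - 1 makes the circular
    FFT product equal to ordinary (linear) time-domain convolution.
    """
    n = len(a) + len(b) - 1
    spectrum = np.fft.rfft(a, n) * np.fft.rfft(b, n)  # multiply in the frequency domain
    return np.fft.irfft(spectrum, n)                  # inverse FFT back to time domain

dry = np.array([1.0, 0.5, 0.25])        # toy "arbitrary" input signal
ir  = np.array([1.0, 0.0, 0.0, 0.3])    # toy impulse response with one echo
wet = fft_convolve(dry, ir)
print(np.round(wet, 3))                  # matches np.convolve(dry, ir)
```

For long impulse responses (sampled rooms run into the tens of thousands of samples), this route is dramatically cheaper than the nested-loop version.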

computerMusic2 lectureNotes_cm2

(compMus2) Phase Vocoding

Phase Vocoding allows for independent control of time duration and pitch. 

Time Expansion/Compression with Phase Vocoding

The conversion of an audio signal from the time domain to the frequency domain results in a series of frames containing bins of frequency and amplitude information. If you conceive of the FFT as producing a snapshot, a frozen picture of frequency/amplitude information for a short segment of time, then it is easy to understand time expansion/compression as similar to changing the frame rate of video playback. Individual pictures (the analysis frames) are not changed, only their rate of playback. 

Consider a simple math example. With an FFT size of 512 samples, each analysis segment lasts for approximately 11 ms. In the frequency domain, this 11 ms analysis segment represents one frame of frequency/amplitude bins. If during resynthesis (the inverse FFT) each frame is resynthesized at a rate of 11 ms per frame then the output signal is the same duration as the input signal. If the rate of frame resynthesis changes to 22 ms per frame (twice the original analysis duration), then the output signal will be twice as long as the original. If the rate changes to 44 ms per frame, then the output signal expands to four times the original length. This method of time expansion/contraction is completely analogous to slow motion (or fast motion) video. You are not adding more frames to the video playback when you slow down/expand time (like you would with granular synthesis); you are simply changing the playback rate of the frames you have already recorded/analyzed. 
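The arithmetic in the example above is simple enough to sketch directly (assuming a 44.1 kHz sample rate, as in the example):

```python
# Resynthesis frame rate vs. output duration for phase-vocoder time stretching.
SR = 44100
fft_size = 512
analysis_ms = fft_size / SR * 1000            # ~11.6 ms of audio per analysis frame

for stretch in (1, 2, 4):                     # resynthesis rate multiplier
    frame_ms = analysis_ms * stretch
    print(f"{frame_ms:5.1f} ms per frame -> output is {stretch}x the input length")
```

The frames themselves are untouched; only the rate at which they are resynthesized changes.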

Pitch Shifting with the Phase Vocoder

Phase vocoding shifts pitch through simple multiplication. You multiply the frequency information in all bins of every frame by the same transposition factor. Multiplying by two transposes the output resynthesis up one octave; multiplying by 0.5 transposes the output down one octave. If you are looking for precise semitone transposition you will need to calculate 2 to the x/12 power, where x equals the number of semitones of transposition.
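The semitone formula works out like this (a small sketch; the function name is just for illustration):

```python
def transposition_factor(semitones):
    """Frequency multiplier for a transposition of x semitones: 2 ** (x / 12)."""
    return 2 ** (semitones / 12)

print(transposition_factor(12))            # one octave up -> 2.0
print(transposition_factor(-12))           # one octave down -> 0.5
print(round(transposition_factor(7), 4))   # perfect fifth up -> 1.4983
```

Negative semitone values transpose downward, since 2 raised to a negative power gives a factor below 1.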

computerMusic2 lectureNotes_cm2

(compMus2) The Fourier Transform


In 1822, Jean Baptiste Joseph, Baron de Fourier developed the theorem that any periodic signal could be represented as the sum of individual sine waves. The number of sine waves needed could be infinite, and each sine wave would have its own frequency, amplitude, and initial phase. The process of calculating the frequencies present in a signal is called the Fourier Transform. As mentioned in my previous post, using the Fourier transform converts a time domain audio signal into a frequency domain representation.

This brief definition of the theorem gives us our first problem with the transform. The transform works on periodic signals. In fact, it assumes that the signal being transformed is periodic. Periodic signals not only repeat at regular intervals, they are infinite, which implies that the signals have no beginning or end. Setting aside the practical considerations of a signal that lasts forever, and has always existed, a regularly repeating signal doesn’t change frequency! 

A related problem is that the FT has no way of knowing when a particular frequency starts within the analysis time segment. If a frequency appears at all during the analysis segment, it is calculated as being present for the whole segment. If you were to apply an FT to the entire Rite of Spring, and then resynthesize the results, you would hear a single (very complex) chord from start to finish.

Means of Calculating the Fourier Transform

The earliest forms of FT calculation were done by hand. Mechanical springs gave way to analog filters, and finally, to computer analysis. Since any computer operation involves a discrete series of values (rather than continuous analog time), computer FTs are Discrete Fourier Transforms (DFT).

Since the FT itself cannot distinguish the start time of a given frequency within the analysis segment, FTs are usually applied to very small time segments, in a series. This process of analysis is called the Short-Time Fourier Transform (STFT). The STFT is not necessarily a digital process. However, all DFTs use the STFT.

The time segment used for calculation is taken by applying an amplitude window. This window, a very short amplitude envelope, is the same as what is used for granular synthesis. The window generally has tapered ends to eliminate the discontinuity between the end of the signal and its beginning (since the FT assumes that the signal is periodic). 

The Fast Fourier Transform (FFT)

Even using a computer, a DFT requires an enormous amount of computation and is not practical to use. The discovery of a mathematical trick finally made the DFT a usable process. It was discovered that if the number of samples in your STFT window were a power of 2, you could greatly reduce the number of calculations needed to perform the analysis. Hence, the Fast Fourier Transform (FFT) was developed. 

In the FFT, the size of the window in samples is the FFT size. The FFT size is equal to the number of analysis frequency bands evenly spaced between 0 Hz and the sampling rate. You can calculate the frequency band spacing by taking the SR and dividing it by the FFT size. For example, with a SR of 44,100 Hz, an FFT size of 512 gives you a frequency band spacing of (44,100 / 512) = 86 Hz (approximately). If you used 1024 samples in your FFT, the frequency spacing would be about 43 Hz.

Given that we perceive pitch in an exponential relationship to frequency, the linear nature of the FFT presents a problem. Generally, this problem is compensated for by using a larger FFT size, which reduces the band spacing.  Using 2048 samples yields a band spacing of about 21.5 Hz; 8192 provides a roughly 5 Hz spacing between analysis bands. 
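The band-spacing figures above are just SR divided by the FFT size, which you can check in a couple of lines (assuming the 44.1 kHz sample rate used throughout):

```python
SR = 44100
for fft_size in (512, 1024, 2048, 8192):
    spacing = SR / fft_size   # Hz between adjacent analysis bands
    print(f"FFT size {fft_size:5d}: ~{spacing:.1f} Hz band spacing")
```

Notice the spacing is the same number of Hz everywhere in the spectrum: 86 Hz between bands is more than an octave at the bottom of our hearing range, but a tiny fraction of a semitone at the top.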

The Uncertainty Principle

While it would appear to be preferable to use as large an FFT size as possible for better frequency resolution, such an assumption is not always correct. With the FFT there is a tradeoff between time resolution and frequency resolution, similar to Heisenberg’s Uncertainty Principle. Heisenberg found that the more you looked for the velocity of an object, the less you knew about its position, and vice versa. For the FFT, the more you look for frequency, the less you know about time. This uncertainty arises because the Fourier Transform cannot distinguish between a frequency that appears at the beginning of a transform window and one that appears halfway into the transform window (or any other time within the window). Any frequency appearing at any time within the window is analyzed as being present for the entire window. Since you add samples to the window to increase frequency resolution, you are also adding a greater period of time that is being analyzed, and consequently lowering the time resolution of the analysis.

For example, a 512-sample FFT window lasts approximately 12 ms (size/SR). Within that 12 ms window we lack time knowledge of events. If you double the window to 1024 samples, you double the time segment to approximately 24 ms. Each doubling of the window doubles the length of time for analysis, and halves our time resolution. At 4096 samples, our time resolution is reduced to approximately 93 ms, which is quite noticeable. 

To work around this uncertainty problem you typically use overlapping analysis windows. However, overlapping windows can add an echo-type effect to the resynthesis, and will thicken the sound.

FFT Parameters

  • FFT Size: The size of the analysis window, in samples. For the FFT, the size must be a power of 2. The size of the FFT will equal the number of frequency analysis bands, evenly spaced from 0 Hz to the Sampling Rate at multiples of SR/FFTsize. Half of the bands (up to the Nyquist Frequency) are usable.
  • Window Type: The short-time amplitude envelope applied to the segment of audio being analyzed by the FFT. In general, bell-shaped envelopes are best for analysis.
  • Bin: For one analysis segment, each frequency band being analyzed and its corresponding amplitude are represented together as a pair of numbers. This pair of numbers is a bin. Since the FFT size equals the number of analysis frequency bands, the FFT size will also equal the number of bins.
  • Frame: The collection of bins for one analysis segment. If the FFT size is 512, then there are 512 frequency bands being analyzed, and consequently, 512 bins in the frame. The frame corresponds to the audio segment being analyzed at any given point in time. For purposes of understanding time manipulation via phase vocoding, you can also think of the frame as the frequency snapshot of an analysis window.
  • Overlaps: the number of overlapping analysis windows applied to the input signal. More overlaps can provide greater time detail.
  • Hop Size: the distance between the start of overlapping analysis windows. This hop size, or skip, is usually determined by spacing overlapping windows evenly at a distance of 1/#_overlaps times the FFT size.
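The hop-size rule in the last bullet is easy to sketch (a small illustration, using a 1024-sample FFT as an example):

```python
# Hop size = FFT size / number of overlaps, for evenly spaced windows.
fft_size = 1024
for overlaps in (1, 2, 4, 8):
    hop = fft_size // overlaps
    print(f"{overlaps} overlap(s): windows start every {hop} samples")
```

With 4 overlaps, for instance, a new 1024-sample window begins every 256 samples, so every input sample is covered by four windows.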

FFT Problems (applies to all versions of the FT)

  • The FFT assumes the input is periodic, which implies infinity. Infinitely periodic signals don’t change pitch.
  • The spacing of frequency analysis bands is linear, while our perception of pitch is exponential.
  • The Uncertainty Principle applies to measurements of frequency and time. Larger FFT sizes give better frequency resolution, but worsen the time resolution, and vice versa. The FFT cannot distinguish start times of frequency components within a window.
computerMusic2 lectureNotes_cm2

(compMus2) Spectral Processing Intro

Audio Domains

Up until this point, we’ve been talking about audio processing and synthesis in the time domain. Spectral processing takes place in the frequency domain. In the time domain, we represent sound as changing amplitude (y value) over time (x value). In the frequency domain, sound is represented as changing amplitude (y value) over frequency (x value). Two things are worth pointing out here. One, the property of the x axis is your domain; and two, neither domain represents both time and frequency. If you’re representing time you know nothing about frequency. Likewise, if you’re representing frequency you don’t have any time information.

Converting Domains

To convert from a time domain representation of sound to a frequency domain representation, you use a process called the Fourier Transform. To reverse the process and convert from the frequency domain to the time domain you use an Inverse Fourier Transform.
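A minimal round trip between the two domains can be sketched with NumPy (an assumption of this example; the input is a 440 Hz test sine at 44.1 kHz):

```python
import numpy as np

SR = 44100
t = np.arange(512) / SR                         # 512 samples of time
signal = np.sin(2 * np.pi * 440 * t)            # time domain: amplitude over time

spectrum = np.fft.rfft(signal)                  # Fourier transform -> frequency domain
restored = np.fft.irfft(spectrum, len(signal))  # inverse FT -> back to time domain

# The loudest analysis band sits near 440 Hz (bands are spaced SR/size ≈ 86 Hz apart).
peak_bin = int(np.argmax(np.abs(spectrum)))
print(peak_bin * SR / len(signal))              # nearest band to 440 Hz
```

The restored signal matches the original (to floating-point precision), which is what makes resynthesis after spectral processing possible.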

computerMusic2 lectureNotes_cm2

(compMus2) Granular Synthesis Review


  • Any sound can be thought of as containing discrete particles/time segments (grains)
  • Duration of an individual grain is short – usually 1 ms to 100 ms.
  • Within an individual grain, sound parameters are fixed. Change occurs as you progress from grain to grain.

Parameters of Individual Grains

  • Playback speed
  • Index location (location in soundfile used to create grain)
  • (maximum) amplitude
  • Grain envelope
  • Duration
  • Panning

Parameters of Grain Combinations (Macro Controls)

  • Frequency of grains (grains per second)
  • Fixed or random rate of grain production
  • Density of grains (the number of grains happening at one time)
  • Number of grain streams (can be related to density)

Windows (Grain Envelopes)

  • A window is a short-time amplitude envelope.
  • The window shape can be chosen to emphasize legato connections between grains, discontinuity between grains, or anywhere in between.

Overlaps and Streams

  • A stream is the individual series of grains occurring one after another.
  • Multiple streams involve overlapping envelopes.
  • Overlapping envelopes generally produce a smoother amplitude output.
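A single windowed grain can be sketched with NumPy (an assumption of this example; the 50 ms duration and Hann window are illustrative choices within the ranges above):

```python
import numpy as np

SR = 44100
grain_ms = 50                           # grain duration, within the 1-100 ms range
n = int(SR * grain_ms / 1000)           # grain length in samples

t = np.arange(n) / SR
tone = np.sin(2 * np.pi * 440 * t)      # raw audio for one grain
window = np.hanning(n)                  # bell-shaped short-time amplitude envelope
grain = tone * window                   # tapered grain: no clicks at the edges

print(len(grain), round(float(grain[0]), 6))  # 2205 samples, silent at the edge
```

Because the window tapers to zero at both ends, successive (or overlapping) grains can be summed into a stream without discontinuities.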

High-Level (Macro) Organization

  • The number of parameters to control, and the number of grains per second, require some type of macro control.
  • Pitch-Synchronous organization analyzes the sound file ahead of time to set parameters so that a specified pitch will result. The parameter settings of individual grain parameters are linked. Kontakt Tone Machine uses pitch-synchronous organization.
  • Asynchronous organization means that all grain parameters are specified independently of each other. Control functions are usually specified to change parameters over time. 
  • Quasi-synchronous organization indicates that some, but not all, parameters are linked. It is the most common organization offered in the programs we use (Cecilia and Kontakt Time Machine). Most often, grain duration determines the frequency of grains, as grains are created in succession. This organization leads to a type of AM synthesis.