(max) spectral processing in max

Starting with version 5, Max introduced the pfft~ object, which greatly simplifies spectral (FFT-based) processing. I’ve uploaded a folder of patches, fftstuff.zip, that illustrates some basic spectral patches and processes.

The pfft~ object is like the poly~ object: it loads a spectral subpatch that performs the FFT/iFFT and the processing in between. The pfft~ object lets you set the FFT size and the number of overlaps, and it creates the appropriate number of instances, offset from each other by sample delays. These example patchers are very similar to the MSP tutorial on pfft~.

get out what you put in

The first patch to take a look at is spectralStuff2.maxpat. It is set up to provide looping playback from two buffers. The groove~ objects and controls are inside the p subpatches groovePlayer1 and groovePlayer2. Only one groove~/buffer~ runs through a spectral process, fftPassThru~. fftPassThru~ is the simplest of spectral patches: it contains only an fftin~ and an fftout~. The audio that comes out should be exactly the same as the audio that went in.

The pfft~ object specifies a frame size of 1024, with 4 overlaps. Inside fftPassThru~, you specify fftin~ inputs (with numbers corresponding to the pfft~ inlets). Audio coming into an fftin~ is converted from the time domain to the frequency domain. That conversion creates a stream of real and imaginary values for each frequency bin. Because Max handles these streams as audio-rate signals, the bin values are transmitted at the sampling rate, one bin per sample. An fftout~ performs the inverse FFT, converting the real and imaginary numbers back to an audio signal. You can also specify non-FFT inputs and outputs to send other types of data into and out of a pfft~ subpatch.
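
As a rough illustration of the pass-through idea, here is a minimal Python sketch of a forward transform followed by an inverse transform on a single short frame. It is not what pfft~ does internally: it ignores the windowing and overlap that pfft~ applies, it uses a slow textbook DFT rather than an FFT, and the function names dft and idft are my own.

```python
import cmath

def dft(samples):
    """Naive forward DFT: time-domain samples -> one complex value per bin."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(bins):
    """Naive inverse DFT: complex bins -> time-domain samples."""
    n = len(bins)
    return [
        sum(bins[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

# One tiny frame of audio (8 samples instead of 1024, to keep it readable).
frame = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]

# Each bin holds a real part (bin.real) and an imaginary part (bin.imag),
# analogous to the two signal outlets of fftin~.
bins = dft(frame)

# With no processing between the two transforms, the frame comes back unchanged
# (up to floating-point rounding).
restored = [value.real for value in idft(bins)]
```

Any spectral process goes between the dft and idft calls, which is where the objects between fftin~ and fftout~ sit in a pfft~ subpatch.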

simple convolution

spectralStuff3.maxpat shows a simple convolution process. Since convolution multiplies amplitudes of frequency bins together, you must have two audio signals playing to hear any output.

The convolve~ patch is still quite simple. Two channels of audio come in and are converted to frequency-domain signals. The real and imaginary numbers act as x and y, a Cartesian representation of each incoming bin value. In this patch, the Cartesian coordinates are converted to a polar representation (amplitude and phase angle) with cartopol~ objects. The polar signals are then multiplied, amplitude by amplitude and phase angle by phase angle. The results of the multiplications are converted back to x,y values (real and imaginary) with poltocar~ (polar to Cartesian) and then sent out the fftout~.
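
For comparison, the textbook frequency-domain form of convolution is a complex multiplication per bin, which in polar terms multiplies the amplitudes and adds the phase angles. The Python sketch below shows that variant for a single pair of bins. It is not a transcription of the patch, which multiplies phase by phase and so gives a different result. The functions cartopol and poltocar are plain Python stand-ins named after the Max objects, and multiply_bins_polar is a hypothetical helper.

```python
import cmath

def cartopol(real, imag):
    """Cartesian (real, imaginary) -> polar (amplitude, phase angle in radians)."""
    return cmath.polar(complex(real, imag))

def poltocar(amplitude, phase):
    """Polar (amplitude, phase angle) -> Cartesian (real, imaginary)."""
    z = cmath.rect(amplitude, phase)
    return z.real, z.imag

def multiply_bins_polar(bin_a, bin_b):
    """Multiply amplitudes and add phases: equivalent to the complex product."""
    amp_a, phase_a = cartopol(*bin_a)
    amp_b, phase_b = cartopol(*bin_b)
    return poltocar(amp_a * amp_b, phase_a + phase_b)

# Two example bins as (real, imaginary) pairs; the values are made up.
a = (0.5, 0.25)
b = (-0.3, 0.8)

real, imag = multiply_bins_polar(a, b)

# The same product computed directly in Cartesian form, for checking.
expected = complex(*a) * complex(*b)
```

If either input bin has zero amplitude, the product is zero, which is why both signals have to be playing before you hear anything.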

spectral noise gating

spectralStuffNoiseGating.maxpat demonstrates simple noise gating. Only frequency bins with an amplitude above the gate value will be resynthesized. noisegate~.maxpat has an example of both fftin~ and in~ inlets in the same subpatch.
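
The gating rule can be sketched in a few lines of Python. This assumes a hard threshold on the bin amplitude, with bins at or below the threshold zeroed; the patch may differ in details such as how it treats a bin exactly at the threshold. The name gate_bin and the example values are hypothetical.

```python
import math

def gate_bin(real, imag, threshold):
    """Pass a bin unchanged if its amplitude exceeds threshold, else zero it."""
    amplitude = math.hypot(real, imag)  # same amplitude cartopol~ would report
    if amplitude > threshold:
        return real, imag
    return 0.0, 0.0

# One made-up frame of (real, imaginary) bins: two loud bins, two quiet ones.
frame = [(0.8, 0.1), (0.01, -0.02), (0.0, 0.3), (0.005, 0.0)]

# Only the bins above the gate value survive to be resynthesized.
gated = [gate_bin(re, im, threshold=0.05) for re, im in frame]
```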

frequency crossover

spectralStuffFrequencyCrossover.maxpat illustrates routing output based on the frequency content of the input signal. The third outlet of an fftin~ object is the FFT bin index. You can multiply this bin index by the fundamental frequency of the FFT to get the frequency value for the bin. The fundamental frequency is the sampling rate divided by the FFT size, so you can find it by using dspstate~ to get the sampling rate and fftinfo~ to get the size of the FFT. Note that dspstate~ outputs information whenever audio processing is turned on, but fftinfo~ only outputs information when the subpatch is running inside a pfft~ object.
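
The bin-to-frequency arithmetic looks like this in Python. The 44100 Hz sampling rate is an assumption for the example (in the patch it comes from dspstate~), and 1024 is the frame size used earlier (in the patch it comes from fftinfo~). The function name bin_frequency is my own.

```python
def bin_frequency(bin_index, sample_rate, fft_size):
    """Frequency in Hz at the given FFT bin index."""
    fundamental = sample_rate / fft_size  # spacing between adjacent bins
    return bin_index * fundamental

sample_rate = 44100.0  # assumed; reported by dspstate~ in Max
fft_size = 1024        # reported by fftinfo~ inside a pfft~

fundamental = bin_frequency(1, sample_rate, fft_size)  # about 43.07 Hz
nyquist = bin_frequency(fft_size // 2, sample_rate, fft_size)
```

With these settings each bin is about 43 Hz wide, so a crossover cutoff can only land on multiples of roughly 43 Hz.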

The result of the frequency comparison is a 1 or 0, which controls the gate~ objects that route the audio signals. You have to add 1 to the comparison result, because 0 closes the gate entirely, while 1 or 2 routes the input to the corresponding outlet. Note that you don’t have to perform cartopol~ and poltocar~ conversions inside the pfft~ subpatch, as you are getting frequency information from the bin index and simply using that information to determine the output of each bin.
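
Here is a small Python sketch of that routing logic. It assumes the comparison is "greater than the cutoff" and that the first outlet carries the low band; the text does not say which way round the patch does it. The names gate_outlet and route_bin are hypothetical.

```python
def gate_outlet(bin_freq, cutoff):
    """Gate control value: 1 selects the first outlet, 2 the second."""
    comparison = 1 if bin_freq > cutoff else 0  # the 1-or-0 comparison result
    return comparison + 1                       # shift so 0 never closes the gate

def route_bin(value, bin_freq, cutoff):
    """Return (low_out, high_out); the outlet not selected receives 0."""
    if gate_outlet(bin_freq, cutoff) == 1:
        return value, 0.0
    return 0.0, value

# A bin at 100 Hz goes to the low outlet, a bin at 5000 Hz to the high outlet.
low_case = route_bin(0.7, bin_freq=100.0, cutoff=1000.0)
high_case = route_bin(0.7, bin_freq=5000.0, cutoff=1000.0)
```

The same routing is applied to the real and the imaginary signal, which is why no polar conversion is needed.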

I use this process in my laptop performance pieces (Bent MetalThrown Glass, and Forced Air – In/Ex) in an expanded form. I route the signal to 8 outputs based on frequency content. You can easily modify the crossover patch to do this by sending the frequencies higher than the first cutoff to a second frequency comparison, and so on. Once you have routed output based on frequency you can apply time-based processes to the resulting audio bands, such as delay, amplitude modulation, panning, etc.
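
The chain of comparisons for more than two bands can be sketched as follows. The seven cutoff frequencies are invented for the example (seven cutoffs give eight bands) and are not the values from my pieces; band_for_frequency is a hypothetical name.

```python
def band_for_frequency(bin_freq, cutoffs):
    """Walk ascending cutoffs like a chain of comparisons; return the band index."""
    for band, cutoff in enumerate(cutoffs):
        if bin_freq <= cutoff:
            return band       # at or below this cutoff: stop here
        # otherwise pass the bin along to the next comparison
    return len(cutoffs)       # above every cutoff: the top band

# Hypothetical cutoffs in Hz; 7 cutoffs split the spectrum into 8 bands (0-7).
cutoffs = [100, 200, 400, 800, 1600, 3200, 6400]

lowest = band_for_frequency(50, cutoffs)
middle = band_for_frequency(1000, cutoffs)
highest = band_for_frequency(10000, cutoffs)
```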

pitch shifting with gizmo~

I’ve used freqshift~ before, but since it uses ring modulation to shift pitch it introduces a fair amount of distortion to the frequency spectrum. gizmo~ is an FFT-based pitch shifter that shifts pitch according to a transposition ratio (similar to controlling the playback speed of groove~). The parent patch is FFTPitchShift.maxpat. The subpatch (pitchshiftgizmo~.maxpat) implementation is simple. The only wrinkle is the addition of a route object that looks at data types, rather than the first item in a list, to route data to separate outlets. Here route keeps non-number messages from reaching the gizmo~ object (just as a programming safety valve).

The parent patch includes a section that uses MIDI note input to determine the transposition ratio. You first set a base key with the keyboard slider (kslider). Middle C, 60, is the default base key. You can then play notes on a MIDI keyboard to determine the transposition ratio. The interval between the played note and the base key is calculated, then sent to an expr object. expr evaluates math expressions in a C-like syntax. You have to declare each variable's data type and the inlet it arrives at: $f1 declares that floating-point numbers will come in the first inlet. Raising 2 to the power of the interval divided by 12, 2^(x/12), gives the transposition factor. You can use this same equation for playback ratios with groove~. (Before, I had converted the base note to frequency, the incoming MIDI note to frequency, and created a ratio based on those two numbers. The result is the same.)
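
The same calculation in Python, along with the older frequency-ratio approach for comparison. The mtof function here assumes standard equal temperament with A4 (note 69) at 440 Hz; both function names are my own for this sketch.

```python
def transposition_ratio(played_note, base_note=60):
    """Ratio for the interval between two MIDI notes: 2 ** (semitones / 12)."""
    interval = played_note - base_note
    return 2.0 ** (interval / 12.0)

def mtof(note):
    """MIDI note number to frequency in Hz, assuming A4 = note 69 = 440 Hz."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)

octave_up = transposition_ratio(72)    # 12 semitones above middle C
octave_down = transposition_ratio(48)  # 12 semitones below middle C
fifth_up = transposition_ratio(67)     # 7 semitones above middle C

# The earlier approach: convert both notes to frequency and divide.
fifth_up_from_freqs = mtof(67) / mtof(60)
```

Both approaches agree because the 440 Hz reference cancels out when you divide one frequency by the other.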