Envelope generators—ADSR Part 2

Certain aspects of the ADSR are up for interpretation. Exactly how the ADSR re-triggers when not starting from idle, for instance. Also, we can decide whether we want constant rate, or constant time control of the segments. The attack segment is fixed in height, so starting from idle (0.0 level) to peak (we’ll use 1.0 in our implementation—feel free to multiply the output to any level you want), there is no difference between constant rate and constant time. But for the decay and release segments, the distance traveled is dependent on the sustain level. I choose constant rate—this means that the release setting is for a time from the maximum envelope value (1.0) to 0.0; if sustain is set mid-way at 0.5, the release will take less time to complete than if it were at 1.0, but the rate of fall will be the same.

[Figure: ADSR envelope output]

Time

There is one problem with how long an exponential move takes: in theory, it takes forever.

That’s OK for decay and release, since it will become so tiny after a point that there is no difference between the near-zero output and zero. For attack, however, it’s a big issue. We don’t move to the decay state until the attack state hits the maximum at 1.0. If it never hits, we’re in trouble. No problem—we just aim a little higher than 1.0, and when it crosses 1.0 we move to the decay state.

In practice, floating point numbers have limitations and it won’t take forever (depending on the exact math used). But we’d spend way too much time making unnoticeably tiny moves as we get close to 1.0; certainly, we want to clip the exponential attack. Hardware envelope generators do the same thing, hitting a trigger level to switch to the decay state.

But there’s an aesthetic reason to clip the attack exponential as well. For a number of reasons, “decaying” upwards towards a maximum value is the wrong shape for a volume envelope. It’s likely that the popular attack curve for hardware generators was a compromise between an acceptable shape and convenience (it’s easy to generate).

Math

While the math of the curve itself is simple due to the nature of generating an exponential curve iteratively (we just feed back a portion), the math of relating rates to coefficients is a bit more work. I decided that instead of arbitrary (“1-10”) control settings for the rates, I wanted to set each segment by the time (in samples) it takes to complete. So, a setting of 100 would make the envelope attack from 0.0 to 1.0 in 100 samples. A setting of 100 for release would hit 0.0 in 100 samples if sustain were set at maximum, 1.0, but sooner if it were set lower—constant rate, not time.

Again, we need only clip the attack segment, since we wouldn’t notice the difference between a full exponential and a slightly clipped one for the decay and release segments. However, putting a cap on those segments has some advantages. One is that we use a little less processing for envelopes in the idle state. But there’s another trick I chose to implement, after experimenting a bit and giving thought to exponential versus linear segments. Read on.

Shape control

One way to handle terminating exponential segments would be to set a threshold that is deemed “close enough”. For instance, as the attack segment gets within, say, 0.001 of 1.0, we simply set the output to 1.0 and move to the decay state. But if we take a slightly different approach, shooting for a target of 1.001 instead, and terminating when it hits or crosses 1.0, then we won’t have a noticeable step if we decide to make that “overshoot” value much larger.

Why would we want to make it much larger? That would wreck our nice exponential. In fact, if we make it too large, the moves would be virtually…linear! So, we could use a small number to stay near-exponential, or move to larger numbers to move towards linear segments, giving us a very versatile ADSR with curve control very simply.

More math

Usually, the rate of an exponential is based on the time it takes to decay to half the current value. But since we’re truncating our exponentials, I chose to define our rate controls as the time it takes to complete our attack from 0.0 to 1.0, or the decay or release to move from 1.0 to 0.0 (which happens for decay when the sustain level is set to 0.0, or release when sustain is 1.0).

Exponential calculations can be relatively CPU intensive. But, fortunately, envelope generators produce their output iteratively. And an iterative implementation of the exponential calculation is a trivial computation. We can calculate the first step of the exponential, whenever the appropriate rate setting is changed, then use it as the multiplier for subsequent iterations (remember the one-pole filter? Simple multiplies and additions).

rate = exp(-log((1 + targetRatio) / targetRatio) / time);

where targetRatio is our “overshoot” value, the constant 1 is because we are moving a distance of 1 (0.0 to 1.0, or 1.0 to 0.0), and time is the number of samples for the move. Typically, we’ll pick a targetRatio that we like (a larger number to approximate linear, or a small fraction to approach exponential); time comes from our rate knob, perhaps via a lookup table so that we can span from 1 sample to five or ten seconds worth of samples in a reasonable manner.
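
To make that concrete, here’s a minimal sketch of how the attack segment might run, per sample, once the coefficient is computed (the names and helper functions are mine, not from the source code to come):

#include <math.h>

// Recompute only when the rate or curve setting changes:
void setAttack(double time, double targetRatio, double &coef, double &base) {
    coef = exp(-log((1.0 + targetRatio) / targetRatio) / time);
    base = (1.0 + targetRatio) * (1.0 - coef);   // aims the one-pole at 1.0 + targetRatio
}

// Then each attack sample is just a multiply and an add:
double attackTick(double output, double coef, double base) {
    output = base + output * coef;
    if (output >= 1.0)
        output = 1.0;   // crossed the peak; clip, and switch to the decay state here
    return output;
}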

I chose to refer to the overshoot adjustment as a “target ratio” because it’s related to the size of the move, and it’s helpful to think of it in terms of dB, especially related to loudness. For the release to decay to -60 dB, for instance, we use a value of 0.001; -80 dB is 0.0001. Using smaller values will have a significant effect on how “wide” the top of the attack segment is, at a given attack rate setting. To convert from dB to the ratio value, use 10 raised to the power of your dB value divided by 20; in C, “pow(10, val / 20.0)”.

Another pleasant byproduct of using this “target ratio” (overshoot) for all curves is that we don’t need to guard against output decaying into denormals and ruining performance.

Next: Code

Up next is source code. As usual, I try to keep the code simple yet flexible. You may want to add features to auto-retrigger the envelope for repeating patterns, add delay and hold features, or add additional segments—that’s where you distinguish yourself. And this ADSR has features lacking in many implementations already—adjustable curves, and precise specification of attack time in samples, for instance.


Envelope generators—ADSR Part 1

After discussing the exponential decay of the one-pole filter in a recent article, I couldn’t help but think about envelope generators. Besides, it would be handy to have one to test out some of these other components I’ve been writing about.

The staple of analog synthesizers is the ADSR envelope generator. Sure, when creating an envelope computationally, we don’t have the limitations of hardware, so we could make envelopes of unlimited complexity. But the abundance of “virtual analog” plug-ins attests to the fact that the simple ADSR is a very effective design—and mandatory if we’re simulating most vintage gear. And it’s more manageable to explain how to make an ADSR, and let you take it from there if you wish.

States and gates

Envelope generators are defined by their states. Typically, an envelope generator of a synthesizer at rest is in an idle state, outputting zero level. When you press a key, it raises a gate to the “on” state, triggering the envelope generator into its initial attack state. The output of the envelope generator begins to rise at an attack rate determined by a user control. As long as the key remains down, keeping the gate “on”, the envelope rises until it reaches its maximum output. At that time, it switches to the decay state, and moves towards the user-defined sustain level (set anywhere from zero to maximum output) at the rate set by another user control. Upon reaching the sustain level, the envelope generator switches to the sustain state, and remains there as long as the key is held. Once the key is released and the gate changes to “off”, the envelope generator switches to the release state, and begins moving towards zero at the rate set by a fourth user control.

State     Action
Idle      do nothing (output remains at zero)
Attack    continue increasing output at attack rate; if max is reached, change state to Decay
Decay     continue reducing output at decay rate; if sustain level reached, change state to Sustain
Sustain   do nothing (output remains at the sustain level)
Release   continue reducing output at release rate; if zero is reached, change state to Idle

Note that the Idle and Sustain states have no exit rules. Moves out of these states are driven by a gate signal—driven by a synthesizer key press and release, typically, though the gate can be from a low frequency oscillator or step sequencer, for instance. Here are the additional state rules, driven by the gate transitions:

Gate transition       Action
from “off” to “on”    set state to Attack
from “on” to “off”    if state is not Idle, then set state to Release
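
Putting the two tables together, here’s a bare state-machine sketch in C++. The names are mine, and the segments are simple linear ramps here just to make the flow concrete; the ADSR we build in these articles uses exponential segments instead:

enum EnvState { kIdle, kAttack, kDecay, kSustain, kRelease };

struct AdsrSketch {
    EnvState state = kIdle;
    double output = 0.0;
    double sustainLevel = 0.5;
    double attackRate = 0.01, decayRate = 0.001, releaseRate = 0.002;   // per-sample steps (placeholders)

    void gate(bool on) {
        if (on)
            state = kAttack;                 // “off” to “on”: set state to Attack
        else if (state != kIdle)
            state = kRelease;                // “on” to “off”: if not Idle, set state to Release
    }

    double process() {                       // call once per sample
        switch (state) {
        case kAttack:
            output += attackRate;
            if (output >= 1.0) { output = 1.0; state = kDecay; }
            break;
        case kDecay:
            output -= decayRate;
            if (output <= sustainLevel) { output = sustainLevel; state = kSustain; }
            break;
        case kRelease:
            output -= releaseRate;
            if (output <= 0.0) { output = 0.0; state = kIdle; }
            break;
        default:                             // Idle and Sustain: do nothing until the gate changes
            break;
        }
        return output;
    }
};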

Curves

Some envelope generators—more often of the software variety than hardware—have attack, decay, and release segments that rise or fall in a straight line. But hardware envelope generators use exponential segments, typically. The exponential moves come from the charging and discharging of capacitors with current limited by resistors—one-pole filters, basically. Besides being easy to design with simple transistors, resistors, and capacitors, the exponential segments work especially well for amplitude envelopes, since we hear on a log scale (the exponential control against the log response makes the volume changes sound linear with respect to loudness).

Let’s take a look at how this works in our one-pole filter. You can view a one-pole filter as simply feeding back a percentage of the previous output, with the percentage value controlling the rate of the resulting exponential curve.

A simple AR envelope generator

To get the leanest view of how an envelope generator works, let’s start with the simplest. Synthesizers generate gate signals, typically, when a key is pressed—the gate starts at 0v with no keys down, jumps to 1v when a key is pressed, and back to 0v when the key is released. If we use that as the control signal to our VCA, we’d have an organ-like instant on and off—not very flexible.

But if we route that into a one-pole filter—a resistor feeding a capacitor to ground—the processed gate would ramp up and down with an exponentially-shaped curve, at a rate depending on the resistor and capacitor values multiplied together. (Since it’s more difficult and costly to make variable capacitors, we use a fixed capacitor and a variable resistor.) Here’s a circuit simulation of this basic AR (Attack-Release) envelope generator:

[Figures: one-pole RC circuit, and its step response]

This generator is just as simple in software—a gate that toggles from 0 to 1, fed into a one-pole filter, which is a delay and a multiplier taking the place of a capacitor and resistor. Here’s an iterative simulation done in a spreadsheet, using a feedback value calculated to be equivalent to the capacitor and resistor values shown in the circuit:

[Figure: spreadsheet calculation of the AR envelope]
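
If you’d rather see that iteration as code than as a spreadsheet, it’s just a few lines (a sketch; the coefficient here is arbitrary rather than matched to the circuit values above):

// A 0-or-1 gate fed through a one-pole filter: the software AR generator.
double envelope = 0.0;
const double coef = 0.99;   // feedback amount; closer to 1.0 means slower attack and release

double arTick(bool gateOn) {
    double gate = gateOn ? 1.0 : 0.0;
    envelope = gate + (envelope - gate) * coef;   // move a fixed fraction of the way toward the gate
    return envelope;
}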

Next: ADSR

But the AR isn’t a very flexible envelope generator—it’s a simple shape, and a single control value sets both attack and decay times. It would be useful for flute-type sounds, but not for guitar, for instance. For guitar, we’d like a fast attack, a slow decay, and a quick release. For that kind of complexity, we need a more elaborate circuit, with transistors to switch in different resistances and target levels for capacitor charging.

Next up, we’ll delve into the specifics of an ADSR envelope generator in software.


Wavetable oscillator video

If you find this video helpful, let me know, and I’ll work more video demonstrations into articles.


About synthesizer control voltages

Since I write about simulating classic analog synthesizers (a process often called “virtual analog”), most notably in my wavetable oscillator series so far, I wanted to touch on the topic of parameter control signals. Classic synthesizers have knobs to set parameters, of course, but key parameters are controlled by voltage signals as well—primarily for oscillator and filter frequency, note volume, and perhaps oscillator pulse width.

Some more sophisticated and rare synthesizers—most often modular synthesizers—allow voltage control over such parameters as filter resonance, lag processing, and even envelope times. But the must-haves for synthesizers, certainly, are frequency and volume (pulse width is a distant third, but very useful and easy to implement, and important enough to make it a requirement for most analog synthesizer designs).

Frequency and one volt per octave

Frequency perception by the ear is logarithmic—that is, note pitch is related to the log of frequency. As frequency increases in steps by a fixed amount, the difference in pitch becomes less with each step. To look at it another way, frequency must double for each increase of an octave in pitch. To make the relationship more manageable for synthesizers, an important de facto standard was settled early on—one volt per octave control. If an oscillator is set to 100 Hz, increasing the control input by one volt takes it to 200 Hz. Increasing another volt yields 400 Hz. Each of those steps is one octave in pitch—a doubling of frequency. If control of frequency were linear, the amount of additional voltage needed to move up an octave would depend on the current frequency. And the depth of low frequency modulation—vibrato, for instance—would depend on what note was being played. This arrangement would make the system much more difficult to manage. So, pitch-based synthesizer modules—voltage-controlled oscillators and filters—contain linear-to-exponential converters.
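
In code, that exponential conversion is a one-liner; here’s a sketch (the names are mine, and the control value stands in for volts):

#include <math.h>

// Exponential (“one volt per octave”) pitch control: each +1.0 doubles the frequency.
double controlToFreq(double baseFreq, double control) {
    return baseFreq * pow(2.0, control);
}
// e.g., controlToFreq(100.0, 1.0) -> 200 Hz; controlToFreq(100.0, 2.0) -> 400 Hz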

Note that not all classic synthesizers used exponential control of frequency. But most did, including the most recognizable, and the ones we’ll use as a model. Further, the linearly controlled (Hertz-per-volt) systems were most often performance synthesizers, where shortcomings could be more easily worked around in a limited architecture. But we’re taking a modular approach, where exponential control is the clear winner—the main difference is that we’ll use values instead of volts for control.

For us, exponential conversion is just a bit of math, so it can be a virtual module that we can attach to any linear input. But our basic target will be to use it where the classic designs use it.

Volume

Our volume perception is also logarithmic—loudness is related to the log of amplitude. We need to double the amplitude of a signal each time we want an increase of 6 dB in volume. So, you might think that your typical synthesizer would have developed with voltage-controlled amplifiers—VCAs—that have exponential converters, and typical control sources would be linear. But while modular synthesizers often have the option of linear or exponential control, the most common synthesizer VCAs are linear. And the most common control source of an output VCA is an envelope generator with exponentially-shaped segments. An exponentially decaying voltage controlling a linear VCA gives a smooth, linear-sounding fade out. One reason analog synthesizers evolved this way is that the shape of discharging capacitors is exponential—it’s fairly easy to make exponential-shaped envelope generators from capacitors and resistors for the curves (our one-pole filters!), and transistors to switch between states.

Note that the ear is not as sensitive to volume changes as it is to frequency changes. It’s much harder to hear the asymmetry resulting from linear control with a low frequency modulation for volume than it is for pitch—just as it’s much easier to hear a singer who’s a little bit off-key in a mix than it is to notice a vocal mixed a little too soft or loud.

Also note that even in modular synthesizers, where a VCA might have exponential control (my old Aries modular has a switch on the VCAs, to select linear or exponential control), the envelope generators typically remained exponential-only, so the VCA would be set to linear for the most common use of controlling the overall note envelope. Exponential control was especially useful for other VCA uses, such as processing control voltages.

Mixing linear and exponential control

We’ve established that, while both frequency and volume have similar needs for exponential control, typical analog synthesizers have exponential control inputs for oscillator and filter frequency, but not for VCA amplitude. VCAs are most often driven by envelope generators, which have exponential segments in classic designs. We may also use low frequency modulation for the VCA (tremolo), but our ears aren’t terribly sensitive to the resulting asymmetry. But what about the exponential segments of the envelope generators? Won’t they cause problems when we often use the same envelope generators on the exponential inputs of the oscillators?

It’s true that we’ll get a double-exponential response. Instead of the pitch sliding down at a steady rate on the decay portion of the envelope, the move starts quickly but decelerates, sliding into the target pitch relatively slowly. Fortunately, this sounds good, and mimics certain aspects of physics that we’re used to—the landing of birds and airplanes, the slowing of an elevator reaching its target floor.

Next up

Why bring this up now? Because our next topic is envelope generators, and I wanted to establish why we’ll be looking at exponential segments. If you do a web search for envelope generator (and especially ADSR) images, you’ll see a large number of drawings with linear segments. But make no mistake—if you want to sound like a classic synthesizer, then you want exponential envelope segments.


Perspective on dither

Recently, I’ve had lengthy discussions on the topic of dither with a couple of different people—of opposite views. One believes that everything should be dithered, including truncations to 24-bit. The other feels that dither is a waste of time even for 16-bit, and is needed only for shorter word lengths, rarely used today.

This got me thinking about how to give perspective on the truncation distortion and dither levels we’re talking about for 16-bit and 24-bit files. The usual demonstration would be to find or create a recording that is not unrealistic, yet produces noticeable truncation distortion at the bit levels we’re interested in—which is mainly near the floor of 16-bit and 24-bit samples.

But it occurred to me that people don’t have a good idea of how loud the distortion levels are by themselves, and that should be the place to start. I thought of a way to generate signals of those levels with no distortion of their own.

Perfect audio

Are you ready to listen to perfect digital audio files? By perfect I mean that they are exactly what they are designed to be. If I’d tried for low-level sine waves, for instance, they would be crude, distorted approximations. Instead, I created square waves at precise bit levels to avoid quantization errors, synchronized to the sample rate to avoid aliasing. The sample values are exact—they are not values that are “close” to expected values—and the harmonic content is representative of truncation effects at those levels.

The signal begins with the LSB set (we’ll call this “+1”), then the next sample is negated (-1). This repeats (+1, -1, +1, -1, …) for a short duration, then the period increases by one sample (+1, +1, -1, -1, +1, +1, -1, -1, …) for a short duration, and the process continues, increasing the period by one sample each time. This is a classic “divide by n” oscillator. You’ll notice that the pitch resolution is very poor at the beginning, as the second tone is an octave down from the first (which is at Nyquist and will be swallowed by your reconstruction filter), but gets better and better as pitch drops and each additional sample is a small percentage of the period. The waveform amplitude is not the smallest possible—toggling between +1 and 0 would do that—but is representative of the amplitude of the smallest truncation effects for bipolar signals.

So, the sound starts dropping at discrete frequencies, and moves increasingly towards a smooth frequency sweep.
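
For the curious, here’s roughly how such a sweep can be generated. This is only a sketch: the exact step durations of the files above aren’t spelled out, so cyclesPerStep and maxN are arbitrary:

#include <vector>

// Divide-by-n sweep at the +/-1 LSB level: hold +1 for n samples, then -1 for n
// samples, repeat for a number of cycles, then increase n and continue.
std::vector<int> makeLsbSweep(int cyclesPerStep, int maxN) {
    std::vector<int> out;
    for (int n = 1; n <= maxN; n++) {                        // n samples per half cycle
        for (int cycle = 0; cycle < cyclesPerStep; cycle++) {
            for (int i = 0; i < n; i++) out.push_back(1);    // +1 (LSB set)
            for (int i = 0; i < n; i++) out.push_back(-1);   // -1
        }
    }
    return out;   // place these values in the LSB of your 16- or 24-bit samples
}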

Listening tests

Listen to these files on a quality monitoring system. First, the generated test signal toggling at the level of the fifth bit, moderately loud, so that you can become familiar with what you’ll be listening for at the lower levels:

Sweep at the 5th bit level (-24.1 dB)

All files are 24-bit. The difference is that the +1 level for the 24-bit version is the 24th bit, for the 16-bit version it’s the 16th bit, and for the 5-bit version it’s the 5th bit (making it 2048 times, or 66 dB, louder than the 16-bit version).

Now for 16-bit. Start by setting your monitoring level to as loud as you would normally play music, by running a song through it. Crank it up, but don’t hurt your ears. Then play this file by itself:

Sweep at the 16th bit level (-90.3 dB)

On a quality system at a relatively high monitoring level, you’ll have no problem hearing this sweep. After listening a few times, you might want to play a song again to remind yourself of the relative level of the test signal. Think about how difficult it would be to hear the test signal during the chunking guitars of a rock tune, but how it might be heard in the fading of quiet piano notes in classical music. (Yes, truncation noise can result in more of a tearing sound than the tone of the test sweep, but the relative levels are still representative of the relationship.)

Now, 24-bit. Warning: Do NOT try to compensate by cranking your volume level. You are NOT going to hear this signal anyway, and you risk making a mistake and assaulting your ears horrifically, or blowing speakers. No matter how great you think your 24-bit converters are, they and all of your other gear generate higher levels of noise than the 24th-bit level.

Sweep at the 24th bit level (-138.5 dB)

If you want to try other levels, use a sample editor (such as the free Audacity, or your favorite DAW) to boost the level 2x, or 6.02 dB, for every additional bit that you want to move up.

Think about the levels, and at what bit levels you’d care to conceal truncation distortion with dither, and I’ll comment further in a future article.


Replicating wavetables

My WaveTableOsc object uses a series of wavetables that are copies of single-cycle waveforms pre-rendered at a progressively smaller bandwidth. By “replicating wavetables” I mean taking a full-bandwidth single cycle waveform, and using it to make a complete set of progressively bandwidth-reduced wavetables. In this article, I’ll demonstrate principles and source code that will let you take any source waveform suitable for the lowest octave, and build the rest of the tables automatically.

The source waveform might be any time-domain single-cycle wave—something you drew, created with splines, recorded with a microphone, calculated—or it can be a frequency-domain description—harmonics and phases. Fundamentally, we’ll use a frequency-domain description—the array pair we use in an FFT—that holds amplitude and phase information for all possible harmonics. If we start with a time-domain wave instead, we simply convert it to the frequency domain as the first step. The source code builds the bandwidth-reduced copies, converts them to the time domain, and adds them to the oscillator until it contains all necessary tables to cover the audio range without noticeable aliasing.

We’ll build a new table for each octave higher, by removing the upper half of the previous table’s harmonics. If it’s not clear why, consider that all harmonics get doubled in frequency when shifting pitch. A 100 Hz sawtooth wave has harmonics spaced 100 Hz apart; shifting it up an octave to 200 Hz shifts the spacing to 200 Hz apart, so half of the harmonics that were below Nyquist move above.

The code

In principle, the code is simple:

repeat while the wavetable has harmonic content:
	add wavetable to oscillator
	remove upper half of the harmonics

This is one of those places where I’d like to be more sophisticated, and use a faster “real” FFT and avoid the silliness of filling in a mirror-image array, but it’s more code and clutter—I’m sticking with the generic FFT from the WaveTableOsc source code. I’ve also reused the makeWaveTable function, so the new code here is minimal. The end product is WaveUtils.h and WaveUtils.cpp, plain C utility functions. I’d probably wrap the functions into a manager class that dealt with WaveTableOsc and wavetable classes, if I were focusing on a specific implementation and not an instructive article.

The two reused functions are used internally, and we need just one new function that we call directly—fillTables:

void fillTables(WaveTableOsc *osc, double *freqWaveRe, double *freqWaveIm, int numSamples) {
    int idx;
    
    // zero DC offset and Nyquist
    freqWaveRe[0] = freqWaveIm[0] = 0.0;
    freqWaveRe[numSamples >> 1] = freqWaveIm[numSamples >> 1] = 0.0;
    
    // determine maxHarmonic, the highest non-zero harmonic in the wave
    int maxHarmonic = numSamples >> 1;
    const double minVal = 0.000001; // -120 dB
    while ((fabs(freqWaveRe[maxHarmonic]) + fabs(freqWaveIm[maxHarmonic]) < minVal)
        && maxHarmonic) --maxHarmonic;

    // calculate topFreq for the initial wavetable
    // maximum non-aliasing playback rate is 1 / (2 * maxHarmonic), but we allow
    // aliasing up to the point where the aliased harmonic would meet the next
    // octave table, which is an additional 1/3
    double topFreq = 2.0 / 3.0 / maxHarmonic;
    
    // for subsequent tables, double topFreq and remove upper half of harmonics
    double *ar = new double [numSamples];
    double *ai = new double [numSamples];
    double scale = 0.0;
    while (maxHarmonic) {
        // fill the table in with the needed harmonics
        for (idx = 0; idx < numSamples; idx++)
            ar[idx] = ai[idx] = 0.0;
        for (idx = 1; idx <= maxHarmonic; idx++) {
            ar[idx] = freqWaveRe[idx];
            ai[idx] = freqWaveIm[idx];
            ar[numSamples - idx] = freqWaveRe[numSamples - idx];
            ai[numSamples - idx] = freqWaveIm[numSamples - idx];
        }
        
        // make the wavetable
        scale = makeWaveTable(osc, numSamples, ar, ai, scale, topFreq);

        // prepare for next table
        topFreq *= 2;
        maxHarmonic >>= 1;
    }

    delete [] ar;
    delete [] ai;
}

Defining an oscillator in the frequency domain

Here’s an example that creates and returns a sawtooth oscillator by specifying the frequency spectrum (note that we must include the mirror of the spectrum, since we’re using a complex FFT):

WaveTableOsc *sawOsc(void) {
    int tableLen = 2048;    // to give full bandwidth from 20 Hz
    int idx;
    double *freqWaveRe = new double [tableLen];
    double *freqWaveIm = new double [tableLen];
    
    // make a sawtooth
    for (idx = 0; idx < tableLen; idx++) {
        freqWaveIm[idx] = 0.0;
    }
    freqWaveRe[0] = freqWaveRe[tableLen >> 1] = 0.0;
    for (idx = 1; idx < (tableLen >> 1); idx++) {
        freqWaveRe[idx] = 1.0 / idx;                    // sawtooth spectrum
        freqWaveRe[tableLen - idx] = -freqWaveRe[idx];  // mirror
    }
    
    // build a wavetable oscillator
    WaveTableOsc *osc = new WaveTableOsc();
    fillTables(osc, freqWaveRe, freqWaveIm, tableLen);
    delete [] freqWaveRe;   // fillTables copies what it needs
    delete [] freqWaveIm;

    return osc;
}

Defining an oscillator in the time domain

If you have a cycle of a waveform in the time domain, you can create an oscillator this way—just do an FFT to convert to the frequency domain, and pass the result to fillTables to complete the oscillator:

WaveTableOsc *waveOsc(double *waveSamples, int tableLen) {
    int idx;
    double *freqWaveRe = new double [tableLen];
    double *freqWaveIm = new double [tableLen];
    
    // convert to frequency domain
    for (idx = 0; idx < tableLen; idx++) {
        freqWaveIm[idx] = waveSamples[idx];
        freqWaveRe[idx] = 0.0;
    }
    fft(tableLen, freqWaveRe, freqWaveIm);

    // build a wavetable oscillator
    WaveTableOsc *osc = new WaveTableOsc();
    fillTables(osc, freqWaveRe, freqWaveIm, tableLen);
    delete [] freqWaveRe;   // fillTables copies what it needs
    delete [] freqWaveIm;

    return osc;
}

Here’s a zip file containing the source code, including the two examples:

WaveUtils source code

Caveat

As with our wavetable oscillator, this source code requires wavetable arrays—whether time domain or frequency domain—of a length that is a power of 2. As examined in the WaveTableOsc articles, 2048 is an excellent choice. The source code does no checking to ensure powers of 2—it’s up to you.

Update: Be sure to check out the improved WaveUtils, in WaveUtils updated.


About source code examples

My goal is to teach audio DSP principles in a way that is more intuitive than most available material. And part of that goal is to help you think about your goals and how to solve them by showing you my thought process.

So at one extreme, source code shouldn’t be part of what I’m doing here—ideas over implementations. However, examples are a powerful learning tool, and I want to show that the principles introduced here are practical. But the other extreme is to develop a full open-source library here, and that’s something different entirely. Foremost, it conflicts with what I’m trying to do here—even if I had time to do both. My goal with source code is to give you minimal but commercial-grade examples. Something lean enough for you to dissect and see the principles applied.

For example, the WaveTableOscillator object is a high-quality DSP component—good stuff. But it’s not exactly how I would write it for myself for a major project. In addition to the WaveTableOscillator class, which doesn’t create the wavetables themselves, I might write a wavetable class that does, and a manager class that would use the two classes, keeping track of oscillators and sharing common wavetables between them to save space and time. And I might use C++ templates. And asserts for trapping bad parameters at development time. But in the context of this blog, these things tend to obscure the functionality that I’m trying to focus on.

I’m getting ready to lay more source code on you in coming articles. I think that working C++ code, that you can experiment with, is better than using pseudo code, and I’ll continue in that direction for now.


Ventilator adapter in a mint tin

I don’t eat mints often, but when I do, they are sugar free…

A Leslie speaker simulation

Players of Hammond organs and their clones know how important the Leslie speaker is to the classic sound. Modern clones have electronic simulations built-in, with varying degrees of success. The Neo Instruments Ventilator is a particularly convincing DSP-based Leslie simulation in a foot pedal. I was never satisfied with the Leslie simulation in my Korg CX-3 (the modern DSP variety—they also made an electronic-component-based model of the same name circa 1980). Bypassing the internal Leslie simulation in favor of the Ventilator greatly improved my enjoyment of the instrument. (I have a Leslie 122, but it’s in a state of disrepair—besides, an electronic simulation makes it much quicker to record a sudden inspiration.)

But something’s missing

But the Ventilator has a serious omission: no MIDI control. A great advantage of Hammond clones over the real thing is that you can capture your performance via MIDI, then audition alternate drawbar settings and try different Leslie speeds later, in playback, without the need to repeat the performance. Without MIDI control, the Ventilator has no way to capture and play back the simulation of switching Leslie speeds.

To fix this shortcoming, I bought a MIDI-controlled relay (made by MIDI Solutions), as the Ventilator has an external footswitch input. But when I went to integrate it, I found what I deem a design failure in the Ventilator.

The (optional) external controller for the Ventilator has two footswitches, one to control fast or slow modulation, and the other for “brake” (stop). It connects with the Ventilator via a tip-ring-sleeve plug. I don’t use the brake function—I only want to switch fast/slow. But the way the Ventilator is designed, a single relay can’t do the job. To switch to fast speed, one pin goes high and the other low, and to go slow, the two pins switch to the opposite states. This would require two single relays configured to switch in opposite directions.

These MIDI relays aren’t cheap, so the thought of buying another was not a desirable choice. And it was irksome that Neo Instruments could have easily designed the Ventilator to allow a single relay to do the job with no change of other functionality. This would also allow a user to locate the Ventilator on top of the organ, for access to the knobs during performance, and use a simple and cheap single footswitch to switch speed instead of their relatively expensive external dual switch.

The truth

Here’s how they could have done it (it’s a matter of software, so they could still do it): The tip and ring are each “pulled up” to +5 volts, individually, and are sensed by the processor so that each yields a logical “1” when disconnected, and a logical “0” when connected to ground (the sleeve). This gives us two bits, or four possible states. Three states are used—brake, fast, and slow. Here’s the truth table (“0” indicates that the pin is connected to the sleeve—common—and “1” indicates no connection—open):

Tip   Ring   State
0     0      (undefined/unused)
0     1      fast
1     0      slow
1     1      brake

Note that there is one left over state—an undefined state (not possible to attain with their dual footswitch). However, if the Ventilator had been designed to use this state as a redundant speed selector, the device would work with either a single switch (SPST—single-pole single-throw), or their dual footswitch with no change.

The fix

OK, how do I fix this slight, and use my MIDI relay for speed switching? One choice would be to modify the MIDI relay. I popped it open, and found that the output jack is already a TRS connector, so I had all the pins I needed. I could try to find a small dual relay (SPDT—single-pole, double-throw), but even if I could, it’s possible there would be switching problems if I didn’t have enough current available (the MIDI relay box is powered from the MIDI current loop). While this would let the MIDI relay switch speeds, it wouldn’t let me use a simple single footswitch as an alternative, and I’d like that option. Alternatively, I could build a mod into the Ventilator—this is probably the easiest and most flexible choice, since I’d have access to the Ventilator’s power supply. After some thought, I decided that I’d prefer building an external adapter instead, leaving the Ventilator unaltered.

The biggest issue of an external adapter is that I didn’t want to use a battery. Since the Ventilator pulls the tip and ring pins up to +5 volts through resistors, internally, I needed a solution that tapped that current for power. Fortunately, the solution requires only a single transistor and resistor. When the input switch is closed, the NPN transistor switch opens, and when the switch opens, the NPN turns on and closes the connection.

Great, a small, simple circuit. But not small enough to fit in the shell of a connector, so I needed to put it in a box…

Building it

About the mints…Eclipse mints are sugar-free and come in these cool little metal boxes with a hinged, snap-shut lid. I’ve kept them instead of tossing them in the recycle bin—it seemed they could be handy for something. The only use I’d found so far is as a temporary holder of screws when I’m fixing gear, but I took a close look at one for this project. Yes, the dimensions are ideal for two 1/4″ phone jacks, with enough room left over for my circuit board and wiring inside.

Here’s the circuit diagram:

Adapter schematic

One transistor, one resistor, and a few jumpers on a prototyping board is all it takes. Trim the circuit board—easily under a half square inch. Drill a couple of 3/8″ holes for the jacks, and feed them through the open end with needle nose pliers. The metal’s thin, but once you tighten the nuts everything firms up again. Cost: $1.59 for the prototyping board, 69 cents per jack. I had the transistor (you can substitute other common small-signal NPN transistors) and resistor on hand, but you can find them for about $1.50 for packs of four or five.

Here’s a view before securing the board to the inside of the box with adhesive-backed foam tape:

Adapter, open view

The input connects to the relay or footswitch with a mono cable, and the output to the Ventilator with a stereo cable. I installed the output jack next to the twin mints of the box graphics as a reminder.

My CX-3 accepts a simple momentary footswitch to toggle Leslie simulation speeds, in turn outputting a MIDI control change that can toggle the MIDI relay. Since the MIDI relay box has a MIDI output that passes the MIDI notes and control changes, I can record notes and speed switching for playback later.


A one-pole filter

Here’s a very simple workhorse of DSP applications—the one-pole filter. By comparison, biquads implement two zeros and two poles. You can see that our one-pole simply discards the zeros (the feed-forward delay paths) and the second pole (feedback path):

We keep the input gain control, a0, in order to compensate for the gain of the feedback path due to b1, which sets the corner frequency of the filter. Specifically, b1 = -e^(-2πFc) and a0 = 1 - |b1|. (In the implementation, we’ll roll the minus sign in the summation into b1 and make it positive, for convenience.)

Filter types

A pole is a single-frequency point of pushing gain up, and a zero is a single-frequency point of pulling the gain down; with a single pole, you are not going to get complex response curves such as bandpass, peak, and shelving filters that you can get with the two poles and zeros of a biquad. That leaves us with lowpass and highpass, which have a single point of change.

However, a one-pole makes a poor highpass filter for cases in which we might be likely to use it—in particular, a DC blocker. That’s because it makes a highpass by pushing response up at high frequencies—we really need a zero to pull the response down at very low frequencies. So, we’ll only implement coefficient calculation for lowpass response here. However, we can still make a highpass filter suitable for a DC blocker by subtracting the output of a one-pole lowpass filter, set to a low frequency, from the direct signal.

Typical uses

What are typical uses of this one-pole lowpass filter? Sometimes, we do need the gentle slope of a first order frequency roll-off. And often, we use this simple filter on parameters instead of on the audio directly—some gentle smoothing of a calculated signal in an automatic gain control, for instance.

The 6 dB per octave slope of the one-pole filter—a halving of amplitude for every doubling of frequency—is gentle and natural. It’s extremely cheap, computationally, so it’s the perfect choice when you need just a bit of smoothing. And it doesn’t “ring” (overshoot), so it’s an excellent choice for filtering knob changes. Run each of your user interface sliders and knobs through a one-pole lowpass set to a low (sub-audio) frequency, and your glitchy, zippered control changes turn into smooth parameter changes.

Note that if you feed the one-pole lowpass an impulse, it will yield a perfect exponential decay—the same decay that we use on typical ADSR envelope generators on synthesizers. To look at it another way, the feedback path is an iterative solution to calculating an exponential curve.

The source code

The one-pole filter is so simple that we’ll make it inline, entirely in the header file:

//
//  OnePole.h
//

#ifndef OnePole_h
#define OnePole_h

#include <math.h>

class OnePole {
public:
    OnePole() {a0 = 1.0; b1 = 0.0; z1 = 0.0;};
    OnePole(double Fc) {z1 = 0.0; setFc(Fc);};
    ~OnePole() {};
    void setFc(double Fc);
    float process(float in);
    
protected:    
    double a0, b1, z1;
};

inline void OnePole::setFc(double Fc) {
    b1 = exp(-2.0 * M_PI * Fc);
    a0 = 1.0 - b1;
}

inline float OnePole::process(float in) {
    return z1 = in * a0 + z1 * b1;
}

#endif

Examples

Here’s a DC blocker:

OnePole *dcBlockerLp = new OnePole(10.0 / sampleRate);
...
// for each sample:
sample -= dcBlockerLp->process(sample);
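
And here’s the kind of control smoothing mentioned earlier; the cutoff is an arbitrary sub-audio value, and targetGain stands in for a raw knob or automation value:

OnePole *gainSmoother = new OnePole(2.0 / sampleRate);   // ~2 Hz smoothing
...
// for each sample:
smoothedGain = gainSmoother->process(targetGain);        // de-zippered parameter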

Need highpass?

It’s trivial to add one-pole highpass response, if you’d like—just negate the lowpass calculation; this inverts the spectrum, so you’ll need to flip the frequency range by subtracting Fc from 0.5:

b1 = -exp(-2.0 * M_PI * (0.5 - Fc));
a0 = 1.0 + b1;

Pole position

You can experiment with how the location of the pole affects frequency response by using the pole-zero placement app. Click the Pair checkbox near the Poles sliders to disable it. Move both Poles sliders to center to move the poles to the origin. (The zeros are already at the origin by default when you open the page. At the origin, poles and zeros have no effect.) Now, move just one pole slider; you’ll see highpass response when the pole is to the left of the origin, and lowpass to the right.


A note about de-normalization

One ugly thing that we need to be aware of, especially for filters, is the de-normalization of numbers. Basically, computer processors try to keep floating point numbers normalized—they try to keep them in the binary form 1.xxxxxx (where each x is 0 or 1) times 2 raised to a positive or negative power. When numbers become extremely small—smaller than can be expressed by the largest negative exponent—the processor tries to maintain precision by allowing the number to become de-normalized—to slip to 0.xxxxxx times 2 raised to the lowest exponent, in order to allow smaller but less precise numbers. The catch is that most processors become much slower when performing computations with de-normalized numbers (“denormals”).

Denormals happen when you multiply a small number by a tiny number. Typically, denormals are created in filters that ring out with no new input. Almost any new input, including background noise when recording, will keep denormals from appearing. However, if you process some sound, then follow it with samples of zero, a filter or reverb can decay to denormals; the denormals are then passed to the next processing block and cause it to slow as well.

Some modern processors have specific instructions to treat denormals as zero, solving our problem—if your code enforces that setting. You can repair a denormal by checking flags in a floating point number and substituting zero. But that’s a lot of work. A simple way to avoid denormals is to add a tiny but normalized number to calculations that are at risk—1E-18, for instance, is some 360 dB down from unity, and won’t affect audio paths, but when added to denormals will result in 1E-18; this won’t affect audible samples at all, due to the way floating point works.

The Pentium 4 in particular becomes devastatingly slow—so slow that it can’t keep up with real time audio processing. For audio purposes, denormals might as well be zero. They are so many dB down from unity that their only effect is the potential of bogging down our processing.

While denormal protection could be built into the filter code, it’s more economical to use denormal protection on a more global level, instead of in each module. As an example of one possible solution, you could add a tiny constant (1E-18) to every incoming sample at the beginning of your processing chain, switching the sign of the constant (to -1E-18) after every buffer you process to avoid the possibility of a highpass filter in the chain removing the constant offset.
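
Here’s a sketch of that approach (the buffer layout and names are arbitrary):

// Add a tiny, sign-alternating offset at the head of the processing chain so
// that feedback paths never decay all the way into denormal territory.
static float antiDenormal = 1e-18f;

void protectBuffer(float *buffer, int numSamples) {
    for (int i = 0; i < numSamples; i++)
        buffer[i] += antiDenormal;
    antiDenormal = -antiDenormal;   // flip each buffer so a highpass can't remove a constant offset
}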

The best solution depends on your exact circumstances, so you should make a point of understanding the threat. For instance, in the biquad code, there is a potential to generate denormals in the calculations of z1 and z2; if the filter input goes to zero, the output alone recirculates through z1 and z2, potentially getting closer and closer to zero until the values become de-normalized and continue to recirculate in slower de-normalized calculations, as well as passing to the next processing block and slowing its calculations.

Lest you think that you need only be concerned about denormals when fine-tuning your code, take a look at the performance hit for various processors (special thanks to RCL for permission to reproduce his chart):

Yes, that’s more than 100 times slower execution of test code on a Pentium 4 when the calculations become denormalized. I hit this one myself, years ago: After working on DSP chips, primarily, which are unaffected by this issue, I was developing a commercial product for Mac and PC that ran on the host CPU. I did the initial development on a Power PC processor, then gave it a run on my PC—a P4. Audio processing came to a dead stop. Fortunately, I was aware of the issue with denormals, and went straight to a solution that fixed it.

Please read RCL’s excellent article on denormals, “Denormal floats across architectures”, in which he goes into more detail, including comparisons of various CPU architectures.
