Bandpass Filter Frequencies for Formant Synthesis

aeon · December 20, 2008

From Csound:

cheers,

Ian

asynchro_nous · December 20, 2008

Excellent!

Mike Conway · December 20, 2008

Thanks for that!

stikygum · December 20, 2008

That's a cool find. While we're on the topic, I've been looking to produce more vocal/talking synth sounds. What's typically involved in programming a synth to achieve these sounds? Is it necessary to have at least 2 filters, one running into the next? I'm guessing the filters are the main component when making talking synth sounds. Are there any other parameters that play a key role?

aeon · December 20, 2008

You need to use at least 3 filters - 5 is kinda tweaky, but it does increase realism if you need to do true vocal sounds as opposed to airy, vocal pads.

In any case, you need to run those 3 (or 5) filters in parallel to one another, not in a serial chain. The only hardwired synth that I know of that has 3 filters in parallel is a Synton Syrinx.

Try a number of waveforms, including mixing in some noise - there are many colors to be found that are not natural, but beautiful.
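A rough way to see why the formant filters go in parallel rather than in series is to compare the combined gain of the two routings at each band center. The sketch below assumes a textbook biquad bandpass (the widely used RBJ audio-EQ-cookbook form with 0 dB peak gain) and three made-up bands; the band values, sample rate, and function names are illustrative assumptions, not settings from the chart.

```python
import cmath
import math

SAMPLE_RATE = 44100.0  # assumed sample rate

# Hypothetical formant bands: (center Hz, bandwidth Hz)
BANDS = [(500.0, 80.0), (1500.0, 100.0), (2500.0, 120.0)]


def bandpass_response(freq_hz, center_hz, bandwidth_hz):
    """Complex gain at freq_hz of a biquad bandpass with 0 dB peak gain."""
    q = center_hz / bandwidth_hz
    w0 = 2.0 * math.pi * center_hz / SAMPLE_RATE
    alpha = math.sin(w0) / (2.0 * q)
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / SAMPLE_RATE)  # z^-1
    z2 = z1 * z1                                                 # z^-2
    numerator = alpha - alpha * z2
    denominator = (1.0 + alpha) - 2.0 * math.cos(w0) * z1 + (1.0 - alpha) * z2
    return numerator / denominator


def parallel_gain(freq_hz):
    """Filters side by side on the same input: the responses add."""
    return abs(sum(bandpass_response(freq_hz, c, bw) for c, bw in BANDS))


def serial_gain(freq_hz):
    """Filters chained one into the next: the responses multiply."""
    gain = 1.0
    for c, bw in BANDS:
        gain *= abs(bandpass_response(freq_hz, c, bw))
    return gain


def to_db(gain):
    return 20.0 * math.log10(gain)


for center, _ in BANDS:
    print(f"{center:6.0f} Hz  parallel {to_db(parallel_gain(center)):7.1f} dB"
          f"  serial {to_db(serial_gain(center)):7.1f} dB")
```

In this sketch the parallel routing keeps each band center close to 0 dB, while the serial chain attenuates every center heavily, because each band is rejected by the other two filters.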

cheers,

Ian

aeon · December 20, 2008

Oh - you could of course use a physical model of the vocal tract!

cheers,

Ian

Megakazbek · December 21, 2008

The only hardwired synth that I know of that has 3 filters in parallel is a Synton Syrinx.

Many synths are multitimbral or allow several simultaneous synthesis "layers" at once (Virus, Nord Lead, Radias, almost all workstations/romplers, etc). So, getting multiple parallel filters is not a big problem with many common synths.

idiotboy · December 21, 2008

Off the top of my head, four synths that propose to do "speech" synthesis:

1. Yamaha FS1r

2. Monomachine

3. Flame MIDI Talking Synth

4. Synton Syrinx

Of these, which actually create vocal sounds using formant synthesis? Dunno. Maybe only the FS1r?

dereksljuka · December 21, 2008

Many synths are multitimbral or allow several simultaneous synthesis "layers" at once (Virus, Nord Lead, Radias, almost all workstations/romplers, etc). So, getting multiple parallel filters is not a big problem with many common synths.

But how many synths let you tweak the width of the pass band? You can sort of narrow it by turning up the resonance but that's not the same thing.

ElectricPuppy · December 21, 2008

Awesome info, thanks Aeon!

Tusks · December 21, 2008

I love the detail in this chart.

Thanks Aeon.

The Nord modular synths have a pretty decent vocal filter pre-packaged. Some romplers (3-4 tones in parallel through BPFs) can create ahs to oohs. Formants are good for vowels.

The consonants are a different matter ... it's much harder to achieve plosives, sibilants, and affricates and to morph between them. In a modular, you could use noise generators with different filtering and envelopes. I haven't seen a synth do that in pre-packaged form other than the FS1R. I'd like to see someone address the consonants in an easy-to-use way so I can do pseudo beat boxing. (Yeah, a synth imitating a guy imitating a TR808!)

Jerry

moondad · December 21, 2008

But how many synths let you tweak the width of the pass band? You can sort of narrow it by turning up the resonance but that's not the same thing.

On the Virus, you can set a HP and LP in series to create an 'expensive' BP easy enough and tweak the width with the Cutoff 2 controller. The Cutoff Link feature keeps it easy to handle, too.
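The width of such a highpass-into-lowpass bandpass is roughly the gap between the two cutoffs. As an idealized sketch (assuming the two cutoffs behave like the band edges, which real filter slopes only approximate, and without modelling the Virus's actual parameter scaling), the cutoffs for a desired center and width in Hz can be worked out like this. `band_edges` is a hypothetical helper, and the 800 Hz / 80 Hz values are just an example.

```python
import math


def band_edges(center_hz, bandwidth_hz):
    """Return (highpass_cutoff, lowpass_cutoff) in Hz.

    The two cutoffs are bandwidth_hz apart and their geometric mean
    is center_hz, which is how band edges sit around a center on a
    logarithmic frequency axis.
    """
    low = (-bandwidth_hz + math.sqrt(bandwidth_hz ** 2 + 4.0 * center_hz ** 2)) / 2.0
    return low, low + bandwidth_hz


low, high = band_edges(800.0, 80.0)
print(f"highpass at {low:.1f} Hz, lowpass at {high:.1f} Hz")
```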

aeon · December 21, 2008

Three different examples of using vocal physical modelling on the Korg OASYS PCI:

http://www.korg.com/audio/synths/call_to_prayer_(vocal).mp3

http://www.korg.com/audio/synths/modwheel_evil_(vocal).mp3

http://www.hvsynthdesign.com/audio/sing-it01.mp3

cheers,

Ian

soundwave106 · December 21, 2008

Off the top of my head, four synths that propose to do "speech" synthesis:

1. Yamaha FS1r

2. Monomachine

3. Flame MIDI Talking Synth

4. Synton Syrinx

One of FAW Circle's filters is a "mouth filter"... "ah", "ee", "oh", "uh", and "eu" are possible. Is it a formant? I don't know...

asynchro_nous · December 21, 2008

This sort of thing is precisely why I want to see greatly expanded filter "polyphony" in synths.

Roland surely could have multiplied by several times the number of COSM blocks available as an improvement in the GT over the previous V-Synths, for example.

Sad Darwin · October 15, 2009

Awesome thread is awesome

Thorhead · October 15, 2009

You need to use at least 3 filters - 5 is kinda tweaky, but it does increase realism if you need to do true vocal sounds as opposed to airy, vocal pads.

In any case, you need to run those 3 (or 5) filters in parallel to one another, not in a serial chain. The only hardwired synth that I know of that has 3 filters in parallel is a Synton Syrinx.

Try a number of waveforms, including mixing in some noise - there are many colors to be found that are not natural, but beautiful.

cheers,

Ian

Do you mean you could do this with a rompler by using 5 (same) saws and a different filter for each?

Sad that with MIDI, filters only have a 0-127 range.

RichF · October 15, 2009

Off the top of my head, four synths that propose to do "speech" synthesis:

1. Yamaha FS1r

2. Monomachine

3. Flame MIDI Talking Synth

4. Synton Syrinx

Of these, which actually create vocal sounds using formant synthesis? Dunno. Maybe only the FS1r?

Korg's own formant synthesis is available on the EMX-1, RADIAS, R3, and microKORG XL. When combined with Modulation Sequences (with all above except microKORG XL), you can form evolving formant shapes... In the case of the RADIAS, there's one preloaded Program that actually says the word "RADIAS."

mildbill · October 16, 2009

Do you mean you could do this with a rompler by using 5 (same) saws and a different filter for each?

....

Nice try - but no.

The problem is that you need to apply the filters to the same input (oscillators), so that you're filtering different areas of the same waveform.

If you use separate oscillators with their own individual filter, whatever's filtered out of one waveform will still be heard by the other oscillators that don't have those frequencies filtered out.

If you have really good ears, you might be able to hear a very weak effect. But you'd really have to keep track of where each filter was in each respective waveform and where they're moving to, if you wanted to try for a specific effect.

The Hamburglar · October 16, 2009

Wow thanks for sharing.

uvacom-rotatt · November 4, 2009

Nice try - but no.

The problem is that you need to apply the filters to the same input (oscillators), so that you're filtering different areas of the same waveform.

If you use separate oscillators with their own individual filter, whatever's filtered out of one waveform will still be heard by the other oscillators that don't have those frequencies filtered out.

If you have really good ears, you might be able to hear a very weak effect. But you'd really have to keep track of where each filter was in each respective waveform and where they're moving to, if you wanted to try for a specific effect.

Actually, since the filtering happens in parallel, there is no fundamental difference between five filters on one oscillator and five filters on five oscillators, provided those oscillators generate the same waveform and are synced in both pitch and phase (which is hard to avoid on most PCM synths!). This follows from a basic property of LTI (linear time-invariant) systems.

And a cool side effect of this is that since the average digital filter in a pcm synth is LTI, you can build a sound using one pcm voice layered several times with different filter settings, sample it, change the waveform, sample it again, and mix the two - the result is exactly the same as if those two waveforms were mixed pre-filtering. Although tedious, that can be a powerful method for building complex sounds from synths with simple architectures.
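The superposition property described here can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a simple one-pole lowpass as a stand-in for a synth's digital filter (real synth filters may add saturation or other nonlinearities that break the equality), and the waveforms, period, and filter settings are made up.

```python
def one_pole_lowpass(samples, coeff):
    """y[n] = (1 - coeff) * x[n] + coeff * y[n-1], a linear time-invariant filter."""
    out, state = [], 0.0
    for x in samples:
        state = (1.0 - coeff) * x + coeff * state
        out.append(state)
    return out


def saw(num_samples, period):
    return [2.0 * ((i % period) / period) - 1.0 for i in range(num_samples)]


def square(num_samples, period):
    return [1.0 if (i % period) < period / 2 else -1.0 for i in range(num_samples)]


FILTER_SETTINGS = [0.5, 0.9, 0.99]  # hypothetical per-layer filter settings


def layered(wave):
    """Same wave through each filter setting in parallel, outputs summed."""
    outputs = [one_pole_lowpass(wave, c) for c in FILTER_SETTINGS]
    return [sum(values) for values in zip(*outputs)]


NUM_SAMPLES, PERIOD = 512, 64
wave_a = saw(NUM_SAMPLES, PERIOD)
wave_b = square(NUM_SAMPLES, PERIOD)

# Route 1: layer and "sample" each waveform separately, then mix the results.
mix_after = [a + b for a, b in zip(layered(wave_a), layered(wave_b))]
# Route 2: mix the waveforms first, then run the mix through the same layers.
mix_before = layered([a + b for a, b in zip(wave_a, wave_b)])

max_diff = max(abs(x - y) for x, y in zip(mix_after, mix_before))
print("largest difference between the two routes:", max_diff)
```

The two routes agree up to floating-point rounding, which is the sense in which the result is "exactly the same" for an LTI filter.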

psionic11 · November 7, 2009

Routing an OSC thru filters in serial gives different results than routing the same (pitch/phase) OSC(s) in parallel, correct?

And I'm guessing the reason you would use filters in parallel, not serially, to re-create vocal formants is that a vocoder of X bands basically has multiple parallel filters acting in concert...

Although that would be simplifying it, wouldn't it? A vocoder's filters have their amplitude varied depending on the input (the speaking voice). To mimic a "talking" synth implies you are making the synth say more than just a single phoneme like "ah" or "oh."

Still, say I wanted to create a patch that sounds like a soprano "o" using the chart Aeon graciously posted. I guess I could use my Fusion's MIX mode to layer 5 different versions of that patch, each with its own (bandpass) filter settings @ 450, 800, 2830, 3800, and 4950, respectively. Would you then balance the volume outputs to match the chart's Amp(dB), using the 0 dB of patch/filter1 as a reference point?

But what's another way of stating the BW in Hz? Using resonance to simulate Q?

I think I've answered my own questions, will give it a try. Seems like any rompler or VA with a MIX mode where you can layer several instances of a patch/program together would be able to achieve the multiple parallel filter effect...
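On the question of restating bandwidth in Hz: for a bandpass filter, Q is conventionally the center frequency divided by the bandwidth, so the two can be converted back and forth. The sketch below shows only that relationship. How a given synth's resonance percentage maps to Q is instrument-specific and not covered here, and the 100 Hz bandwidth is a placeholder, not a value from the chart.

```python
def q_from_bandwidth(center_hz, bandwidth_hz):
    """Q of a bandpass with the given center frequency and bandwidth (both in Hz)."""
    return center_hz / bandwidth_hz


def bandwidth_from_q(center_hz, q):
    """Bandwidth in Hz of a bandpass with the given center frequency and Q."""
    return center_hz / q


ASSUMED_BANDWIDTH_HZ = 100.0  # placeholder bandwidth

# Center frequencies quoted in the post for the soprano "o".
for center in (450.0, 800.0, 2830.0, 3800.0, 4950.0):
    q = q_from_bandwidth(center, ASSUMED_BANDWIDTH_HZ)
    print(f"{center:6.0f} Hz center, {ASSUMED_BANDWIDTH_HZ:.0f} Hz wide -> Q = {q:.1f}")
```

One consequence is that a fixed Q (or a fixed resonance setting, if it behaves like Q) gives a band that gets wider in Hz as the cutoff rises, so holding a constant width in Hz means raising Q along with the center frequency.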

psionic11 · November 7, 2009

Wait a minute, I must have misunderstood what the Amp rows in the chart are supposed to indicate. Surely they don't refer to the volume output of that particular sound, at least not in dB!?

As I understand it, -6 dB is half as loud as the reference level, and -12 dB half again. Does that mean that -18 dB is one eighth as loud, and -24 dB equals 1/16th as loud as the original sound? If so, then values like -32, -40, -56 would be so imperceptible as to be insignificant. In other words, anything more than 30 dB down would be about one thirty-second as loud as the original, which is super quiet.
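For reference, the usual conversion between decibels and a linear amplitude ratio is ratio = 10^(dB/20), which puts -6 dB at roughly half amplitude. The short sketch below tabulates a few values. It deals with amplitude ratios only, not perceived loudness, and the function names are illustrative.

```python
import math


def db_to_amplitude(db):
    """Linear amplitude ratio for a level in dB (0 dB -> 1.0)."""
    return 10.0 ** (db / 20.0)


def amplitude_to_db(ratio):
    """Level in dB for a linear amplitude ratio (1.0 -> 0 dB)."""
    return 20.0 * math.log10(ratio)


for db in (0, -6, -12, -18, -24, -30, -50):
    print(f"{db:4d} dB -> {db_to_amplitude(db) * 100.0:8.3f} % of reference amplitude")
```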

Furthermore, since the values for the filters are all negative, wouldn't it be more appropriate to use bandcut rather than bandboost? Thanks for any input.

psionic11 · November 7, 2009

Got sidetracked and ended up making some growling leads, but I did give this a quick shot.

First off, bandpass gives a more convincing female 'o' sound to an otherwise normal sawtooth. Bandcut makes the sound buzzy.

I created 5 equal patches with a basic sawtooth, the only difference between patches being its cutoff frequency. Throwing them together in a layer, I then adjusted relative volumes of each. I gave the 'fundamental' frequency, f1 in the charts, full volume, and each subsequent patch reduced volume:

0 dB as full volume,

-6 dB as a 50% cut in volume,

-12 dB as half of that (25%),

-18 dB as half again (12.5%).

So, for the soprano 'o' I used 450, 800, 2830, 3800, and 4950 Hz for the cutoff frequencies. Since the chart shows 0, -11, -22, -22, -50, which in multiples of 6dB as above is roughly 0, -12, -24, -24, -48, I therefore set up relative volumes @ 100%, 50%, 25%, 25%, and 13%.
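The layering experiment can also be sketched offline. The example below runs one sawtooth through five parallel bandpass filters at the quoted center frequencies and scales each band by its quoted level using the 10^(dB/20) amplitude convention, under which 0, -11, -22, -22, and -50 dB come out near 100%, 28%, 8%, 8%, and 0.3%. It is a minimal sketch with several assumptions: the biquad is the common RBJ-cookbook bandpass, the single 100 Hz bandwidth is a placeholder because the chart's bandwidth values are not quoted in this thread, and the 220 Hz pitch, sample rate, and names are arbitrary.

```python
import math

SAMPLE_RATE = 44100            # assumed sample rate
ASSUMED_BANDWIDTH_HZ = 100.0   # placeholder; not taken from the chart

# (center Hz, level dB) for the soprano "o" as quoted in the post
FORMANTS = [(450.0, 0.0), (800.0, -11.0), (2830.0, -22.0),
            (3800.0, -22.0), (4950.0, -50.0)]


def sawtooth(freq_hz, num_samples):
    """Naive (not band-limited) sawtooth in the range [-1, 1)."""
    return [2.0 * ((i * freq_hz / SAMPLE_RATE) % 1.0) - 1.0
            for i in range(num_samples)]


def bandpass(samples, center_hz, bandwidth_hz):
    """Biquad bandpass with 0 dB peak gain at center_hz."""
    w0 = 2.0 * math.pi * center_hz / SAMPLE_RATE
    alpha = math.sin(w0) * bandwidth_hz / (2.0 * center_hz)  # sin(w0) / (2 * Q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0          # b1 is zero for this filter
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def formant_voice(freq_hz, num_samples):
    """One source, five parallel bands, each scaled by its level, then summed."""
    source = sawtooth(freq_hz, num_samples)
    mix = [0.0] * num_samples
    for center_hz, level_db in FORMANTS:
        gain = 10.0 ** (level_db / 20.0)
        band = bandpass(source, center_hz, ASSUMED_BANDWIDTH_HZ)
        mix = [m + gain * b for m, b in zip(mix, band)]
    return mix


voice = formant_voice(220.0, SAMPLE_RATE // 10)  # 0.1 s of audio
print("peak level of the mix:", max(abs(v) for v in voice))
```

Writing `voice` to a WAV file and swapping in other rows of the chart would be the natural next step for comparing vowels by ear.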

The result definitely sounds like a female voice, but I can't convincingly say it is an 'o' being sung. Switching on/off individual patches in the mix this way reminded me of both an additive synthesis style along with flipping those switches on a church organ (toggling 16'/8'/4' pipes). They gave interesting sounds in various mixtures, but only all together did it sound most female.

It took more time to write this post than set up 2 MIXES using the above guidelines -- one MIX for bandpass, the other for bandcut, for comparison. I'm guessing strong improvements could be made with fine-tuning the relative volume levels, and also figuring out a way to simulate exact widths in Hz -- bandpass resonance at 100% = what?, narrowest Q?, but how many Hz wide is that?, maybe 50 Hz, depending on the cutoff frequency itself...?

Then the next challenge would be to modify those relative volume levels, and also change the base cutoff frequencies, in real time using controllers like aftertouch, velocity, foot pedal, and mod wheel or knob. I'm sure that if I stared at the chart long enough, until my eyes glazed over, I could begin to see a relationship between the "a" and the "o", and somehow map the controllers to morph from that "a" by having them all gradually reach the final parameters (filter cutoff, volume, Q/resonance) of the "o".

A lot of work when a vocoder is the best tool for this job. Still, it was because of this post that I finally learned that my Fusion's Vocal Formant 3 filter is an idealization of the human vocal tract, and that making a sound where aftertouch affects cutoff gives a unique, human-like wah wah, and hence the distraction into creating some very aggressive growling leads...

ChristianRock · November 7, 2009

Hey psionic! Welcome to KSS

It's good to see you around these parts.
