After reading a few posts about various people's podcasting setups,
I thought it was time to write a decent description of my podcasting
setup and approach to producing a podcast from an audio standpoint.
Now, my setup is probably way more than most people need, but I'm a
bit of an audio geek, and I also write and record music with it.
Having said that, the components that I use to actually podcast
are fairly cheap and can be acquired very easily.
You need a computer
First and foremost, you're going to need a computer. Mine is a
homebuilt 3.4 GHz Pentium 4 with 1 GB of RAM running Windows XP. The
only special part of this machine as far as audio goes is the sound
card. It's an E-MU 1820. This card allows low latency recording (via
ASIO) and has built-in DSP effects as well as a built-in virtual mixer
allowing you to route audio all over the place in many
configurations. More on that part in a while.
To actually record and mix, I use Steinberg's wonderful Cubase SX
3.0. This is a full recording solution that provides all the features
you could possibly want, including great effect plugins, audio sample
editing, looping, MIDI, etc... But for the purpose of producing a
podcast it's there to record the audio, process it, and mix it with
other content such as intro and outro music, promos, music played
during the podcast and Skype calls.
Getting the audio into Cubase
I use a Rode NT1-A condenser microphone plugged into the front of the
E-MU 1820's breakout box (note that the 1820 is connected to its PCI
card in the PC via a single ethernet style cable - everything is
digital between the PC and the breakout box). The 1820 provides
phantom power to the microphone and has a great sounding pre-amp.
Now, let's look at the 1820's virtual mixer:
If you look at the first channel in the strip, you'll see that it
represents the input for the microphone and that it has two
"inserts". These inserts could be effects or special "sends" that
create an audio channel back to the PC.
The first send routes the audio from the microphone to "Wave L/R -
Host". This means that normal Windows applications that
expect a normal microphone input get the audio from my
microphone. I use this to send audio to Skype.
The next insert is the key one for recording.
It creates an ASIO audio channel, sending the audio to ASIO channels
1/2 (i.e. left and right) in the PC and returns the audio back to the
1820 from the PC. What this means is that Cubase can pick up the
microphone's audio and by activating "Direct Monitoring", the 1820
allows me to listen to my voice in my headphones as I record it
with zero latency. This is important as I hate it when the
sound of my voice in my ears lags behind my speech.
So basically, I have dry, effectless audio heading into the PC and the
1820 routes a copy of it to its outputs so I can listen to it with no
latency. The last bit is that I like a bit of vocal juice on the
headphones, even though it's not being recorded - it just "sounds
better" than dry audio in your ears when recording.
So, if you look at the microphone's channel strip again, you'll see
that the "aux 1" knob is turned up a bit. This means that it's
sending a copy of the audio at a low level to the aux 1 bus.
Now, over on the right side of the virtual mixer in the "mastering
section", you'll see a stereo reverb setup on the aux 1 bus. This reverb effect runs on the 1820, and so has no performance impact on the host PC.
So now I've got my dry audio from the microphone channel mixed with a
little reverb, and it sounds good in my ears.
The rest of the 1820's I/O
Channel strip 2 is basically the same as channel 1, for a second
microphone when I have a local guest on the show. The audio is sent
to ASIO 3/4 so I can record it and treat it separately from my
Channel strip 3 is not really used for podcasting. One of the output
busses from my physical mixer is connected to an input on the back of
the 1820. This audio is sent to another ASIO channel to be recorded
in Cubase. I mainly use it to record outboard equipment such as
Channel strip 4 is the output from Cubase. The main Cubase output is
sent to ASIO channel 31/32 which the 1820 captures. It is sent to one
of the outputs at the back of the 1820 and then fed into the physical
Channel strip 5 is the main Windows audio output. I.e. all normal
Windows audio such as beeps, alerts, Skype, iTunes, etc... The
clever thing I do here is to route a copy of that audio back
to the PC via an ASIO channel so that I can record it. This way I can
record the Skype calls and have my voice and the other party's voice
on separate tracks in Cubase. Note that there's a second send on that
channel that routes the audio to a physical output on the back of the
1820 and from there into the physical mixer.
That's it really. You don't really need a physical mixer as the 1820
has a headphone output and a main mix output, so you can use
headphones and connect up your normal speakers to it.
Now, on my physical mixer, the output from Cubase (via the 1820)
appears on strip 11/12 and the main PC output appears on 13/14. The
total mix from the 1820 appears on the tape input. This means that I can
monitor mixes of all of these via physical sliders and send them to
different physical audio outputs (i.e. headphones and speakers). I
won't go into all this - it's a normal mixer, but the interesting
thing I do is to send the control room output from the mixer to the
headphone distribution amplifier in the rack.
This has four headphone outputs. I use one for my normal headphones
when recording and mixing, another has iPod earbuds plugged into
it for a sanity check after mixing, and there are spare outputs for
when I have a local guest on the show.
The "Main Mix" output from the mixer is sent via the BBE Sonic
Maximizer (just for when I listen to CDs; it is bypassed when doing
production) to the Alesis Power Amplifier and finally to my main
monitors - a pair of highly regarded, out of production Yamaha NS-10M
Studio near field monitors.
Note that this is all overkill.
Back to actually recording
On track two you'll see my vocals, which I'll just duplicate,
changing the input to Mic B, if I have a local guest. Track one
contains the intro and outro music - you can also see some volume
automation going on. This raises and lowers the volume of the
intro/outro while I'm speaking and performs the fade in and fade out
at the end of them.
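For anyone who thinks better in code, volume automation like this boils down to multiplying the audio by a gain curve that changes over time. Here is a minimal Python sketch of that idea, assuming a piecewise-linear curve. The function names and the `points` format are made up for illustration and have nothing to do with how Cubase stores its automation.

```python
def gain_at(i, points):
    """Return the gain at sample index i.

    points is a list of (sample_index, gain) pairs with strictly
    increasing indices (an assumption of this sketch).
    """
    if i <= points[0][0]:
        return points[0][1]
    if i >= points[-1][0]:
        return points[-1][1]
    for (i0, g0), (i1, g1) in zip(points, points[1:]):
        if i0 <= i <= i1:
            t = (i - i0) / (i1 - i0)      # position within this segment
            return g0 + t * (g1 - g0)     # straight line between the points


def apply_gain_envelope(samples, points):
    """Scale every sample by the gain curve described by points."""
    return [s * gain_at(i, points) for i, s in enumerate(samples)]


# Fade in over the first 10 samples of an 11-sample clip.
faded = apply_gain_envelope([1.0] * 11, [(0, 0.0), (10, 1.0)])
```

Dropping the music under speech is the same trick with more points: full level, a ramp down to a low gain, hold, then a ramp back up.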
Track three is where I place all other elements such as music, promos,
If I'm doing a Skype call, or import some other element that requires
some processing, I'll just create another track for it.
Now that I've got all the elements together, chopped them up and put
them in the right place, it's time for some processing on the vocals.
Basically I have four "inserts" running on my vocals. Inserts are
serial in nature - the audio is run through the first one, the output
of which is run through the second one, etc...
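The serial nature of inserts is easy to show in a few lines of Python. This is only a sketch of the concept, with hypothetical `gain` and `clip` effects standing in for real plugins, but it shows why the order of the chain matters.

```python
def run_inserts(samples, inserts):
    """Run audio through each insert in turn, feeding each the previous output."""
    for effect in inserts:
        samples = effect(samples)
    return samples


def gain(factor):
    """Hypothetical effect: multiply every sample by factor."""
    return lambda samples: [s * factor for s in samples]


def clip(limit):
    """Hypothetical effect: hard-limit every sample to +/- limit."""
    return lambda samples: [max(-limit, min(limit, s)) for s in samples]


boost_then_clip = run_inserts([0.8], [gain(2.0), clip(1.0)])
clip_then_boost = run_inserts([0.8], [clip(1.0), gain(2.0)])
```

The same two effects give different results depending on which one comes first, and that is the whole reason the order of the four inserts below is worth thinking about.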
First up is a gate.
This effectively stops audio from passing through
if the volume level is too low (I have it set to -40 dB) and is great
for removing background ambient noise during the silent bits when your
microphone is very sensitive (e.g. a condenser microphone). Be wary
though of using a gate when the level of ambient noise is quite high
relative to your speaking voice. In this scenario, the gate
will cut out the ambient noise during silent bits, but you'll still
hear it when you're speaking. This can sound very odd, especially when
the gate is cutting in during a bit of silence.
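To make the gate idea concrete, here is a deliberately crude Python sketch. It assumes samples are floats between -1.0 and 1.0, works on fixed blocks, and has no attack, hold or release, so it is nowhere near a real gate plugin. The -40 dB default matches my threshold; the block size and function names are made up for illustration.

```python
def db_to_linear(db):
    """Convert a level in dB (relative to full scale) to a linear amplitude."""
    return 10.0 ** (db / 20.0)


def gate(samples, threshold_db=-40.0, block_size=256):
    """Silence any block whose peak level is below the threshold."""
    threshold = db_to_linear(threshold_db)   # -40 dB is an amplitude of 0.01
    out = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        peak = max(abs(s) for s in block)
        if peak < threshold:
            out.extend([0.0] * len(block))   # gate closed: mute the block
        else:
            out.extend(block)                # gate open: pass it untouched
    return out
```

Notice that an open gate passes everything in the block, noise included. That is exactly why the background noise is still audible under your voice and then vanishes in the gaps.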
Next we have the DeEsser.
This reduces the sibilants in your
speech - microphones pick these up really well and you'll want to drop
them a bit.
Third is the compressor.
This levels out the overall volume level of
your speech and depending on the compressor will inject some wonderful
warmth into the audio. The aim here is to get a nice comfortable
signal, with a consistent volume level as close to 0 dB as possible
without clipping. You don't want to compress it "flat" either. This
is a "feel" thing.
There are many types of compressor, but they basically do the same
thing. I use the software version of the classic 1176 compressor. For
more information about compression than I can ever give you here,
check out the articles at Tweakheadz and Sound On Sound for great introductions to the subject.
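If you want a feel for the arithmetic behind a compressor, here is a bare-bones Python sketch of the gain curve only. It works one sample at a time with no attack or release, so it is not how an 1176 (or any real compressor) behaves, and the threshold, ratio and makeup values are hypothetical placeholders rather than my settings.

```python
import math


def compress_sample(x, threshold_db=-18.0, ratio=4.0, makeup_db=0.0):
    """Apply a static compression curve to one sample in the range -1.0..1.0."""
    if x == 0.0:
        return 0.0
    level_db = 20.0 * math.log10(abs(x))
    if level_db > threshold_db:
        # Above the threshold, every `ratio` dB in gives only 1 dB out.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    out = 10.0 ** ((level_db + makeup_db) / 20.0)
    return math.copysign(out, x)   # restore the original polarity


def compress(samples, **settings):
    """Run the static curve over a whole list of samples."""
    return [compress_sample(s, **settings) for s in samples]
```

With a -20 dB threshold and a 4:1 ratio, a full-scale peak at 0 dB comes out at -15 dB, while anything below -20 dB is left alone. Makeup gain then lifts the whole lot back up towards 0 dB.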
Finally, we have EQ.
I don't like to EQ very much, but generally add
a little bit of "air" around 14 kHz. I might carve the audio off with
a soft high-pass at around 60 Hz depending on what my voice sounds
like on the day. Remember to EQ after compression. You
don't want to boost some frequencies only to have the compressor
smack 'em back down again. Please note that the compressor section of this plugin is bypassed.
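As a rough illustration of what a soft high-pass does, here is a one-pole filter in Python. This is the textbook RC-style filter with a gentle 6 dB per octave slope, not the EQ plugin I actually use, and the 44100 Hz sample rate is an assumption.

```python
import math


def high_pass(samples, cutoff_hz=60.0, sample_rate=44100):
    """One-pole high-pass: rolls off content below cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        # Only changes in the input are passed on; a steady offset decays away.
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out
```

Feed it a constant value (the ultimate low frequency) and the output dies away to nothing, while fast-moving audio passes through almost untouched.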
So that's the dry audio processed. To juice it up a bit, I have a
send running to an aux bus with a little subtle reverb. I really
don't like hearing non-reverbed speech, so I add just a touch
of reverb which is hardly noticeable and it really warms up the sound so
that it feels like I'm in a real place - not a cardboard box.
Just don't go overboard and try to make me think you're podcasting
from a cathedral...
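A send differs from an insert in that the dry signal carries on untouched and a scaled copy is processed in parallel, then mixed back in. Here is a small Python sketch of that routing. The `feedback_delay` function is a hypothetical stand-in for a reverb (a real reverb is far more involved), and all the names and numbers are made up for illustration.

```python
def feedback_delay(samples, delay=2, feedback=0.5):
    """Wet-only echo: each repeat arrives `delay` samples later and quieter."""
    out = [0.0] * len(samples)
    for i in range(delay, len(samples)):
        out[i] = samples[i - delay] + feedback * out[i - delay]
    return out


def mix_with_send(dry, effect, send_level):
    """Send a scaled copy of the dry audio to an effect and add the result back."""
    wet = effect([s * send_level for s in dry])
    return [d + w for d, w in zip(dry, wet)]


# A single click, with a little of it sent to the echo.
mixed = mix_with_send([1.0, 0.0, 0.0, 0.0, 0.0], feedback_delay, send_level=0.2)
```

The click itself comes through at full level and only a quiet tail is added behind it. Turning `send_level` up is how you end up in the cathedral.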
On the output bus of Cubase I add in a mastering effect. This effect
(really a combination of effects - EQ, compression, maximizing,
spatializing, etc...) applies to the whole mix.
I use Izotope's fine Ozone mastering plugin for this.
Here I mainly add a little spatialization, and maximizing. The
Maximizer aims to raise the volume level to a target, typically just
short of 0 dB. I use -0.2 dB. The rest of the effects just add a
little audio secret sauce.
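The "target level" part can be sketched in Python as plain peak normalization: find the loudest sample and scale everything so it lands on the ceiling. To be clear, this is a much simpler technique than what a real maximizer like Ozone does (that is a limiter that also raises the average loudness); the sketch only shows what a -0.2 dB ceiling means. The function name is hypothetical.

```python
def normalize_peak(samples, ceiling_db=-0.2):
    """Scale the audio so its loudest sample sits exactly at ceiling_db."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)                 # silence stays silence
    ceiling = 10.0 ** (ceiling_db / 20.0)    # -0.2 dB is roughly 0.977
    gain = ceiling / peak
    return [s * gain for s in samples]
```

A ceiling of -0.2 dB leaves the loudest peak at about 97.7% of full scale, which is a little headroom before clipping.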
Occasionally I'll compress the main mix too, but usually only if
I have a guest on the show, which affects the overall level of the
audio. When mixing real music, I will definitely apply some
compression to the whole mix.
Just for grins, here is a set of audio files that take us through the differences when adding the effects:
This Cubase configuration is saved as a template and used as the basis
for every show. I'll still tweak the effects every time, as my voice
changes every day depending on how I've been abusing my throat, lungs
and vocal cords.
Anyhow, it's fun to tweak stuff. For me, the recording and tweaking
is half the fun of putting together a podcast.
A word about plugins
Cubase as standard comes with a fine selection of software plugins -
EQs, compressors, reverbs, etc... There are also many
companies that sell other plugins, and up until very recently the only
third party plugin I used was Izotope's Ozone.
Plugins are a balancing act though, as they take away precious CPU
resources. If you add too many on too many tracks, the audio clicks,
pops, slows down and generally becomes unusable.
There are many ways of getting around this such as freezing tracks
that you're not editing "right now", or just switching some of them
off for a bit. And that is exactly what I've been doing up until very
recently. This has mainly been a problem when working on a real music
project with many tracks (and running things like Reason in the
background) and has rarely been a problem with podcasting.
My solution was to purchase a UAD-1 from Universal Audio. This is a
PCI card that hosts the processing of plugins. The only part that
lives on the host CPU is the UI. This works great for me and has
reduced the plugin overhead to almost zero. The UAD-1 comes with a
bunch of really good software versions of classic audio hardware such
as the 1176 compressor. Highly recommended, but not really needed for
your regular podcasting setup.
My setup is way overboard.
A cut down version of my setup with just Cubase, the E-MU 1820 and the
Rode NT1-A microphone would work perfectly and give you great results.
The rest of my gear (i.e. the mixer, headphone amp, power amp,
monitors, etc...) are really for convenience and because I'm an audio
Just remember to listen - both to your own podcast and other
spoken word audio that you admire and then perfect your own
Oh, and have fun!