After reading a few posts about various people's podcasting setups, I thought it was time to write a decent description of my podcasting setup and approach to producing a podcast from an audio standpoint.
Now, my setup is probably way more than most people need, but I'm a bit of an audio geek, and I write and record music with my setup also. Having said that, the components that I use to actually podcast are fairly cheap and can be acquired very easily.
You need a computer
First and foremost, you're going to need a computer. Mine is a homebuilt 3.4 GHz Pentium 4 with 1 GB of RAM running Windows XP. The only special part of this machine as far as audio goes is the sound card. It's an E-MU 1820. This card allows low-latency recording (via ASIO) and has built-in DSP effects as well as a built-in virtual mixer allowing you to route audio all over the place in many configurations. More on that part in a while.
To actually record and mix, I use Steinberg's wonderful Cubase SX 3.0. This is a full recording solution that provides all the features you could possibly want, including great effect plugins, audio sample editing, looping, midi, etc... But for the purpose of producing a podcast it's there to record the audio, process it, and mix it with other content such as intro and outro music, promos, music played during the podcast and Skype calls.
Getting the audio into Cubase
I use a Rode NT1-A condenser microphone plugged into the front of the E-MU 1820's breakout box (note that the 1820 is connected to its PCI card in the PC via a single Ethernet-style cable - everything is digital between the PC and the breakout box). The 1820 provides phantom power to the microphone and has a great-sounding pre-amp.
Now, let's look at the 1820's virtual mixer:
If you look at the first channel in the strip, you'll see that it represents the input for the microphone and that it has two "inserts". These inserts could be effects or special "sends" that create an audio channel back to the PC.
The first send routes the microphone's audio to "Wave L/R - Host". This means that normal Windows applications that want to see a standard microphone input get the audio from my microphone. I use this to send audio to Skype.
The next insert is the key one for recording.
It creates an ASIO audio channel, sending the audio to ASIO channels 1/2 (i.e. left and right) in the PC and returns the audio back to the 1820 from the PC. What this means is that Cubase can pick up the microphone's audio and by activating "Direct Monitoring", the 1820 allows me to listen to my voice in my headphones as I record it with zero latency. This is important as I hate it when the sound of my voice in my ears is lagging behind speech.
So basically, I have dry, effectless audio heading into the PC and the 1820 routes a copy of it to its outputs so I can listen to it with no latency. The last bit is that I like a bit of vocal juice on the headphones, even though it's not being recorded - it just "sounds better" than dry audio in your ears when recording.
So, if you look at the microphone's channel strip again, you'll see that the "aux 1" knob is turned up a bit. This means that it's sending a copy of the audio at a low level to the aux 1 bus.
Now, over on the right side of the virtual mixer in the "mastering section", you'll see a stereo reverb setup on the aux 1 bus. This reverb effect runs on the 1820, and so has no performance impact on the host PC.
So now I've got my dry audio from the microphone channel mixed with a little reverb, and it sounds good in my ears.
The rest of the 1820's I/O
Channel strip 2 is basically the same as channel 1, for a second microphone when I have a local guest on the show. The audio is sent to ASIO 3/4 so I can record it and treat it separately from my microphone.
Channel strip 3 is not really used for podcasting. One of the output busses from my physical mixer is connected to an input on the back of the 1820. This audio is sent to another ASIO channel to be recorded in Cubase. I mainly use it to record outboard equipment such as synthesizers.
Channel strip 4 is the output from Cubase. The main Cubase output is sent to ASIO channel 31/32 which the 1820 captures. It is sent to one of the outputs at the back of the 1820 and then fed into the physical mixer.
Channel strip 5 is the main Windows audio output. I.e. all normal Windows audio such as beeps, alerts, Skype, iTunes, etc... The clever thing I do here is to route a copy of that audio back to the PC via an ASIO channel so that I can record it. This way I can record the Skype calls and have my voice and the other party's voice on separate tracks in Cubase. Note that there's a second send on that channel that routes the audio to a physical output on the back of the 1820 and from there into the physical mixer.
That's it, really. You don't actually need a physical mixer as the 1820 has a headphone output and a main mix output, so you can use headphones and connect your normal speakers directly to it.
Now, on my physical mixer, the output from Cubase (via the 1820) appears on strip 11/12 and the main PC output appears on 13/14. The total mix from the 1820 appears on the tape input. This means that I can monitor mixes of all of these via physical sliders and send them to different physical audio outputs (i.e. headphones and speakers). I won't go into all this - it's a normal mixer, but the interesting thing I do is to send the control room output from the mixer to the headphone distribution amplifier in the rack.
This has four headphone outputs. I use one for my normal headphones when recording and mixing, another for iPod earbuds as a sanity check after mixing, and the spare outputs are for when I have a local guest on the show.
The "Main Mix" output from the mixer is sent via the BBE Sonic Maximizer (just for when I listen to CDs, it is bypassed when doing production) to the Alesis Power Amplifier and finally to my main monitors - a pair of highly regarded, out of production Yamaha NS-10M Studio near field monitors.
Note that this is all overkill.
Back to actually recording
On track two you'll see my vocals, which I'll just duplicate and change the input to Mike B if I have a local guest. Track one contains the intro and outro music - you can also see some volume automation going on. This raises and lowers the volume of the intro/outro while I'm speaking and performs the fade in and fade out at the end of them.
Track three is where I place all other elements such as music, promos, etc...
If I'm doing a Skype call, or import some other element that requires some processing, I'll just create another track for it.
Now that I've got all the elements together, chopped them up and put them in the right place, it's time for some processing on the vocals.
Basically I have four "inserts" running on my vocals. Inserts are serial in nature - the audio is run through the first one, the output of which is run through the second one, etc...
First up is a gate.
This effectively stops audio from passing through if the volume level is too low (I have it set to -40 dB) and is great for removing background ambient noise during the silent bits when your microphone is very sensitive (e.g. a condenser microphone). Be wary though of using a gate when the level of ambient noise is quite high relative to your speaking voice. In this scenario, the gate will cut out the ambient noise during silent bits, but you'll still hear it while you're speaking. This can sound very odd, with the noise popping in and out as the gate opens and closes.
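To make the gate idea concrete, here's a minimal Python/NumPy sketch. This is not how the plugin actually works - a real gate tracks a smoothed envelope and applies attack/release ramps - but the core of it really is just a threshold:

```python
import numpy as np

def noise_gate(samples: np.ndarray, threshold_db: float = -40.0) -> np.ndarray:
    """Mute samples whose level falls below the threshold.

    Toy version: a real gate follows a smoothed envelope with
    attack/release times instead of hard-muting individual samples.
    """
    threshold = 10 ** (threshold_db / 20.0)   # dB -> linear amplitude (-40 dB = 0.01)
    gated = samples.copy()
    gated[np.abs(gated) < threshold] = 0.0    # silence anything below the threshold
    return gated

# Quiet hiss (around -60 dB) is muted; speech-level samples pass through untouched.
signal = np.array([0.001, -0.0005, 0.5, -0.3, 0.0002])
gated = noise_gate(signal)
```

The dB-to-linear conversion is the only slightly non-obvious bit: -40 dB corresponds to an amplitude of 0.01 on a 0-to-1 scale.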
Next we have the DeEsser.
This reduces the sibilants in your speech (the harsh "s" and "t" sounds) - microphones pick these up really well and you'll want to drop them a bit.
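A real de-esser works dynamically - it's essentially a compressor keyed to the sibilant band - but a crude static frequency-domain cut shows the basic idea. The 4-9 kHz band and the -6 dB reduction below are just illustrative assumptions, not settings from my plugin:

```python
import numpy as np

def deess(frame: np.ndarray, sr: int, lo=4000, hi=9000, reduction_db=-6.0) -> np.ndarray:
    """Crudely attenuate the sibilance band of one audio frame.

    Static illustration only: real de-essers reduce the band
    dynamically, and only when sibilant energy is actually present.
    """
    spectrum = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    band = (freqs >= lo) & (freqs <= hi)
    spectrum[band] *= 10 ** (reduction_db / 20.0)   # pull the "s" band down 6 dB
    return np.fft.irfft(spectrum, n=len(frame))

# A 6 kHz tone (inside the band) comes back quieter; a 200 Hz tone is untouched.
sr = 48000
t = np.arange(sr // 10) / sr
sibilant = deess(np.sin(2 * np.pi * 6000 * t), sr)
voice_body = deess(np.sin(2 * np.pi * 200 * t), sr)
```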
Third is the compressor.
This levels out the overall volume level of your speech and depending on the compressor will inject some wonderful warmth into the audio. The aim here is to get a nice comfortable signal, with a consistent volume level as close to 0 dB as possible without clipping. You don't want to compress it "flat" either. This is a "feel" thing.
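The maths behind compression is simple enough to sketch. Here's a toy static gain curve in Python (the threshold and ratio are made-up example settings, and there's no attack/release smoothing or make-up gain, which any real compressor - including the 1176 - would have):

```python
import numpy as np

def compress(samples: np.ndarray, threshold_db=-18.0, ratio=4.0) -> np.ndarray:
    """Static compression curve, applied per sample.

    Every dB above the threshold comes out as 1/ratio dB above it;
    anything below the threshold passes through unchanged.
    """
    eps = 1e-12                                         # avoid log10(0)
    level_db = 20 * np.log10(np.abs(samples) + eps)
    over_db = np.maximum(level_db - threshold_db, 0.0)  # dB above the threshold
    gain_db = -over_db * (1.0 - 1.0 / ratio)            # how much to pull it down
    return samples * 10 ** (gain_db / 20.0)

# A full-scale peak gets pulled down hard; quiet material is left alone.
loud = compress(np.array([1.0]))    # 0 dB in, -13.5 dB out at 4:1
quiet = compress(np.array([0.05]))  # -26 dB in, below threshold, unchanged
```

At a 4:1 ratio, the 18 dB by which a full-scale peak exceeds the -18 dB threshold is squeezed down to 4.5 dB - which is exactly the "levelling out" described above.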
There are many types of compressor, but they basically do the same thing. I use the software version of the classic 1176 compressor. For more information about compression than I can ever give you here, check out the articles at Tweakheadz and Sound On Sound for great introductions to the subject.
Finally, we have EQ.
I don't like to EQ very much, but generally add a little bit of "air" around 14 kHz. I might roll off the low end with a gentle high-pass at around 60 Hz depending on what my voice sounds like on the day. Remember to EQ after compression. You don't want to boost some frequencies only to have the compressor smack'em back down again. Please note that the compressor section of this plugin is bypassed.
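For the curious, a 60 Hz high-pass can be sketched with a simple one-pole RC filter. This rolls off at only ~6 dB/octave - much gentler than a studio EQ's filters - but it shows what "carving off the low end" does to rumble versus voice:

```python
import numpy as np

def highpass(samples: np.ndarray, sr: int, cutoff_hz=60.0) -> np.ndarray:
    """One-pole high-pass filter (~6 dB/octave roll-off below the cutoff).

    Far simpler than a real EQ's high-pass, but enough to show the idea:
    sub-bass rumble is attenuated while the voice passes through.
    """
    rc = 1.0 / (2 * np.pi * cutoff_hz)
    dt = 1.0 / sr
    alpha = rc / (rc + dt)
    out = np.empty_like(samples)
    out[0] = samples[0]
    for i in range(1, len(samples)):
        out[i] = alpha * (out[i - 1] + samples[i] - samples[i - 1])
    return out

# 20 Hz rumble is knocked down to roughly a third of its level;
# a 1 kHz tone (squarely in the voice range) is barely touched.
sr = 8000
t = np.arange(sr) / sr
rumble = highpass(np.sin(2 * np.pi * 20 * t), sr)
voice = highpass(np.sin(2 * np.pi * 1000 * t), sr)
```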
So that's the dry audio processed. To juice it up a bit, I have a send running to an aux bus with a little subtle reverb. I really don't like hearing non-reverbed speech, so I add just a touch of reverb which is hardly noticeable and it really warms up the sound so that it feels like I'm in a real place - not a cardboard box.
Just don't go overboard and try to make me think you're podcasting from a Cathedral...
On the output bus of Cubase I add in a mastering effect. This effect (really a combination of effects - EQ, compression, maximizing, spatializing, etc...) applies to the whole mix.
Here I mainly add a little spatialization and maximizing. The Maximizer aims to raise the volume level to a target, typically just short of 0 dB. I use -0.2 dB. The rest of the effects just add a little audio secret sauce.
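The simplest part of what a maximizer does is peak normalization, which is easy to sketch. Note this is only half the story - a real maximizer (like the one in Ozone) also limits, so it can push the average loudness up without letting any peak past the ceiling:

```python
import numpy as np

def normalize_to_ceiling(mix: np.ndarray, ceiling_db=-0.2) -> np.ndarray:
    """Scale the whole mix so its loudest peak lands at the ceiling.

    Plain peak normalization; a true maximizer also applies limiting
    to raise average loudness, not just the peak.
    """
    ceiling = 10 ** (ceiling_db / 20.0)   # -0.2 dBFS is about 0.977 linear
    peak = np.max(np.abs(mix))
    if peak == 0.0:                       # silent input: nothing to scale
        return mix
    return mix * (ceiling / peak)

# A quiet mix is brought up so its biggest peak sits at -0.2 dBFS.
mix = normalize_to_ceiling(np.array([0.1, -0.5, 0.3]))
```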
Occasionally I'll compress the main mix too, but usually only if I have a guest on the show, which affects the overall level of the audio. When mixing real music, I will definitely apply some compression to the whole mix.
Just for grins, here are a set of audio files that take us through the differences when adding the effects:
- Dry audio - no effects
- ...add gate
- ...add de-esser
- ...add compressor
- ...add EQ
- ...add reverb
- ...add mastering effects
This Cubase configuration is saved as a template and used as the basis for every show. I'll still tweak the effects every time, as my voice changes every day depending on how I've been abusing my throat, lungs and vocal cords.
Anyhow, it's fun to tweak stuff. For me, the recording and tweaking is half the fun of putting together a podcast.
A word about plugins
Cubase as standard comes with a fine selection of software plugins - EQs, compressors, reverbs, etc... There are also many companies that sell other plugins, and up until very recently the only third-party plugin I used was iZotope's Ozone.
Plugins are a balancing act though, as they take away precious CPU resources. If you add too many on too many tracks, the audio clicks, pops, slows down and generally becomes unusable.
There are many ways of getting around this such as freezing tracks that you're not editing "right now", or just switching some of them off for a bit. And that is exactly what I've been doing up until very recently. This has mainly been a problem when working on a real music project with many tracks (and running things like Reason in the background) and has rarely been a problem with podcasting.
My solution was to purchase a UAD-1 from Universal Audio. This is a PCI card that hosts the processing of plugins. The only part that lives on the host CPU is the UI. This works great for me and has reduced the plugin overhead to almost zero. The UAD-1 comes with a bunch of really good software versions of classic audio hardware such as the 1176 compressor. Highly recommended, but not really needed for your regular podcasting setup.
My setup is way overboard.
A cut-down version of my setup with just Cubase, the E-MU 1820 and the Rode NT1-A microphone would work perfectly and give you great results. The rest of my gear (i.e. the mixer, headphone amp, power amp, monitors, etc...) is really for convenience and because I'm an audio gear head.
Just remember to listen - both to your own podcast and other spoken word audio that you admire and then perfect your own sound.
Oh, and have fun!