I was looking at the Antares website, at their microphone modeller, and it appears that this lets you not only define the characteristics of the modelled output microphone, but also remove the characteristics of the microphone you actually recorded with...
So that got me to thinking...We all go to great lengths to manage the space in which we record things like vocals in order not to let the surroundings unduly affect the sound...
But what if there were an "Anti-convolver"...effectively, would it be possible to make little or no effort to control the space these things are recorded in (except of course to make sure that the signal was noise free)...As an extreme example, one would then be able to record a vocal in a bathroom, get an impulse response of the room afterward, anti-deconvolve (err, is that even a word?...it is now!) the signal, and effectively have a pristine vocal with the room characteristics removed...
Of course I may well be behind the times and this may even exist (and be in general use for all I know), or it might be possible using some funky phase-reversing re-convolving who-knows-what...
Does anyone who understands the math of convolution think this could be done easily?
We have indeed tried something like you mentioned in a reverberant space.
We wanted to test whether we could create an anechoic experience in a huge cathedral, quite a reverberant space to begin with.
Initially, we had an anechoic recording playing out of the speakers, and we captured that onto the hard drive. The result, of course, was a very wet recording. Next, we captured the impulse response in the same setup with the same equipment under the same conditions, deconvolved the wet recording with the IR, and played back the deconvolved sound.
I was quite skeptical and thought that we would not get the anechoic experience, but at least it sounded much closer to what I originally expected. It was far from acceptable in quality, but impressive. (In between the audio it sounded less echoic, but the reverb tail at the end was still there, of course.) This raw method of deconvolving is only an approximation of a solution to the problem, for various reasons. For example, our ears 'hear' differently than the microphone capsules and are 'placed' differently (and almost everything else is different, too), but that is only one reason why simple deconvolution does not work perfectly.
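For anyone curious what that deconvolution step amounts to, here is a minimal sketch in Python/NumPy (just an illustration of the principle, not the tool we actually used; the signals and the regularization constant are made up): divide the wet recording's spectrum by the IR's spectrum, with a small regularizer so bins where the IR has almost no energy don't blow up into noise.

```python
import numpy as np

def deconvolve(wet, ir, eps=1e-8):
    """Regularized frequency-domain deconvolution: divide the wet
    signal's spectrum by the IR's spectrum. The eps term keeps bins
    where the IR has little energy from exploding into noise."""
    n = len(wet)
    W = np.fft.rfft(wet, n)
    H = np.fft.rfft(ir, n)
    X = W * np.conj(H) / (np.abs(H) ** 2 + eps)
    return np.fft.irfft(X, n)

# Round trip in the ideal, noiseless case: convolve a "dry" signal
# with an IR, then deconvolve; the dry signal comes back almost exactly.
rng = np.random.default_rng(0)
dry = rng.standard_normal(1000)
ir = np.exp(-np.linspace(0.0, 8.0, 200)) * rng.standard_normal(200)
wet = np.convolve(dry, ir)            # the "very wet recording"
recovered = deconvolve(wet, ir)[:len(dry)]
```

In the noiseless round trip this works almost perfectly; with a real room, real noise, and a slightly wrong IR, the division amplifies exactly the parts you know least about, which is why the result we got was impressive but not pristine.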
I am not sure that it would work at acceptably high fidelity even if we had the impulse response of the original recording equipment as placed in the room. Technically, I imagine that it would be quite hard to capture the 'right' impulse response for deconvolving the audio. Just imagine: you would need a speaker with the same characteristics as the sound source, which you would have to measure first, as you do not know it initially, and then you would have to design a speaker that reproduces the same sound radiation characteristics (if that is possible at all).
Maybe in the future, when we use real ears as microphones and replace the speakers with something more perfect, these things will work better, but the convolution technique itself might not be enough to generate high-quality results, as there are several non-linear effects involved, and convolution only works (as intended) in linear time-invariant (LTI) systems.
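The LTI caveat is easy to demonstrate with a toy simulation (a hypothetical sketch; the signals, the exponential "room" IR, and the clipping level are all invented for illustration): model the room as a convolution, then insert a mild nonlinearity such as hard clipping, as a driven speaker or preamp might, before deconvolving.

```python
import numpy as np

rng = np.random.default_rng(1)
dry = rng.standard_normal(2000)
ir = np.exp(-np.linspace(0.0, 6.0, 300))   # toy decaying "room" IR

wet = np.convolve(dry, ir)                 # perfectly linear room
wet_clipped = np.clip(wet, -2.0, 2.0)      # nonlinear stage in the chain

def deconvolve(y, ir, eps=1e-8):
    """Regularized spectral division: an exact inverse only for LTI systems."""
    n = len(y)
    H = np.fft.rfft(ir, n)
    X = np.fft.rfft(y, n) * np.conj(H) / (np.abs(H) ** 2 + eps)
    return np.fft.irfft(X, n)

err_linear = np.max(np.abs(deconvolve(wet, ir)[:len(dry)] - dry))
err_clipped = np.max(np.abs(deconvolve(wet_clipped, ir)[:len(dry)] - dry))
# err_linear is tiny; err_clipped stays large no matter how accurate the IR is,
# because clipping is not something a convolution (or its inverse) can model.
```

However perfect the measured IR, deconvolution can only undo the linear part of the chain; everything the nonlinearity did is simply outside its reach.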
Hey thanks Csaba - Interesting that someone has actually tried it!
I was wondering about the mechanism one would need to capture the impulse, and figured that a speaker set up in the same position relative to the mic as the vocalist would be the most obvious solution, but as you say, it all gets more complex from there.
Thanks for the reply though - I guess I'll put my plans for the anti-convolver back in the box for the moment ...and with Tascam beating me to the 'Gigaclean' name I guess I'm back to square one...
For voice (speech) dereverberation, there are some other methods besides capturing the impulse response of the room and deconvolving with it; in fact, there are a few methods that do not need the impulse response at all to remove the reverb (it looks like magic), but as far as I know, none of these methods can generate pristine-quality results yet. And that is still speech, not music.
This is an interesting idea. I'm a computer graphics guy, so I tend to look at this in terms of image processing. Adding reverb is a bit like blurring an image, and trying to anti-convolve it is a bit like applying a sharpening filter. In general, sharpening filters do not work that well, and they add boundary contrast artifacts. Information is lost in the blurring that cannot be retrieved.
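The "information is lost" point can be made concrete in one dimension. A short moving average (the 1-D analogue of a blur) has exact nulls in its frequency response; any component sitting on a null gets multiplied by zero, and no sharpening or deconvolution filter can divide by zero to bring it back. A small NumPy sketch (the kernel length and signal here are arbitrary choices):

```python
import numpy as np

n = 64
kernel = np.ones(4) / 4.0                # 4-tap moving average = a 1-D "blur"
H = np.fft.rfft(kernel, n)               # its frequency response

# The response has exact nulls (bins 16 and 32 for these sizes):
# anything at those frequencies is multiplied by zero.
x = np.cos(2 * np.pi * 16 * np.arange(n) / n)  # a tone sitting on a null
blurred = np.fft.irfft(np.fft.rfft(x) * H, n)  # circular convolution
# blurred is essentially all zeros; the tone is gone for good.
```

A reverb IR is rarely this pathological, but wherever its spectrum dips near zero, the same thing happens in miniature: those frequencies come back as amplified noise, not signal.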
However, with a reverberant sound, if one has more information about the sound wavefront, perhaps it would be possible to retrieve much of the original wave. If one recorded with 3 mics instead of stereo, for both the impulse and the performance, then the incoming direction of the sound in 3D could be determined. One could attempt to cancel out sound from directions other than the performer's (I suppose this would only simulate a highly directional mic).
Or perhaps a ball of many outward pointing directional mics could more fully capture the wavefront, and thus provide enough info to accurately deconvolve the sound using an impulse.