I discovered Maarten, Thomas and Marc's Orchestral Sampling Manifesto last night by accident. What a discussion! Some very important topics, concepts, and questions raised.
As I don't do film scores or symphonic-style work, I will have little practical use for samples created in the style envisioned by the 'Gang of Three.' This is in no way meant to negate the work they are doing (and certainly not the quality of their music!)
But it seems to me, as someone who has done a moderate amount of basically simple sample editing (recording, looping and truncating for archaic samplers), that there should be some way in contemporary technology (Giga et al.) to begin to approximate the 'all-purpose instrument set'.
I'm working from a limited scientific knowledge base, so this may be totally outside the realm of reasonableness, but please consider the following:
Miles Davis is given a particular, specific, high-quality trumpet. He goes into a particular studio, stands in front of a specific mike at an exact distance, location and direction, and records a composition consisting of a series of chromatic scales played at varying dynamics and note lengths: in other words, all the notes needed to produce a full-on sample of this instrument, in his unique style. He may be asked to play this piece several times, thinking of it in varying emotional contexts. (Don't nitpick this part, please, this is just the setup. Tomorrow he will sound somewhat different, but it will still be Miles's timbre, attack, swells, etc., his 'signature.')
Arturo Sandoval steps up to the mike and, using the same trumpet at the same volumes, records the same 'piece of music', which captures his 'signature'.
Next, Herb Alpert(!) does the same thing, followed by five other professional trumpeters, all with the same horn, same mike placement, etc. In other words, the only change is in the musician, NOT the technical part of the recording process.
The horn is the same, the process is the same, the 'composition' is the same, but nobody in his right mind would say the recordings sound alike. And if I had said 'saxophone' instead of trumpet, there would be an even greater distinction between artists.
How to determine the differences? What (technically) makes the difference between the performers?
It is possible to take the samples and analyze them to extract the essence of the horn, and generate data that describes:
the attack, the sustaining body of the horn as an average, and the decay, each at a variety of dynamic levels.
Attacks are subjects for velocity-switching or key-switching, to determine mood.
Tails are useful to distinguish legato from detached note transitions, and to assist in articulation/phrasing. Frankly, I'm not sure yet how much of a role they play in the process I'm discussing, except insofar as we see Chowning-style FM harmonic 'unwinding' processes.
What I'd like to focus on is the change of timbre due to volume, which is most relevant (I think) in the 'steady-state' portion of the tone (which, of course, is anything BUT steady-state; that's what gives it the character!)
The plan is to extract descriptions of deviation from pitch in the form of data on the scoops (if any), frequency and short-term amplitude modulation (vibrato and tremolo), and long-term dynamic change (volume and timbre), and to store these as data sets.
We have now 'captured their soul in a little black (or beige) box' and can use this data to create articulation layers.
These can be applied as though we were using cross-faded sample sets, but by applying the results to create strictly a differential data set (only the added harmonics, pitch and volume changes, etc.), everything remains in phase and no apparent multiple-instrument 'ghosting' appears. It's like opening up the gain on a set of data that describes the additional spectra that appear as the result of the change of energy/groove/feel.
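A minimal sketch of that differential-layer arithmetic, using synthetic phase-aligned tones in place of real recordings (the harmonic recipes and the `blend` helper are invented purely for illustration, not taken from any actual library):

```python
# Sketch of the "differential layer" idea: subtract the soft sample's
# spectrum from the loud one, keep only the difference, and add a scaled
# copy of that difference back onto the soft sample. With phase-coherent
# recordings the same arithmetic would apply to real audio.
import numpy as np

sr = 44100
t = np.arange(sr) / sr
f0 = 220.0

# Hypothetical stand-ins: a soft note with 3 harmonics and a steep
# rolloff, a loud note with 6 harmonics and a brighter rolloff, both
# perfectly phase-aligned by construction.
soft = sum(np.sin(2 * np.pi * f0 * k * t) / k**2.0 for k in range(1, 4))
loud = sum(np.sin(2 * np.pi * f0 * k * t) / k**1.2 for k in range(1, 7))

# The differential set: everything the loud tone has that the soft lacks.
diff_spectrum = np.fft.rfft(loud) - np.fft.rfft(soft)

def blend(gain):
    """Soft sample plus a scaled differential layer (gain in 0..1)."""
    return soft + np.fft.irfft(gain * diff_spectrum, n=len(soft))

# At gain 1.0 we recover the loud sample (within float error); the
# in-between gains are the speculative "morph" region the post asks about.
err = np.max(np.abs(blend(1.0) - loud))
```

Because the subtraction happens on phase-aligned spectra, fading the differential in never introduces a second detuned copy of the note, which is exactly the 'no ghosting' property described above.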
The key (I feel) is to be able to match the phases of the multiple sets, to avoid unwanted phase cancellation.
I don't know if it's possible, but it seems to me that it should not be that difficult provided the base sample set is sufficiently consistent (which, of course, has ALWAYS been the most difficult part!)
From here, we add the digital room of choice (concert hall, night club, etc.) either as another layer, or (probably) after mixing the 'orchestra' into stereo/5.1/7.1, whatever.
NOT PART OF THE DISCUSSION, PLEASE:
1: Gratuitous flames: the point of this is to offer conceptual alternatives to the existing methods of sample creation. It is not to prove my mother was a poo-poo head. She was, and so am I. So there...
2: I don't believe we need the close-mike/far-mike discussion. My assumption for the purpose of this discussion is moderately close-miked, as we want to add the room of choice after the fact, not in the sample. If you don't believe this will work, that's OK, but let's not get into it here; that really is a different (and equally valid) discussion.
3: Even if this is the absolute best and ultimate method of sampling that ever was, is now, or ever shall be, I'm not trying to invalidate other sampling methods. Conversely, even if this is the stupidest, wrong-brained idea since bin Laden was a sad Saudi, what I'd like is to find what part of it you think might work, and what part can't, and, if possible, WHY ("I think so," "so's yer old man" and "because I SAID so" are probably not helpful.)
I don't expect to be able to actually do this in the immediate future; I'm trying to create music, not instruments. I've thought about this off and on over the years, and as I now program computers in the daytime and have created programs to do similar tasks, I'm hoping somebody who DOES create sample libraries can perhaps glean a kernel of 'aha' out of some part of the idea, or its discussion.
I think your approach is right, but sampling hardware doesn't seem to have the grunt to do this in the way you envisage it - yet.
Right now we are managing to get 160 voices of samples streamed off hard disk in realtime, with a little simple modulation of volume, filters, layering, crossfading and switching. There's not even much room left over for any other tasks - hence a number of people advocating 'standalone' Giga PCs.
What your system requires, at the very least, is realtime morphing of samples from one state to another. Not clever filtering of a bright sample to make a dull one, or cutting into the beginning of a slow attack in order to make a fast one.
As you said, even the sustain portion of a sample needs to have life, and this can't be a simple transposable crescendo/decrescendo. You need the sampler to be able to 'move' between certain limits at the player's command.
I know programmers regard sticking a sample through a synthesis chain as serious realtime processing, but I've always regarded that approach as pretty lightweight. What we're talking about here is far more intensive in terms of realtime computation.
I'm sure it'll happen, I just don't know how long it'll take. Life is full of surprises though.
hmmm... There's a better way of saying this. 'Morphing' as a term has lost some of its meaning over the last couple of years, to the point where it could sometimes be replaced with the word 'mixing'.
What current technology does is play back preset wave files, allowing the user to do subtractive synthesis with them, as well as mix them with other wav files.
What future technology needs to do is generate a completely new waveform in realtime, derived from several source waveforms which describe the extremes and central ranges of various parameters. In the case of trumpets, you'd need to create the new wav from samples which have fast, medium and slow attacks; samples which have bright, middle and mellow timbre; samples which have fast, slow and no vibrato; samples which have a fast fall-off, long fall-off, no fall-off; etc.
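As a toy illustration of that distinction, generating one new waveform from a blended *description* rather than mixing two finished files (the 'mellow' and 'bright' harmonic tables below are made up for the sketch; a real instrument description would have far more parameters):

```python
# Parameter morphing vs. wav mixing: interpolate the harmonic-amplitude
# description between a "mellow" and a "bright" preset, then synthesize
# ONE waveform from the blended description.
import numpy as np

sr = 44100
t = np.arange(sr // 10) / sr          # a 0.1 s snippet
f0 = 440.0

mellow = np.array([1.0, 0.3, 0.05, 0.0, 0.0])   # steep harmonic rolloff
bright = np.array([1.0, 0.8, 0.6, 0.5, 0.4])    # shallow rolloff

def generate(morph):
    """morph = 0.0 -> mellow description, 1.0 -> bright description."""
    amps = (1 - morph) * mellow + morph * bright
    return sum(a * np.sin(2 * np.pi * f0 * (k + 1) * t)
               for k, a in enumerate(amps))
```

Mixing the two finished waveforms would give the same spectrum here only because the partials are phase-locked; with independently recorded samples the parameter-space blend avoids the comb-filtering that plain crossfading produces.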
Lots of processing compared to simply mixing, switching and filtering ready-made wavs...
I think what you're describing is somewhat like convolution: applying certain features from one signal onto another. This is already being done in reverb, with such units as the Sony S777 and Altiverb.
Here, acoustic spaces are sampled and applied to a signal (a trumpet, for instance).
However, with reverb, it's not that difficult; emit a frequency sweep in a room, sample it, filter (extract) the frequency sweep from the sampled space and voila, a 'fingerprint' of the room.
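The sweep trick in miniature might look like this (everything is synthetic: the 'room' is a fake three-tap echo, and a random-phase broadband signal stands in for the test sweep):

```python
# Room fingerprinting by deconvolution: excite the "room" with a known
# broadband signal, record the result, then divide the excitation back
# out in the frequency domain to recover the room's impulse response.
import numpy as np

rng = np.random.default_rng(0)
n = 4096

# Broadband excitation: unit magnitude at every frequency, random phase.
# Its spectrum never hits zero, so the division below stays stable.
spec = np.exp(2j * np.pi * rng.random(n // 2 + 1))
spec[0] = 1.0          # DC and Nyquist bins must be real
spec[-1] = 1.0
sweep = np.fft.irfft(spec, n=n)

# A hypothetical room: direct sound plus two echoes.
true_ir = np.zeros(512)
true_ir[0], true_ir[200], true_ir[450] = 1.0, 0.5, 0.25

# "Record" the room's response (circular convolution via the FFT).
recording = np.fft.irfft(np.fft.rfft(sweep) * np.fft.rfft(true_ir, n=n), n=n)

# Deconvolve: FFT(recording) / FFT(sweep) = FFT(impulse response).
est_ir = np.fft.irfft(np.fft.rfft(recording) / np.fft.rfft(sweep), n=n)
```

Real units like the S777 use carefully designed sine sweeps rather than random-phase noise, but the extract-by-division step is the same idea.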
What you're suggesting is a bit more difficult: extracting the musical signature from an instrument and player. I think it is practically impossible to determine what the difference is between the performance and the sound of the instrument, since it would be very difficult to describe parametrically.
The closest thing I can think of is to have a perfect acoustically modelled trumpet that uses breath control, and 'sample', as it were, the playing signatures of the performers you noted, by having them blow into a breath controller and record the resulting data. You could then apply this data as 'excitation' data to the acoustic model.
Breath control being MIDI data, there's quite a bit of customising you could do to it too!
Chadwick and Joris, I realize my thinking on this topic is a little mushy, as I'm just beginning to analyze the problem from this angle. This may sound more like VL-1-style modeling processes or FM processes instead of sample layering, but what I believe I'm really talking about may be defined as 'additive spectral synthesis.' And it really only addresses the steady-state components, and does not begin to take into account the set of attack signatures. If this were found to be a realistic concept, the attacks would probably be key-switch/velocity-switch components.
I'm proposing a layering process, wherein ultimately one would add the sound 'differential' of a loud sample to the existing sound of an mp-level sample (create two samples with total phase coherence, in itself a daunting task, and then 'subtract' the Fourier components of the soft sample from the loud one). The resulting 'loud' sample would not sound at all like the instrument, but if added back into the soft instrument timbre at the original volume, it would re-introduce the spectral qualities necessary to recreate the louder sound. At the original dynamic level, one would exactly reproduce the loud sample. It is at the levels between the original signals that the question arises: would this be a realistic sound, or merely another strange sound that resolves only at or near the original dynamics?
It just seems to my 'intuitive' mindset that this would be, if not perfect, closer than the current velocity-switch boundary problems we now have. In fact, I would take it one step further and suggest three sets of layers. The middle set is an mp or mf instrument sample, 'plain vanilla.' The 'loud' set would be the differential set to be faded in, which would effectively add the 'missing' harmonics to the mid set. The third set would be a differential set created from the mid set and a pp set, and added out of phase, to CANCEL the harmonics in the mid set and add the extra breath, etc. characteristic of the softer performance.
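The three-layer arithmetic might be sketched like this, with synthetic stand-ins for the pp, mid and ff samples (the `tone` recipe and the linear fade curves are invented for the sketch; real layers would come from phase-coherent recordings):

```python
# Three-layer scheme: a plain "mid" sample, a "loud" differential faded
# in above the middle dynamic, and a "soft" differential subtracted
# (added with inverted polarity) below it to cancel harmonics down to
# the pp timbre.
import numpy as np

sr = 44100
t = np.arange(sr // 4) / sr
f0 = 261.6

def tone(n_harm, rolloff):
    """A phase-aligned synthetic note with n_harm harmonics."""
    return sum(np.sin(2 * np.pi * f0 * k * t) / k**rolloff
               for k in range(1, n_harm + 1))

pp, mid, ff = tone(2, 2.0), tone(4, 1.5), tone(7, 1.1)

loud_diff = ff - mid     # harmonics to fade IN above the middle dynamic
soft_diff = mid - pp     # harmonics to cancel OUT below it

def play(dyn):
    """dyn in [0, 1]: 0 = pp, 0.5 = mid, 1 = ff."""
    if dyn >= 0.5:
        return mid + (2 * dyn - 1) * loud_diff
    return mid - (1 - 2 * dyn) * soft_diff
```

At the three anchor dynamics this reproduces the original samples exactly; whether the in-between points sound like a real crescendo, rather than a linear spectral fade, is precisely the open question of the post.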
This sounds far-fetched, but I really feel that if the properly-aligned samples were available, this could be doable.
Beyond the difficulties of actually getting the original samples, the underlying question is: if it were possible to get these two or three audio files, and the differential set created (that part actually is not especially difficult), would a smooth transition between the sets generate an equivalently realistic 'morph' between timbres with nothing more demanded from the sampler than the ability to play two layers?
If I get the time, I may try this with a simple wave set, just to see what could result.
Anyway, thanks for your thoughts. You both have good points, but I'm not convinced that this is unworkable (although if it is, I'd rather find out BEFORE I take the time to write the code to test it out!)
The idea of fading/morphing between different playing styles of players is interesting, but I don't think it would sound all that good. Even if you could get them to morph, the sound of the instrument would change mid-"lick" and might sound pretty unnatural. Even with the same mic and placement and instrument, the player would make a BIG difference in tone, not to mention timing nuances that would be plain weird. Also, inherent pitch problems would need to be fixed, and that would negate the "character" of the sample. What you should look into is MIDI instruments like the wind controllers and MIDI sax and the like. This is the best way to approximate player nuances, then apply them to samples. What needs to be done is to create a sample library that would allow those nuances to be recreated. Also, you may want to look into an application called Digital Ear. It's pretty impressive for what it does, but don't expect any current sample libraries to sound "realistic" using the data it extracts. Simple low-frequency oscillation, even varied to mimic human-created pitch and volume, will not cut it with the current crop. There is too much going on in the real world for filters and pitch bend and envelope to recreate.
Now, the idea of a type of additive synthesis is nice, but there are a few things to consider. The difference between piano and mezzo-forte/forte is not just a simple difference. Some harmonics will have to be removed as well, not just added. Adding an "air" track is also a great idea, but again, airflow changes with pressure, so it's not just a sense of volume (but something is better than nothing, and this can add a sense of realism that is unattainable with available sample libraries). Still, each instrument will require its own type of programming and sound design. Consider the resonance of a horn: simply crossfading between dynamic levels won't recreate the "movement" of the lower mid frequencies, especially if the horn is recorded from a distance. This is the only instance I like close micing for. You don't have to deal with room resonance.
Simple waveforms are easier to do this with. A combination of physical modeling and samples as well as acoustic modeling is probably the way to go for acoustic instruments.
In the realm of sampling and crossfading it's not as simple as one would expect. Which is why not many people have released a library with much in the realm of "working" expressive features (save GOS and VotA... AO doesn't count because of the phasing), but it's especially lacking in the solo instrument market, due to the intimacy of solo instruments. Any developer will claim "ca-ca sound" (yes, as in "It sounds like ca-ca") due to the phasing induced by solo instrument crossfading. Though I can usually live with it if I'm layering into an ensemble. It does not sound good for slow expressive lines, though.
There are, however, methods to push the envelope in this area. I've even developed my own method that I'm planning on trying to tweak out a bit for a guitar library I'd like to release, and I know of one developer who is trying to do something beyond even what you've suggested here, with a combination of methods from at least one brilliant sound designer.
It's nice to see that at least some people are thinking "outside the box", and the more people working towards these ideas, the more advances we might make. However, if you create a morphing method while I'm in the hardcore development stage of my guitar library... I'll kill ya heheeee
Really...I am an Idiot
You make some good points, and sort of answer my question re whether the concept is useful. I know it would be really difficult (if not impossible) to get the initial "middle of the road, plain vanilla" samples, but "I have a drum - er, dream." If what you say is true and some harmonics actually decrease as an instrument gets louder, then your guitar library is safe. I won't steal it! (But if it's playable, I might buy it. I already have three Strat libraries, and they're all VERY different, and I'll use them for different types of projects.)
I used to be an avid Computer Music Journal reader, and actually built their digital oscillator circuit. It worked great, but I had to design my own controller, because the one they incorporated wasn't designed for live performance. Shortly after I had it working, Yamaha released the DX7, and I just put it in the attic. I'm a musician, not an instrument designer.
But technology is NOT for the timid, and I can't give up my dream of absolute, total control of ALL parameters at every step of the music production process, but I guess this one isn't going to fly. That's OK, because in the end, by the time I've written software to extract the differential data, and sampled one horn player until his lips fall off, Giga 4.0 will be here, with some alternate tone generation process that will make all my work obsolete, again!
And the other point: I was talking with my drummer last night about the 'additive spectrum' idea, and after about 30 seconds he just gave me a long look and said, "I have NO idea what you're talking about." (He claims to be technically oriented, but the only way he can get his live mixer to work is to solo ALL the active channels! I've tried several times to discuss it, but he says, "I know what I'm doing; this works for me...") Which is to say, he got it working this way once, and is not willing to learn the 'proper' way (he did the same thing with his previous console.)
BTW, you are 100% right about modeling, it is THE way to produce organic instruments. I have LOTS of fun with my VL70, even though I suck as a wind controller player (maybe that's the problem, I'm really supposed to blow...)
But building a home-made acoustic modeling system is not my idea of fun, and Yamaha always stops development when they realize the "average" musician doesn't want to have to learn anything! I'll bet given today's 2GHz processors and 233 MHz bus systems, Yamaha could produce a real 16-note polyphonic, multitimbral modeling engine and throw it into a variety of controller formats. They could even produce a wind controller with a built-in modeler, so you didn't need the separate box. But they probably wouldn't sell very many, because few instant-gratification-trained young musicians are willing to actually study the methods needed to program such a monster.
Don Buchla talked about building instruments good enough for a human being to dedicate his life to learning to play. And he has created some unique, powerful musical tools. But so few 'musicians' have any interest beyond being the next Lynyrd Skynyrd or OUT'A SYNC, there is little incentive to expand the control horizon.
I find it extremely gratifying that so many of the people on these various GS forums are young, eager musicians trying to write complex music. Symphonic film scores are not my field of expertise, but I can appreciate good work when I hear it. And the level of performance, craftsmanship, creativity and intellect needed to produce these works is awesome.
My hat's off to those GS users who work in this medium. Try not to let the glare from my bald head shine in your eyes.
And while we're off the topic, let's look at some definitions here. If an unmusical, raucous noise is cacophony, then isn't a poorly programmed synth a cacaphone?
As for your drummer friend and the mixer. That still has me cracking up.
There is one MORE thing to remember about your additive spectral synthesis idea. I never said it wouldn't sound good. For something like a drum it might work better than, say, for a sax or a stringed instrument. There is just more to making the sound in those instruments, which is why I believe some things disappear as they get louder, and it's not just perceived volume/sounds that get hidden behind louder sounds. There are better ways to go about what you're looking for.
Have you looked into Tassman? It's a PC-based physical modeler. I want to like it, it's just that they haven't developed any good-sounding acoustic instruments. I'm not that technically savvy when it comes to building mathematically correct patches; I'm more of a tweak-and-see kind of guy. With Tassman I'd get lost instantly. I need to study some more elaborate patches to get an idea of what goes on in it, but all the "big" patches I'd seen are just big synth patches. Which makes it look much more like Reaktor (a modular-based synth app) than the Yamaha physical modeler.
I think the BEST models would come from something that will map sound through/from a virtual modeling process of an instrument, with 3D-designed models (yes, visual), with interaction of metals and air pressure to recreate resonance. This would allow for the creation of "virtual" instruments that don't even exist but react like they would in the "real world". It seems more intuitive than a modular-based setup.
Yes, I AM nuts. Aren't we all? Grown men spending their lives huddled over little boxes of sand, farting into $1000 microphones???
I've tried Tassman (the demo) and it didn't seem to me to be representative of the modeling process; it was closer to FM in its control. (Don't take this to the bank, it's been a while, and I just went to look for it to verify what I said, and I've deleted it...)
But a proper physical modeler (PM) is definitely the middle piece of the equation.
Outside - controller(s): To reproduce what an acoustic instrument does in the hands of a reasonably talented player, we need some way to emulate the performance artifacts: increasing bow or blow pressure, plucking or picking harder on a string (and where along its length we strike), tonguing a mouthpiece, etc. No one keyboard-style controller is likely to be able to produce that many variations in a way that allows us to reproduce the physical-universe result, but we keep getting closer.
The middle - the instrument simulation: For me, this should not have to be defined as an existing instrument only, which is why I like PM. We can get good 'real' instruments, hybrids like a bowed trumpet or a sax with a brass mouthpiece, and totally unique vibrating devices (no off-color remarks, please!)
The environment: No instrument is played in a vacuum, and we don't have concerts in anechoic rooms, so we need to couple the instrument output to both its own internal resonance (which the PM may or may not do) and the outside world. Whatever excites the room brings itself to life in the room, and I believe we're really close to making that believable. I used to use a Casio CZ-101 in my studio; even after it was totally outdated as a melodic voice, it still made great cheesy percussion. But when I put the cheesy percussion through a simple first-reflections algorithm on a digital reverb, suddenly that sound came to life. I couldn't tell what was being struck, but there was a 'real' thing in a room, and it made music!
Agreed with you about controllers on the outside. Someone just needs to develop one.
There are people who are up to the challenge though. Consider www.starrlabs.com and the Ztars. He actually builds custom MIDI instruments. There is also a MIDI violin maker... but they are more like the Roland GR synths, which use pitch-to-MIDI tracking, which is not the best. For a violin, I'd like to see a pressure-sensitive bow, with two sets of infrared beams above the strings, for position of both the bow and finger. This would allow for better pitch detection (from the infrared) and the real feel from the strings.
For guitar it becomes more difficult, since the strings can be bent as part of the performance. I'd like to see a Ztar with "rolling" fret positions (side to side of the neck, not up and down the neck) that allow the player to mimic the movement of bending a string. This in turn could send pitch bend data.
These are just controller ideas; like you said, you need to send this to something more than a "sampler". However, I believe samples NEED to be the starting point, and then PM used in conjunction with samples is how it should be done. True PM will always be like 3D modeling IMO. It will never be "right".
Then of course, as we both stated, acoustic modeling.
I have a few ideas for NFX/DX/outboard-box plugins/effects that could greatly enhance MIDI music, but I'm not a coder. I've been toying with the idea of talking to a few friends who develop audio plugins to see if we can create a few unique "performance enhancer" plug-ins for MIDI music. There's definitely something that could be done even in the simplest of manners.