I got fed up with getting over 7 million page hits returned whilst doing Google searches for various musical resources, not to mention the numerous dead links and irrelevant sites.
So, I created a search engine targeted purely at musical resource sites and other sites that I think might be of interest. Sites are only just being added to the index, so it will grow considerably over the next few days. If you know of any sites that should be added (including your personal sites), please email me or click on the 'Add a site' link on the main page.
I SO agree with what you have done here. It has to be the most frustrating encounter every time I search on Google or other engines for that matter. My hat is off to you sir and yes I am positive this will be very useful.
Thanks chaps! It's early days and content is being added all the time. Just some points on how it works:
I regularly generate a report listing keyword searches which didn't return any hits, and this lets me add any missing content. In other words, if you don't find what you're looking for, try again later. From time to time irrelevant sites will be added as part of the automatic indexing process, but I'll delete these as soon as I spot them.
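For anyone curious what a zero-hit report like that might look like, here is a minimal sketch in Python. It is only an illustration, not the code the site actually runs: the log format (a list of query and hit-count pairs) and the function name `zero_hit_report` are assumptions made up for the example.

```python
from collections import Counter


def zero_hit_report(search_log):
    """Count searches that returned no hits, most frequent first.

    search_log is assumed to be an iterable of (query, hit_count) pairs;
    the real log format is not described in this thread.
    """
    misses = Counter()
    for query, hit_count in search_log:
        if hit_count == 0:
            # Normalise case and whitespace so near-identical queries group together.
            misses[query.strip().lower()] += 1
    return misses.most_common()


# Hypothetical sample data, purely for illustration.
sample_log = [
    ("lute tablature", 0),
    ("guitar chords", 42),
    ("Lute Tablature ", 0),
    ("theremin schematics", 0),
]

report = zero_hit_report(sample_log)
for keyword, count in report:
    print(f"{count:3d}  {keyword}")
```

Sorting by frequency puts the most-requested missing topics at the top, which would be the natural place to start when deciding what content to add next.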
Great move, Tony! I am so weary of search engines that allow qualifiers but then ignore them. Get millions of hits, but not what you want, so you are advised to remove the qualifiers, thus getting a few million more hits that you don't need! Add to that all the gratuitous hits that are not in any way relevant to your search, and you wonder what the search engines are useful for!
A few hits here and there is no big thing, but when you start getting Google-type volume you'll obviously have to spread it out across multiple servers.
Given that you do web stuff for a living, I'd be interested in knowing how you architected it for scalability. Although I do the web work for my company's web sites, my primary expertise is in C++. I've always wondered how you guys set things up to start small but cope with growth to massive traffic. Especially in the wonderful world of stateless programming (don't get me started!).
Christopher, I hadn't really planned to compete with Google so scalability would be limited initially to upgrading to faster hardware, more bandwidth, better use of caching etc.
Theoretical capacity as it stands enables the app to index up to 2 terabytes of data, but in real use I've tested it out to 2 million pages without much loss of speed, at about 10 GB of data per million pages.
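To put those figures side by side, here is a rough back-of-envelope calculation in Python. It assumes that index size grows linearly with page count, that the figure quoted is gigabytes rather than gigabits, and that 1 TB = 1000 GB; none of that is confirmed above, and the function names are made up for the example.

```python
GB_PER_MILLION_PAGES = 10   # figure quoted above (assumed to mean gigabytes)
GB_PER_TB = 1000            # decimal units, an assumption


def index_size_gb(pages, gb_per_million=GB_PER_MILLION_PAGES):
    """Estimated data size in GB for a given number of pages (linear scaling assumed)."""
    return pages / 1_000_000 * gb_per_million


def pages_for_capacity(capacity_tb, gb_per_million=GB_PER_MILLION_PAGES):
    """Estimated number of pages that fit in capacity_tb terabytes."""
    return capacity_tb * GB_PER_TB / gb_per_million * 1_000_000


tested_size = index_size_gb(2_000_000)   # size of the 2-million-page test
ceiling = pages_for_capacity(2)          # pages at the 2 TB theoretical limit

print(f"Tested index: about {tested_size:.0f} GB")
print(f"Theoretical ceiling: about {ceiling / 1_000_000:.0f} million pages")
```

Under those assumptions the 2-million-page test covers roughly 20 GB, about 1% of the theoretical limit, and the 2 TB ceiling would correspond to something in the region of 200 million pages.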
I'll be bringing a new, faster server online soon, so I'll get a chance to do some comparison testing.