Grub: A Distributed Web Crawler à la SETI@home



On Friday at the Open Source conference, Jimmy Wales, founder of Wikipedia and of Wikia, which runs an open source search engine project, announced the release of an open source web crawling tool called Grub. Grub crawls the web, indexing pages for the Wikia search engine.

It's a clever idea that builds on other distributed computing projects such as SETI@home. Crawling the web is costly, so having thousands of volunteer clients do it for you saves money and could make crawling cost-effective. However, I have to wonder what percentage of an actual crawl will be performed by Grub's distributed clients. Also, if my computer is contributing to this project, which is open source but still a for-profit venture, shouldn't I profit from it as well?
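To make the idea concrete, here is a minimal sketch of how a central server might hand out crawl work to volunteer machines. It is not Grub's actual protocol or code: the `CrawlCoordinator` class, the `volunteer_fetch` function, and the in-memory `FAKE_WEB` are all hypothetical names I made up for illustration, and a real client would fetch pages over HTTP instead of reading from a dictionary.

```python
from collections import deque
from urllib.parse import urlsplit


class CrawlCoordinator:
    """Hypothetical central server that hands URL batches to volunteer clients."""

    def __init__(self, seed_urls, batch_size=2):
        self.pending = deque(seed_urls)  # URLs nobody has fetched yet
        self.seen = set(seed_urls)       # every URL ever queued, to avoid repeats
        self.batch_size = batch_size
        self.results = {}                # url -> summary reported by a client

    def next_batch(self):
        """Give a client up to batch_size URLs to crawl."""
        batch = []
        while self.pending and len(batch) < self.batch_size:
            batch.append(self.pending.popleft())
        return batch

    def report(self, url, summary, outlinks):
        """Record a client's result and queue any newly discovered web links."""
        self.results[url] = summary
        for link in outlinks:
            is_web_link = urlsplit(link).scheme in ("http", "https")
            if is_web_link and link not in self.seen:
                self.seen.add(link)
                self.pending.append(link)


# A tiny pretend web (assumption: stands in for real pages and their links).
FAKE_WEB = {
    "http://a.example/": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://a.example/", "mailto:someone@b.example"],
    "http://c.example/": [],
}


def volunteer_fetch(url):
    """Stand-in for the HTTP fetch a volunteer's machine would perform."""
    outlinks = FAKE_WEB.get(url, [])
    return f"{len(outlinks)} links", outlinks


def run_crawl(coordinator):
    """Simulate volunteers asking for batches until no work is left."""
    while True:
        batch = coordinator.next_batch()
        if not batch:
            break
        for url in batch:
            summary, outlinks = volunteer_fetch(url)
            coordinator.report(url, summary, outlinks)
    return coordinator.results
```

Calling `run_crawl(CrawlCoordinator(["http://a.example/"]))` walks all three pretend pages. The point of the design is that the expensive part, fetching and parsing pages, happens on the volunteers' machines, while the operator only pays for the bookkeeping in the coordinator.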

Grub aims to compete with Google. With enough computing power behind it, the project might be able to build an index as large as Google's, maybe even larger, but the key to gaining and keeping market share will be the quality of the results returned. And that comes down to the search engine's algorithms.

A distributed web crawling client for Project Phoenix is something to consider, and giving people a portion of the revenue stream could make it attractive to users.
