Several times every week we get asked the same question. It comes in several variations; here are some quotes from our support e-mails:
Can Zoom scale to be the size of Google?
Can I index the all the internet using Zoom?
I have a list of 30,000,000 web sites, can Zoom index them?
I want to build a search engine like Yahoo, can you give me step by step instructions?
Can I index an infinite number of pages?
Can your search script index the entire web?
Maybe these people have seen the current share-market valuation of Google and want some of the action, or maybe they think anyone with an old 386 PC and a dial-up modem can make a new Google, or maybe these people are doing more dreaming than thinking.
So let me state this up front: the entire internet will not fit on your hard disk. No, really, I mean it, not even on that new 400GB hard drive that you have.
So let me do some quick calculations for you:
Google indexes about 3,000,000,000 web sites.
Let's say there is an average of 100 pages per site and an average of 40KB per page (PDF & HTML files).
This equals storage requirements of 12,000,000,000,000,000 bytes, which equals 12,000 terabytes of data (or 12 petabytes of data).
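For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It simply multiplies the three estimates above together. It assumes decimal units (1KB = 1,000 bytes, 1 terabyte = 1,000,000,000,000 bytes), and the variable names are just for illustration.

```python
# Back-of-envelope storage estimate from the three figures quoted above.
# Assumption: decimal units (1KB = 1,000 bytes, 1TB = 1000**4 bytes).
SITES = 3_000_000_000          # web sites indexed
PAGES_PER_SITE = 100           # assumed average pages per site
BYTES_PER_PAGE = 40 * 1000     # assumed average of 40KB per page

total_bytes = SITES * PAGES_PER_SITE * BYTES_PER_PAGE
terabytes = total_bytes / 1000**4
petabytes = total_bytes / 1000**5

print(f"{total_bytes:,} bytes")   # 12,000,000,000,000,000 bytes
print(f"{terabytes:,.0f} TB")     # 12,000 TB
print(f"{petabytes:,.0f} PB")     # 12 PB
```

Change any of the three inputs to match your own guess; the result stays in the petabyte range unless you shrink them by several orders of magnitude.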
Another way to look at it is that you are going to need about 20,000 - 50,000 PC-style computers linked together with smart software, which, unsurprisingly, is about what Google is using.
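As a rough sanity check on that machine count, the sketch below divides the raw data size by the capacity of one 400GB drive. It assumes one such drive per PC and decimal units, and it ignores index overhead, redundant copies and the CPU needed to answer queries, so treat it as a lower bound on storage alone and not as a sizing method. The function name is hypothetical.

```python
# Rough lower bound on machine count from raw storage alone.
# Assumptions: one 400GB drive per PC, decimal units, no replication or index overhead.
TOTAL_BYTES = 3_000_000_000 * 100 * 40_000   # sites * pages per site * bytes per page
DISK_BYTES = 400 * 1000**3                   # one 400GB hard drive


def machines_needed(total_bytes: int, disk_bytes: int) -> int:
    """Number of disks (one per PC) needed to hold total_bytes, rounded up."""
    return (total_bytes + disk_bytes - 1) // disk_bytes


print(machines_needed(TOTAL_BYTES, DISK_BYTES))  # 30000
```

That lands inside the 20,000 - 50,000 range quoted above, before you have stored a single backup copy.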
Now this is a lot of data! We can reduce it by being smart with compression etc., but whichever way you look at it, it is still a lot of data.
Storage requirements are also only the tip of the iceberg: you need a warehouse to install your 50,000 PCs and some very serious power and cooling infrastructure, not to mention a massive connection to the internet backbone to index the entire internet.
Now we would LOVE to build a solution like Google for someone, and for this reason we have spent a fair amount of time over the last few years talking to people about what would be involved.
But unfortunately most people are living in a total fantasy land, which means we waste a lot of time trying to talk some sense into these people.
So we are happy to talk to people about indexing the entire internet, but it would be a very, very serious undertaking.
So if you have a correspondingly serious budget and at least a tentative grasp of reality, please get in contact. Otherwise keep on dreaming.
-------
David
Wrensoft Web Development