top | item 24841448

sarah180 | 5 years ago

Spidering and building an index is relatively easy. It's not the barrier to creating a search engine, and I don't think you'd find that Microsoft's index is materially smaller than Google's. The hard part is figuring out how to turn that content into relevant results.

Grimm1|5 years ago

Only 2 companies in the US have an independent English-based index with the contents of the entire web. Granted, the sheer volume of data is a barrier to making the index, but even setting that aside, only 4 US companies have crawled the entire internet. I'm going to have to disagree with you on that one. Writing a crawler with the scale and timeliness to crawl the entire web in a week or two requires some pretty solid engineering. I don't, however, disagree that building a good search engine is also difficult.

sarah180|5 years ago

Google and Microsoft were not the first web spiders.

A famous example, Yahoo!, didn't walk away from search and partner with Microsoft because of the difficulty of building an index. They did it because it was going to cost billions per year to try to keep up with Google in producing results.

I'm not arguing there's no work to do in building an index, but the problems of crawling and indexing can be solved by cash. They're a moat against small challengers, but not against well-capitalized ones. Ranking and filtering require lots of research and tuning. This is the moat against even the well-capitalized.

Put another way: do you really disagree that Google would still easily dominate search based on result quality even if small startups got access to their index data?