Having been through similar decision-making processes myself (with the Google Custom Search API, the Google Translate API, etc.), I'd say this is just as likely an abuse mitigation technique as it is a revenue generation opportunity.
Requiring even a modest at-cost fee for a web service does wonders to discourage all sorts of misuse, from wanton large-scale data mining, to blatant repackaging and resale, to worse. (Heck, simply requiring a valid credit card alone helps.)
And sadly no, simply having low quotas for free access doesn't entirely suffice. If there's material value to be extracted from a free service, you'd be amazed at the lengths people will go to in order to create large numbers of low-volume scrapers. Most of these are obvious and easy to detect and defeat, but continually doing so adds up in cost, and it takes engineers away from providing better services to legitimate customers.
In short, most people on the outside don't appreciate just how difficult the handful of bad guys make it for companies to do something good for the other 99%. So I'm sympathetic to Microsoft here, I really am.
This. The best way to fight these types of spammers and scrapers is through economics: provide the content at cost and spray-and-pray models are no longer cost-effective.
Imagine what the world would be like if there was a cost-per-unit to sending email - our inboxes would be a much saner, friendlier place. Fewer marketing emails, virtually no spam (no longer economical).
This got me thinking...is it possible for Microsoft to secretly buy out DDG and leave them the fuck alone except where they need help (i.e., access to the Bing index)?
For all the money the Bing unit keeps pouring in, I feel buying DDG and leaving them the heck alone could be a reasonable long-term bet with relatively little risk.
The challenge would be to keep the founders/team motivated. They could spin it off as a completely different company on an IPO track and give significant equity. But then what if Google wants to buy them out? And of course, the Bing team may have a problem with MSFT creating internal competition, though that may just push 'em to do better.
More likely, if DuckDuckGo gets into an acquisition bidding war, I'd put my money on Gabe passing on an acquisition in favor of raising a huge funding round that lets him take some money off the table.
With all its flaws (enough that I don't use it), it remains a rare search engine start-up that has its heart in the right place: to actually serve consumers rather than build some technology or team and get acquired (looking at you, Powerset).
I think you're getting ahead of yourself here. DDG has about 0.1% of all search traffic (30m out of 23b queries per month). They get all of their 'relevancy' from Bing and blekko. The reason you like them is that they clean up these results and provide peripheral add-ons like user privacy, no ads, easier syntax for power users, and one-boxes which function more as a knowledge base than a search engine. They don't crawl/index the web, or if they do, we haven't heard anything about it or seen any relevancy rankings that differ from the BOSS API. So they are really a new face on already extant search engines.
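For anyone who wants to check that share, here is a quick sketch using only the two figures quoted above (30m and 23b queries per month). The variable and function names are made up for illustration.

```python
# Figures quoted in the comment above; everything else is illustrative.
DDG_QUERIES_PER_MONTH = 30_000_000
ALL_QUERIES_PER_MONTH = 23_000_000_000


def traffic_share(part: int, total: int) -> float:
    """Return part/total as a fraction between 0 and 1."""
    return part / total


share = traffic_share(DDG_QUERIES_PER_MONTH, ALL_QUERIES_PER_MONTH)
print(f"{share:.2%}")  # prints 0.13%
```

So the "0.1%" figure is a rounded-down version of roughly 0.13%.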
You mention a 'bidding war'. Who would buy them? Any 'features' they provide can be copied by MSFT/Google if they feel threatened. They don't have their own backend search technology, so partners who might want to do search with them (ask.com, search.com, etc.) have no reason to work with them as opposed to using the BOSS API themselves or even Bing. I'm just as excited as you are about innovation in search and competitors to Google, but DDG needs to be re-architected on the back-end before these sorts of pronouncements make sense.
Here is something ironic. I bet you many of DDG's users are hackers who use it for privacy reasons. The same hackers who rely on Google Analytics in their own websites. If DuckDuckGo grows to become more than a niche search engine, the same hackers who use it will have to reinvent Analytics somehow. This is reason #1 why DDG will stay small. Reason #2 is that if necessary, a DuckDuckGoogle can be created in an afternoon's worth of effort in Mountain View.
blekko has been running a crawl+index of several billion pages for 2 years now, so perhaps I can talk about this a little.
If you want access to a big crawl to grep through it for interesting data, then Common Crawl is awesome and inexpensive and I don't think you can get anything like it for the price, unless your query is simple enough to run as a blekko webgrep (https://blekko.com/webgrep).
If you want to build a search engine, Common Crawl isn't so useful. Search engines want _directed_ crawling of the pages that they think are good. Crawling is only a small fraction of the total work done in a search engine. Search engines generally aren't on AWS, because the right configuration of machine isn't rented by Amazon -- serving queries needs SSDs or more ram and less cpu than what Amazon offers. So, what Common Crawl offers a search engine is higher costs and mostly bad data.
I'm sure they still make heavy use of the Bing API, but have since expanded their range of sources to soften the blow somewhat. They're now getting results from Blekko (who run their own index), and I'm sure they've been building out their own index. There's a full list of sources here - http://help.duckduckgo.com/customer/portal/articles/216399-s....
We also don't have any pricing details for higher volume usage of Bing, and DDG are in a much better position to negotiate a better deal these days.
This seems like an odd move. It's not like Bing has any traction with developers at all. Wouldn't charging them make it even harder to gain traction? I am not familiar with their API; what does it have over Google that would make me pay for it?
Google doesn't have a search API (it was long ago deprecated https://developers.google.com/web-search/). So the API itself is the advantage over Google in this case.
In my experience with the Bing API in the last several months, I've found that you get what you pay for. Its performance has been inconsistent at best, to the point that I created the site http://isthebingapiworking.heroku.com/. The web search API frequently returns orders of magnitude fewer results than the actual website, a problem that is a frequent theme in their developer forums http://www.bing.com/community/developer/f/12254.aspx.
By charging for their search API, I would just say that Microsoft is beginning to take their API seriously. It seems pretty clear that minimal resources, if any, were dedicated to the free version.
I'm bummed, because I've found relative success using their news search api (particularly for the article aggregation component of http://www.congressionalprimaries.org/), and now we'll have to look into alternatives, but if this means actually providing a decent product, I think this is a good move for Microsoft.
No surprise there. I wonder if all the faux search engines will have to either start crawling/indexing or transition to Knowledge Engines (I'm looking at you, DDG). Curious to see if this sparks people to start more search companies.
Actually the search API would be interesting for domain-specific search. You can use the API to create a site that presents results specific to MP3s, for example, formatting each result with its MP3 attributes.
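A rough sketch of what that could look like, assuming the search API hands back a list of results that each have a title and a URL. The `title`/`url` dict shape, the `mp3_results` helper, and the sample data are all hypothetical, not the actual Bing response format.

```python
import posixpath
from urllib.parse import urlparse


def mp3_results(results):
    """Keep only results whose URL path ends in .mp3 and attach the file name.

    `results` is assumed to be a list of dicts with "title" and "url" keys
    (a hypothetical shape, not a real API response).
    """
    formatted = []
    for item in results:
        path = urlparse(item["url"]).path
        if path.lower().endswith(".mp3"):
            formatted.append(
                {
                    "title": item["title"],
                    "file": posixpath.basename(path),
                    "url": item["url"],
                }
            )
    return formatted


# Made-up sample standing in for whatever the search API returns.
sample = [
    {"title": "Live set", "url": "http://example.com/audio/live-set.mp3"},
    {"title": "Band homepage", "url": "http://example.com/"},
]

for hit in mp3_results(sample):
    print(f'{hit["title"]} [{hit["file"]}] {hit["url"]}')
```

A real site would probably go further and read tags like artist or bitrate from the files themselves, but the filtering and per-domain formatting step is the part the search API enables.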
[+] [-] dewitt|14 years ago|reply
[+] [-] Aaronontheweb|14 years ago|reply
[+] [-] gl0wa|14 years ago|reply
[+] [-] zaidf|14 years ago|reply
[+] [-] guimarin|14 years ago|reply
[+] [-] motti_s|14 years ago|reply
[+] [-] Aloisius|14 years ago|reply
[+] [-] greglindahl|14 years ago|reply
[+] [-] CurtHagenlocher|14 years ago|reply
[+] [-] fungi|14 years ago|reply
love the idea but search results are balls (ATM)
[+] [-] rlpb|14 years ago|reply
[+] [-] notatoad|14 years ago|reply
[+] [-] tcwc|14 years ago|reply
[+] [-] unknown|14 years ago|reply
[deleted]
[+] [-] mda|14 years ago|reply
[+] [-] unknown|14 years ago|reply
[deleted]
[+] [-] cannuk|14 years ago|reply
[+] [-] sycr|14 years ago|reply
[+] [-] unknown|14 years ago|reply
[deleted]
[+] [-] caryme|14 years ago|reply
[+] [-] guimarin|14 years ago|reply
[+] [-] guard-of-terra|14 years ago|reply
[+] [-] ww520|14 years ago|reply
[+] [-] RyanMcGreal|14 years ago|reply
[+] [-] harryf|14 years ago|reply
[+] [-] zmonkeyz|14 years ago|reply
[+] [-] denzil_correa|14 years ago|reply
[+] [-] loverobots|14 years ago|reply