
Challenging the Bing It On Challenge

108 points | msrpotus | 12 years ago | freakonomics.com

62 comments

[+] RyanZAG|12 years ago|reply
[+] tzs|12 years ago|reply
Link 1: complains of a misleading screen comparison. Microsoft compared diagonal size, but due to aspect-ratio differences the screen with the larger diagonal actually has less area. This could plausibly be called a lie.

Link 2: complains that some of the things shown in the "To the Cloud" ads do not involve storage on a third party server, and says this means they have nothing to do with the cloud. Also complains that the things in the ad that do involve storage on a third party server have been done before, and so somehow it is wrong for Microsoft to say that they do them.

Link 3: Makes no specific claim of Microsoft lying. Just gives a transcript of The Guardian talking to a Microsoft person at CES, and asserts that it must be full of lies because it is from Microsoft. (Which is amusing, because Techrights is just rebranded Boycott Novell, probably the most lie-filled tech-related site on the web.)

Link 4: same claim as link #1.

Link 5: video does not play.

Link 6: Does not claim that Microsoft is lying. It is just reporting that they did an ad making fun of the competition.

Link 7: Getting an error now, so I can't verify, but I got to it earlier and on a first quick read it looked like it was just complaining about bashing a competitor.

Your links do not really do a good job of supporting your argument.

[+] tomp|12 years ago|reply
The article is good, but its math criticism is off. Statistically, 1,000 respondents is a large enough sample for a binary-outcome study, and is in line with the number of people typically asked about their voting preferences in most polls. Also, the required sample size does not depend on the size of the population you want to sample, provided that the population is large enough.

Now, whether the sample was actually representative of the population in question is another matter.
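tomp's point can be checked with a quick back-of-envelope calculation: for a binary preference question, the worst-case 95% margin of error at n = 1,000 is about ±3 percentage points, with no population-size term anywhere in the formula. A minimal sketch (the standard normal-approximation formula, not anything from the study itself):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Worst-case 95% margin of error for a binomial proportion.

    p=0.5 maximizes p*(1-p), so this is the most conservative case.
    Note: population size never appears, only the sample size n
    (assuming the population is large relative to n).
    """
    return z * math.sqrt(p * (1 - p) / n)

print(round(margin_of_error(1000) * 100, 1))  # prints 3.1 (percentage points)
```

So a 2:1 split (roughly 67% vs. 33%) measured on 1,000 people is far outside the ±3-point noise band; the sample-size complaint only bites if the sample is unrepresentative, which is the separate question below.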

[+] justinsb|12 years ago|reply
I tried the BingItOn.com challenge: 5/5 for Google for me. But I saw that Microsoft was trying to push me towards "popular" queries (what Miley Cyrus said today about the government shutdown). So I think this is an issue of demographics and sampling: for mass-market 'leisure' queries I could easily see Bing beating Google if that's what Bing is focusing on.

Equally, using Mechanical Turk probably yields a biased sample (although to their credit, the paper does examine the demographics in detail); Mechanical Turk workers are paid more the faster they can find results, i.e. they are top-result driven, whereas I would guess leisure queries are more focused on entertainment and finding something interesting in the first few results.

And (despite the fact that I went 100% Google), I figured I should probably be using something more tech-focused, based on my queries, like DuckDuckGo. From that point of view, I think this BingItOn challenge delivers on the goal of getting people to re-examine their preconceived notions of which search engine they should be using.

[+] mattwallaert|12 years ago|reply
That's a remarkably clear-headed analysis; thanks for that. We at Bing don't always get the courtesy of people actually thinking deeply about what we do. =]

(Note: I work for Bing.)

[+] pyrocat|12 years ago|reply
It's great that they did this study. I wish more people would publicly challenge many of the outlandish claims in advertising. But it seems rather hypocritical to knock Microsoft for a sample size of "nearly 1,000" when their own study "obtained 1,008 Bing It On challenge responses from the MTurk platform and narrowed our analysis to 985 respondents who submitted screen shots for 4925 searches."
[+] yid|12 years ago|reply
Not in this case, no. They were trying to replicate Microsoft's results, which entails using a similar sample size.
[+] neutralobserver|12 years ago|reply
Hmm, they used Mechanical Turk "workers" for this experiment, paying each worker 40 cents per answer, which netted them 400 responses. They upped the payout to $1 and soon hit 1,000. They mixed the two response sets in the final analysis.

I'm SURE that didn't skew the results at all.

They respond to this in the study, but it's going to take a lot to convince me that Mechanical Turk is a respectable source of subjects for consumer behavior research.

The real news here is that Bing and Google are virtually tied with respect to search quality. But that's been my experience based on using DuckDuckGo (which is basically a rebranded Bing afaik).

[+] ck2|12 years ago|reply
Website operators need to revolt against Microsoft and their bots.

We see nearly a thousand parallel connections from their bots, nearly continuously.

For only 18% of search use, they seem to consume several times more resources than Google.

And for some stupid reason their Bing bot and MSN bot do not share data, but crawl the same pages independently.

(and yes, we use the crawl-delay directive; they ignore it)
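For context, the crawl-delay directive ck2 mentions is a non-standard but widely recognized robots.txt extension (the value is the minimum number of seconds a crawler should wait between requests); Bing has historically documented support for it, though as the comment above reports, honoring it in practice is another matter. A minimal example, with hypothetical delay values:

```
# robots.txt — Crawl-delay is advisory; compliant crawlers wait
# this many seconds between successive requests.
User-agent: bingbot
Crawl-delay: 10

User-agent: msnbot
Crawl-delay: 10
```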

[+] dangero|12 years ago|reply
I don't think even the Freakonomics followup is valid because there's a huge difference between telling someone to run some searches and actual search engine use.

Here's what I might type if someone asked me to try their search engine: "Miley Cyrus". The funny thing is that I'm not sure if I would like Google's "Miley Cyrus" results better than Bing's. I'm not sure I could tell the difference.

Here's the kind of stuff I actually need to work well during my normal day: "xcode 5 warning signature expiration does not match provisioning profile"

The second one isn't a real XCode warning, and that's kind of my point. I can't generate compiler warnings from my head to test Bing. I have to see them on my screen first spontaneously. This means if I'm running a Bing challenge I can't possibly simulate the real world unless I'm given my own machine and a few hours at my leisure to run into these types of things.

[+] VLM|12 years ago|reply
The most surprising thing is that there is any correlation between the performance of top-end search engines in such widely divergent areas of human experience: low-volume, obscure software errors and high-volume, pointless Hollywood blather. Certainly a junky demo search engine could be beaten all around by any top-end search engine. But I'm surprised that at the top end they are competitive in both areas.

It would be like finding out olympic class weight lifters tend to also be olympic class marathon runners. Or the best commuter vehicle and best 100 ton mining truck happen to be the same vehicle.

[+] GFischer|12 years ago|reply
I'm at work, and I wanted to try the challenge, but BingItOn redirects me to Bing.

Maybe it's not available outside the U.S.?

Bing's results are way less relevant to me than Google's, but Google does have the advantage of a lot more personal data, and a specific subdomain for my country (.uy) which often returns more relevant results (OTOH for programming I often have to force the U.S. website).

That's the reason I don't use DuckDuckGo either, the results are vastly inferior for my use case (even though I want to like it).

[+] DanBC|12 years ago|reply
> When I looked into the claim a bit more, I was slightly annoyed to learn that the “nearly 2:1” claim is based on a study of just 1,000 participants. To be sure, I’ve often published studies with similarly small datasets, but it’s a little cheeky for Microsoft to base what might be a multi-million dollar advertising campaign on what I’m guessing is a low six-figure study.

He's right, but a sample of 1,000 is rigorous compared to most studies used in adverts. Give a product to 50 people and ask if they like it or not. That turns into 97% of people recommend our product!!

Since I live in a country with heavily regulated advertising I'm surprised that there isn't something to cover "sounding sciencey" or "sounding mathy". Certainly if the math is wrong you can get it changed.

[+] coldcode|12 years ago|reply
Sample size is immaterial unless you can clearly see and evaluate the criteria under which the participants were chosen. "8,000 of 9,000 dentists choose Brand X toothpaste" means nothing if they asked 100,000 people and simply found 8,000 willing to say they liked Brand X. By itself, the stated sample size may have little to do with the actual test.
[+] rlu|12 years ago|reply
For what it's worth, I'm pretty sure that most of us/the HN crowd can't fairly do the bingiton challenge.

At least for me, it's really quite easy to tell them apart just based on subtleties. I.e., not the quality of the content itself but just how it's presented. And if there are ever images involved, it gets even easier, because Google and Bing display them completely differently (and honestly I think Bing does a more elegant/"nicer" job of it).

[+] rlu|12 years ago|reply
Here's me "blindly" picking Bing 5 times: http://i.imgur.com/NAuhhhV.png

This is why I'm skeptical of anyone here saying "I got Bing 100%" or "I got Google 100%". It's extremely easy for us to game.

[+] sailfast|12 years ago|reply
While the results returned are of use, Bing It On removes any of the right-hand side Google content pulled from Wikipedia and other databases which differentiate Google results quite a bit more than the Bing results (especially for quick returns on top hits). Selecting only a portion of a page is not really a fair comparison. That said, the effort did get me to take a closer look at the two and evaluate which one I preferred. I just didn't pick their horse.
[+] mattwallaert|12 years ago|reply
It is getting increasingly hard to do an apples-to-apples comparison (social sidebar, snapshot, etc.). The best we can do is try to narrow down to just the web results, for the moment.

(Note: I work for Bing.)

[+] qwerta|12 years ago|reply
Marketing aside, I find Bing's results good. Google is kind of turning into a singularity and tends to return only Google- (or US-)centered results (for example, YouTube-only videos). Bing seems to have better diversity.
[+] eitland|12 years ago|reply
The opposite is just as annoying if not more:

I search in English and Google insists on translating my search terms to the local language and displaying local search results.

Again, an idea for a startup: like Google, but from 5 years ago.

[+] Pxtl|12 years ago|reply
Bing's abysmal performance on WP7 has seriously hurt the brand for me. If you're going to bundle the search engine into a dedicated button on your phone, you should probably avoid making it awful.
[+] ableal|12 years ago|reply
I was curious about the test, but, for me, http://bingiton.com/ goes straight to plain http://www.bing.com . Geolocation, I suppose. I'm in Portugal.
[+] galapago|12 years ago|reply
Bing is working worse in terms of results outside the U.S. They don't care about it.
[+] phamilton|12 years ago|reply
Also important is the fact that BingItOn reports Google results different from what I get when I search the same term on Google.com. My guess is that my results get tailored to my profile on Google.com and BingItOn.com is untailored. The thing is, tailored results are very useful. When I search Ruby, I get much better results in my own Google.com than either set on BingItOn.com.
[+] BIair|12 years ago|reply
tl;dr if you take the Bing challenge and make Bing your default search, try other variations on Bing as you would have on Google.

I tried the Bing challenge, and changed my default search to Bing almost a year ago. According to Google I was performing 10k - 20k searches per month, so Bing Rewards was an appealing incentive. Over a year's time, those are some decent rewards.

Bing results are very good. The notable exception I've noticed is for technical, geeky searches that are probably most likely to occur with this crowd.

The biggest problem is one of "branding" and confidence. At first I found myself searching Bing, and if I didn't find the result I wanted, I'd switch to Google. The results were often very similar. But instead of returning to Bing to perform another search, I'd do it on Google. As a Google user, when I didn't find the results I wanted, I'd try a new search... I didn't try Bing. When I noticed this and changed my behavior, I became more satisfied with Bing's results.

[+] mattwallaert|12 years ago|reply
Well noted! I actually wrote about this very psych effect on the Bing blog: if Google doesn't give you a good result, it's your fault. If Bing doesn't, it's Bing's fault. This is common in situations where we go in with a bias.

(Note: I work for Bing.)

[+] mVChr|12 years ago|reply
Also, how well can you tell which search engine you prefer based solely on the results page itself, without investigating, one by one, how well the content of the individual URLs it suggests actually satisfies your query?
[+] nostrademons|12 years ago|reply
When I've had my friends take the Bing It On challenge and polled them on the results, they usually prefer Google by about a 2:1 margin, pretty close to what Freakonomics found.

It's great advertising for Google.

[+] mattwallaert|12 years ago|reply
Are your friends people like you? Which engine do you like? Is it possible that maybe your friends aren't a representative sample of the average searcher?

(Note: I work for Bing.)

[+] 51Cards|12 years ago|reply
Is BingItOn.com down for certain users? I just tried to open it here and it bounces me to Bing. I have tried it in the past, so either something has changed or they have narrowed their target market.
[+] hayksaakian|12 years ago|reply
A study produced by a party with an interest in producing a particular result? It's more likely than you think.
[+] RyanMcGreal|12 years ago|reply
When I go to bingiton.com it redirects to bing.com so I can't do the challenge.
[+] mattwallaert|12 years ago|reply
Are you within the US? Unfortunately, Bing It On is only available in the US and China at the moment.

(Note: I work for Bing.)