When I started learning about Bayesian statistics years ago, I was fascinated by the idea that a statistical procedure might take some data in a form like "94% positive out of 85,193 reviews, 98% positive out of 20,785 reviews, 99% positive out of 840 reviews" and give you an objective estimate of who is a more reliable seller. Unfortunately, over time, it became clear that a magic bullet does not exist: in order for it to give you some estimate of who is a better seller, YOU have to provide it with a rule for how to discount positive reviews based on their count (in the form of a prior). And if you try to cheat by encoding "I don't really know how important the number of reviews is", the statistical procedure will (unsurprisingly) respond with "in that case, I don't really know how to re-rank them" :(
With that many reviews, any reasonable prior would have an infinitesimally small effect on the posterior. Assuming the Bernoulli model in the blog post, the posterior on the fraction f of good reviews is proportional to
p(f | good = 85k × 0.94 ≈ 79,900, bad = 85k × 0.06 ≈ 5,100) ∝ f^79900 · (1 − f)^5100 · f^p_good · (1 − f)^p_bad
∝ f^(79900 + p_good) · (1 − f)^(5100 + p_bad)
where p_good and p_bad are the prior pseudo-counts of good and bad reviews.
Recall that the prior parameters mean that the confidence of your prior knowledge of f is equivalent to having observed p_good positive reviews and p_bad negative reviews. So unless the prior parameters are unreasonably strong (>>1000), any choice of p_good and p_bad will have negligible effect on the posterior.
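To make that concrete, here's a quick pure-Python sketch using the counts from above (the prior strengths are my own illustrative choices) showing how little the posterior mean moves:

```python
good, bad = 79_900, 5_100  # ≈ 85k reviews at 94% positive, as above

# With a Beta(p_good, p_bad) prior, the posterior is Beta(good + p_good, bad + p_bad),
# whose mean is (good + p_good) / (good + bad + p_good + p_bad).
for p_good, p_bad in [(1, 1), (10, 10), (100, 100), (1000, 1000)]:
    mean = (good + p_good) / (good + bad + p_good + p_bad)
    print(f"prior pseudo-counts ({p_good}, {p_bad}) -> posterior mean {mean:.4f}")
```

Even 100 pseudo-reviews shift the posterior mean by only about a tenth of a percentage point; you'd need prior strengths on the order of the data itself before the ranking could change.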
The main reason Bayesian statistics is not a magic bullet is that it's up to you to interpret the posterior distribution. What does it really mean that the fraction of positive reviews from seller A is greater than the fraction for seller B with probability 0.713? What if it were 0.64? 0.93? That's for you to decide.
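For what it's worth, a number like that 0.713 would typically be computed by comparing posterior samples. Here's a minimal Monte Carlo sketch using the review counts from the top comment (the uniform prior, seed, and sample size are my own choices):

```python
import random

random.seed(0)

def posterior_samples(good, bad, n=20_000):
    # Uniform Beta(1, 1) prior -> posterior is Beta(good + 1, bad + 1)
    return [random.betavariate(good + 1, bad + 1) for _ in range(n)]

a = posterior_samples(int(85_193 * 0.94), int(85_193 * 0.06))  # seller A: 94% of 85,193
b = posterior_samples(int(20_785 * 0.98), int(20_785 * 0.02))  # seller B: 98% of 20,785

# P(f_A > f_B): fraction of paired samples where A's rate exceeds B's
p_a_better = sum(fa > fb for fa, fb in zip(a, b)) / len(a)
print(f"P(f_A > f_B) ≈ {p_a_better:.3f}")
```

With counts this lopsided the answer comes out essentially 0 (B is almost certainly better); the interesting 0.6-0.9 cases arise when the posteriors overlap, and that's exactly when the interpretation question above kicks in.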
And then there's real life: it's better to look at the number of annulled negative reviews, if that stat is published. Positive reviews are gamed as standard practice, and negative reviews are bribed away. But the number of negatives removed is a good proxy for how bad the seller is.
If anyone wants to get more into Bayesian stats, I will always recommend Statistical Rethinking by Richard McElreath. Maybe my all-time favorite textbook. He has an accompanying YouTube lecture series as well.
Is there an adaptation of Bayesian statistics that also takes into account the timeliness of the data?
e.g. a recent string of negative reviews would potentially indicate something different than the same negatives spread smoothly over time
Sure—you just use a likelihood function that doesn't assume reviews are independently and identically distributed (where each review is an independent Bernoulli trial) but rather have some dependence on other reviews that are nearby in time. For example, a (Markov) switching autoregressive model, e.g. [0].
Everything else about Bayes' rule (weighting likelihood by a prior distribution and re-normalizing to obtain a distribution over model parameters) applies just the same.
[0] https://www.statsmodels.org/dev/examples/notebooks/generated...
1) A moving window: you only calculate your updates from the values in the window (say the last n reviews). The downside of this method is that older values drop off precipitously.
2) A forgetting factor: there are many possibilities, but one simple one is the EWMA (exponentially weighted moving average). This is pretty standard, and takes the form
S_new = alpha × Y_current + (1 − alpha) × S_previous
with alpha in [0,1], where S is the smoothed estimate. Applying this recursively, older values decay geometrically by a factor of (1 − alpha) per step. This is also known as exponential smoothing in time series. The advantage of this method is that older values are simply weighed less and drop out more gradually.
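As a sketch of the forgetting-factor idea (coding reviews as 1/0 and the alpha value are my own illustrative choices):

```python
def ewma(values, alpha=0.2):
    """Exponentially weighted moving average; recent values dominate."""
    s = values[0]
    for y in values[1:]:
        s = alpha * y + (1 - alpha) * s  # each step, older history decays by (1 - alpha)
    return s

# Reviews coded as 1 (positive) / 0 (negative): a recent run of negatives
# drags the smoothed score down fast, even after a long positive streak.
print(ewma([1] * 20 + [0] * 5))   # 0.8^5 = 0.32768
print(ewma([0] * 5 + [1] * 20))   # the same negatives, long ago, barely register
```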
1. There is a vast literature on "Bayesian Change Point Detection." So you might find a point where something changed and a string of negative reviews began.
https://dataorigami.net/Probabilistic-Programming-and-Bayesi...
2. Similarly, there are a bajillion ways to weight recent data. One way is to increase the gain on a Kalman filter. That will make recent observations more important. There are Bayesian implementations of the Kalman filter.
http://stefanosnikolaidis.net/course-files/CS545/Lecture6.pd...
Bayesian analysis means you set up at least two models and evaluate the likelihood of the observed data under each. In this case, rather than assume a fixed probability of good reviews, you might model a sudden switch (for example, if an account was taken over or sold) or a gradual decline (for example, if a manufacturer gradually lowers quality). Then you proceed as usual (i.e., find likelihoods and assign priors for each hypothesis/model, and use Bayes' rule).
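A toy version of that comparison, with uniform priors and a switch point assumed known (the review sequence and split are made up for illustration):

```python
from math import lgamma

def log_marginal(good, bad):
    # Beta-Bernoulli marginal likelihood under a uniform Beta(1, 1) prior:
    # integral of p^good * (1 - p)^bad dp = good! * bad! / (good + bad + 1)!
    return lgamma(good + 1) + lgamma(bad + 1) - lgamma(good + bad + 2)

# 30 mostly-positive reviews, then 10 mostly-negative ones (1 = positive)
early = [1] * 28 + [0] * 2
late = [1] * 2 + [0] * 8
combined = early + late

# M1: one fixed positive-rate for the whole history
m1 = log_marginal(sum(combined), len(combined) - sum(combined))
# M2: an abrupt switch at the known split point, independent rates before/after
m2 = (log_marginal(sum(early), len(early) - sum(early))
      + log_marginal(sum(late), len(late) - sum(late)))

# With equal prior odds on M1 and M2, the log Bayes factor decides between them
print(f"log Bayes factor (switch vs. fixed): {m2 - m1:.2f}")
```

Here the switch model wins decisively; in practice you'd also put a prior over the unknown switch time and sum it out.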
I think you would change the prior in that situation under the assumption that it indicates some sort of underlying shift. I'm not sure that there is any theory guiding how you do that though.
Even more relevant, Evan Miller has an article on this exact topic (using Bayesian statistics to calculate ratings) that goes into further detail than the original article:
https://www.evanmiller.org/bayesian-average-ratings.html
I like Evan’s 2009 post a lot, but I like John’s analysis here even better. John seems to make fewer assumptions; in particular, Evan assumes a 95% confidence bound.
It's a poorly written article that's almost completely useless for people actually interested in this type of problem. The Beta distribution models the uncertainty of a seller's positive ratio given their total count of positive reviews (a) and negative reviews (b).
hmm... I might start discounting sellers with too many reviews. Not sure where the cutoff would be and some sellers might actually be high volume and get lots of reviews, but a huge number of reviews makes me think they are fake.
I think Amazon used to use the lower bound of a CI to sort? Or it used to be an option, then some sellers sued or threatened to, based on the argument that it discriminated against smaller sellers?
Am I the only one who finds it intuitive and simple? Every article seems to give it a whole chapter of explanation, and vaunt it as the most groundbreaking concept ever. First I thought that was just Rationalists sniffing their farts, but I see it in a lot of places.
Essentially this. There are many more variables involved in that specific example, such as time since last negative feedback, total age of account, feedback over the last 90 days, etc.