When I started learning about Bayesian statistics years ago, I was fascinated by the idea that a statistical procedure might take some data in a form like "94% positive out of 85,193 reviews, 98% positive out of 20,785 reviews, 99% positive out of 840 reviews" and give you an objective estimate of who is a more reliable seller. Unfortunately, over time, it became clear that a magic bullet does not exist: in order for it to give you some estimate of who is a better seller, YOU have to provide it with a rule for how to discount positive reviews based on their count (in the form of a prior). And if you try to cheat by encoding "I don't really know how important the number of reviews is", the statistical procedure will (unsurprisingly) respond with "in that case, I don't really know how to re-rank them" :(
With that many reviews, any reasonable prior would have an infinitesimally small effect on the posterior. Assuming the Bernoulli model in the blog post, the posterior on the fraction f of good reviews is proportional to
p(f | good = 85k × 0.94 ≈ 79,900, bad = 85k × 0.06 ≈ 5,100) ∝ f^79900 · (1 − f)^5100 · f^p_good · (1 − f)^p_bad
∝ f^(79900 + p_good) · (1 − f)^(5100 + p_bad)
where p_good and p_bad are the prior pseudo-counts of good and bad reviews.
Recall that the prior parameters mean that the confidence of your prior knowledge of f is equivalent to having observed p_good positive reviews and p_bad negative reviews. So unless the prior parameters are unreasonably strong (>>1000), any choice of p_good and p_bad will have negligible effect on the posterior.
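To make that concrete, here's a quick pure-Python sketch using the counts from above (the prior strengths are my own illustrative choices) showing how little the posterior mean moves:

```python
good, bad = 79_900, 5_100  # ≈ 85k reviews at 94% positive, as above

# With a Beta(p_good, p_bad) prior, the posterior is Beta(good + p_good, bad + p_bad),
# whose mean is (good + p_good) / (good + bad + p_good + p_bad).
for p_good, p_bad in [(1, 1), (10, 10), (100, 100), (1000, 1000)]:
    mean = (good + p_good) / (good + bad + p_good + p_bad)
    print(f"prior pseudo-counts ({p_good}, {p_bad}) -> posterior mean {mean:.4f}")
```

Even 100 pseudo-reviews shift the posterior mean by only about a tenth of a percentage point; you'd need prior strengths on the order of the data itself before the ranking could change.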
The main reason Bayesian statistics is not a magic bullet is that it's up to you to interpret the posterior distribution. What does it really mean that the fraction of positive reviews from seller A is greater than the fraction for seller B with probability 0.713? What if it were 0.64? 0.93? That's for you to decide.
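For what it's worth, a number like that 0.713 would typically be computed by comparing posterior samples. Here's a minimal Monte Carlo sketch using the review counts from the top comment (the uniform prior, seed, and sample size are my own choices):

```python
import random

random.seed(0)

def posterior_samples(good, bad, n=20_000):
    # Uniform Beta(1, 1) prior -> posterior is Beta(good + 1, bad + 1)
    return [random.betavariate(good + 1, bad + 1) for _ in range(n)]

a = posterior_samples(int(85_193 * 0.94), int(85_193 * 0.06))  # seller A: 94% of 85,193
b = posterior_samples(int(20_785 * 0.98), int(20_785 * 0.02))  # seller B: 98% of 20,785

# P(f_A > f_B): fraction of paired samples where A's rate exceeds B's
p_a_better = sum(fa > fb for fa, fb in zip(a, b)) / len(a)
print(f"P(f_A > f_B) ≈ {p_a_better:.3f}")
```

With counts this lopsided the answer comes out essentially 0 (B is almost certainly better); the interesting 0.6-0.9 cases arise when the posteriors overlap, and that's exactly when the interpretation question above kicks in.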
And then there's real life: it's better to look at the number of annulled negative reviews, if that stat is published. Positive reviews are gamed as standard practice, and negative reviews are bribed away. But the number of negatives removed is a good proxy for how bad the seller is.
If anyone wants to get more into Bayesian stats, I will always recommend Statistical Rethinking by Richard McElreath. Maybe my all-time favorite textbook. He has an accompanying YouTube lecture series as well.
Is there an adaptation of Bayesian statistics that also takes into account the timeliness of the data?
e.g. a recent string of negative reviews would potentially indicate something different than the same negatives spread smoothly over time
Sure—you just use a likelihood function that doesn't assume reviews are independently and identically distributed (where each review is an independent Bernoulli trial) but rather have some dependence on other reviews that are nearby in time. For example, a (Markov) switching autoregressive model, e.g. [0].
Everything else about Bayes' rule (weighting likelihood by a prior distribution and re-normalizing to obtain a distribution over model parameters) applies just the same.
[0] https://www.statsmodels.org/dev/examples/notebooks/generated...
1) A moving window: you only calculate your updates from the values in the window (say the last n reviews). The downside of this method is that older values drop off precipitously.
2) A forgetting factor: there are many possibilities, but one simple one is the EWMA (exponentially weighted moving average). This is pretty standard, and takes the form
S_new = alpha × Y_current + (1 − alpha) × S_previous
with alpha in [0,1], where S is the smoothed estimate. Applying this recursively, older values decay geometrically by a factor of (1 − alpha) per step. This is also known as exponential smoothing in time series. The advantage of this method is that older values are simply weighed less and drop out more gradually.
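As a sketch of the forgetting-factor idea (coding reviews as 1/0 and the alpha value are my own illustrative choices):

```python
def ewma(values, alpha=0.2):
    """Exponentially weighted moving average; recent values dominate."""
    s = values[0]
    for y in values[1:]:
        s = alpha * y + (1 - alpha) * s  # each step, older history decays by (1 - alpha)
    return s

# Reviews coded as 1 (positive) / 0 (negative): a recent run of negatives
# drags the smoothed score down fast, even after a long positive streak.
print(ewma([1] * 20 + [0] * 5))   # 0.8^5 = 0.32768
print(ewma([0] * 5 + [1] * 20))   # the same negatives, long ago, barely register
```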
1. There is a vast literature on "Bayesian Change Point Detection." So you might find a point where something changed and a string of negative reviews began.
https://dataorigami.net/Probabilistic-Programming-and-Bayesi...
2. Similarly, there are a bajillion ways to weight recent data. One way is to increase the gain on a Kalman filter. That will make recent observations more important. There are Bayesian implementations of the Kalman filter.
http://stefanosnikolaidis.net/course-files/CS545/Lecture6.pd...
Bayesian analysis means you set up at least two models and evaluate the likelihood of the observed data under each. In this case, rather than assume a fixed probability of good reviews, you might model a sudden switch (for example, if an account was taken over or sold) or a gradual decline (for example, if a manufacturer gradually lowers quality). Then you proceed as usual (i.e., find likelihoods and assign priors for each hypothesis/model, and use Bayes' rule).
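A toy version of that comparison, with uniform priors and a switch point assumed known (the review sequence and split are made up for illustration):

```python
from math import lgamma

def log_marginal(good, bad):
    # Beta-Bernoulli marginal likelihood under a uniform Beta(1, 1) prior:
    # integral of p^good * (1 - p)^bad dp = good! * bad! / (good + bad + 1)!
    return lgamma(good + 1) + lgamma(bad + 1) - lgamma(good + bad + 2)

# 30 mostly-positive reviews, then 10 mostly-negative ones (1 = positive)
early = [1] * 28 + [0] * 2
late = [1] * 2 + [0] * 8
combined = early + late

# M1: one fixed positive-rate for the whole history
m1 = log_marginal(sum(combined), len(combined) - sum(combined))
# M2: an abrupt switch at the known split point, independent rates before/after
m2 = (log_marginal(sum(early), len(early) - sum(early))
      + log_marginal(sum(late), len(late) - sum(late)))

# With equal prior odds on M1 and M2, the log Bayes factor decides between them
print(f"log Bayes factor (switch vs. fixed): {m2 - m1:.2f}")
```

Here the switch model wins decisively; in practice you'd also put a prior over the unknown switch time and sum it out.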
I think you would change the prior in that situation under the assumption that it indicates some sort of underlying shift. I'm not sure that there is any theory guiding how you do that though.
Even more relevant, Evan Miller has an article on this exact topic (using Bayesian statistics to calculate ratings) that goes into further detail than the original article:
https://www.evanmiller.org/bayesian-average-ratings.html
I like Evan’s 2009 post a lot, but I like John’s analysis here even better. John seems to make fewer assumptions; in particular, Evan assumes a 95% confidence bound.
It's a poorly written article that's almost completely useless for people actually interested in this type of problem. The Beta distribution models the uncertainty of a seller's positive ratio given their total count of positive reviews (a) and negative reviews (b).
hmm... I might start discounting sellers with too many reviews. Not sure where the cutoff would be and some sellers might actually be high volume and get lots of reviews, but a huge number of reviews makes me think they are fake.
I think Amazon used to use the lower bound of a CI to sort? Or it used to be an option, then some sellers sued or threatened to, based on the argument that it discriminated against smaller sellers?
Am I the only one who finds it intuitive and simple? Every article seems to give it a whole chapter of explanation, and vaunt it as the most groundbreaking concept ever. First I thought that was just Rationalists sniffing their farts, but I see it in a lot of places.
Essentially this. There are many more variables involved in that specific example, such as time since last negative feedback, total age of account, feedback over the last 90 days, etc.