erickerr | 12 years ago

Thanks a lot for the feedback. This is a first-pass implementation, but I agree that more thought should be put into the cutoff threshold, specifically for when there are initially only 2 (or maybe 3) variations.

We considered a weighted decision approach but 1) were turned off by posts like http://visualwebsiteoptimizer.com/split-testing-blog/multi-a... and 2) wanted to keep moving parts to a minimum for V1.

Any thoughts?

dlss|12 years ago

Hey Eric,

It's a good v1 for sure, congrats!

I would ignore any non-Bayesian MAB posts out there. The formulation used by other approaches is one that considers an infinite number of repeated trials, which is basically an insane assumption. Epsilon-greedy and UCB1 aren't optimal except under that assumption.

You should check out:

  - http://www.economics.uci.edu/~ivan/asmb.874.pdf
  - https://www.youtube.com/watch?v=vz3D36VXefI
  - http://www.cs.cmu.edu/~deepay/mywww/papers/nips08-mortal.pdf (good benchmarks)

+1 that VWO's blog post is dumb :p

FWIW you are doing a weighted decision approach, it's just that you've constrained your weights to be either 0 or 1...

Cheers,

David

Homunculiheaded|12 years ago

I would suggest that you put together a quick Monte Carlo simulation for any of the models you're experimenting with to see how well they perform when you actually know the true conversion rates. There are plenty of theoretical issues you can find with any method, and the more complex your approach is, the harder it can be to work it all out with pencil and paper. Likewise, because you're dealing with probabilistic solutions, real-world results can be deceptive (for example, conversion rates may naturally fluctuate between weeks or months). I've found that testing with simulations is the best way to get a real sense of how whatever method you wish to employ will work.
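To make that concrete, here is a minimal sketch of such a simulation in Python, using only the standard library. Everything in it is an assumption for illustration: the function names (`simulate`, `thompson`, `epsilon_greedy`, `regret`), the example conversion rates, the uniform Beta(1, 1) prior, and the choice of Thompson sampling and epsilon-greedy as the two strategies to compare. It is not the implementation discussed in this thread.

```python
import random


def simulate(true_rates, choose, n_visitors=5000, seed=0):
    """Run one simulated test against known (hypothetical) conversion rates."""
    rng = random.Random(seed)
    wins = [0] * len(true_rates)
    trials = [0] * len(true_rates)
    for _ in range(n_visitors):
        arm = choose(rng, wins, trials)
        trials[arm] += 1
        # A visitor converts with the arm's true probability.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
    return wins, trials


def thompson(rng, wins, trials):
    """Thompson sampling: draw from each arm's Beta posterior, pick the max."""
    samples = [rng.betavariate(1 + w, 1 + t - w) for w, t in zip(wins, trials)]
    return max(range(len(samples)), key=samples.__getitem__)


def epsilon_greedy(rng, wins, trials, epsilon=0.1):
    """Explore a random arm with probability epsilon, else exploit the best."""
    if rng.random() < epsilon:
        return rng.randrange(len(wins))
    # Untried arms get an optimistic rate of 1.0 so each is tried at least once.
    rates = [w / t if t else 1.0 for w, t in zip(wins, trials)]
    return max(range(len(rates)), key=rates.__getitem__)


def regret(true_rates, trials):
    """Expected conversions lost by not always showing the best arm."""
    best = max(true_rates)
    return sum((best - r) * t for r, t in zip(true_rates, trials))


if __name__ == "__main__":
    rates = [0.04, 0.05, 0.06]  # made-up "true" conversion rates
    for name, strategy in [("thompson", thompson), ("epsilon-greedy", epsilon_greedy)]:
        _, trials = simulate(rates, strategy)
        print(name, trials, round(regret(rates, trials), 1))
```

A single run says little, so in practice you would repeat this over many seeds and compare the distribution of regret for each strategy, and also try rates that drift over time to mimic week-to-week fluctuation.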

andrewryno|12 years ago

You can calculate the minimum sample size needed to come up with that threshold rather easily. See: http://vuurr.com/split-testing-determine-sample-size/

Homunculiheaded|12 years ago

Except in that example the author is choosing E based on the observed difference between two means (which is exactly the unknown you're trying to determine, so it makes no sense to use it as a constant in a formula), rather than the threshold for the minimum difference you care about.

If you're going the classical statistics route, the entire point is that you need to determine your sample size before you peek at the data. In that post you would need to replace E with a threshold of difference that you care about, then calculate n before you start the test and not look at the results until you have reached n observations.
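As a rough sketch of what "calculate n before you start" can look like, the snippet below uses the standard normal-approximation formula for comparing two proportions. This is an assumption on my part and not necessarily the formula from the linked post; the function name `sample_size_per_arm`, the default significance level and power, and the example numbers are all hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, min_diff, alpha=0.05, power=0.8):
    """Visitors needed per variation to detect an absolute lift of min_diff.

    baseline: expected conversion rate of the control (a guess made up front)
    min_diff: smallest absolute difference you care about, fixed before the test
    """
    p1 = baseline
    p2 = baseline + min_diff
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / min_diff ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: 10% baseline, care about a 2-point absolute lift.
    print(sample_size_per_arm(0.10, 0.02))
```

Note that `min_diff` is chosen from what matters to the business, not from the observed data, which is the distinction being made above.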

erickerr|12 years ago

However, with the multi-armed bandit approach it does make sense to reconsider that, since it's a continuous optimization problem and the overall average conversion rate would be higher.

idunning|12 years ago

That visualwebsiteoptimizer page is mathematically confused and the simulation is fatally flawed. I wish it'd disappear from the internet. Check the comments.