splitforce's comments
splitforce | 11 years ago | on: 13,402 unhappy users and no lessons learned–so we built this
splitforce | 11 years ago | on: Show HN: NomadList – The best cities to live and work remotely in
splitforce | 11 years ago | on: A Roadmap To Becoming An A/B Testing Expert
We’ve found that a successful approach to A/B testing is really dependent on the type of company you’re operating and product that you’re offering. Small cosmetic changes to the UI or copy often result in equally small changes to click-thru or conversion rates, and so these A/B tests require relatively greater levels of statistical power in order to achieve significance.
For mega-traffic companies like Google or Amazon, these kinds of tests are worth the cost of testing because a sub-1% lift still contributes substantially to their bottom line.
But for everyone else, ‘shallow’ A/B tests of a button color or call to action will often yield inconclusive results. Here’s an article from the founder of GrooveHQ detailing such an experience: http://www.groovehq.com/blog/failed-ab-tests.
If you’re running a small or medium business – or even a larger one that does not have the scalable testing practices of a tech giant like Amazon in place – testing deeper changes to the product, UI layouts or entire UX workflows are what move the needle. This is what we’re now calling ‘empathic A/B testing’ – where tests are designed with empathy for users.
Ask the questions: What changes can I make to my product or website that would motivate my users to take the actions I want them to take? What are they looking for? What do they care about? And why? More often than not, I think you’ll find that the answer is not ‘a different button color’.
In the end, A/B testing is really a very unsophisticated way of answering the question ‘What works better?’ because you are sending a fixed proportion of your users to a suboptimal variant for the duration of the test. We’ve done a lot of research into better solutions to this problem, and have found that a dynamic approach using a learning algorithm almost always leads to faster results and higher average conversion rates. You can read more about that here: http://splitforce.com/resources/auto-optimization/
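To make the power point above concrete, here's a rough back-of-the-envelope sketch of how many users per arm a fixed-split A/B test needs, using the standard normal approximation for a two-proportion test (two-sided alpha = 0.05, power = 0.80). The 5% baseline conversion rate and the lift sizes are hypothetical numbers chosen for illustration only.

```python
# Rough per-arm sample size for a two-proportion z-test
# (normal approximation; two-sided alpha = 0.05, power = 0.80).
# Baseline rate and lifts below are hypothetical, for illustration.

Z_ALPHA = 1.96   # critical value for two-sided 95% confidence
Z_BETA = 0.8416  # critical value for 80% power

def samples_per_arm(p_base, p_variant):
    """Approximate users needed in each arm to detect p_base -> p_variant."""
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    effect = p_variant - p_base
    return (Z_ALPHA + Z_BETA) ** 2 * variance / effect ** 2

base = 0.05                                  # 5% baseline conversion
small = samples_per_arm(base, base * 1.01)   # a 1% relative lift
big = samples_per_arm(base, base * 1.20)     # a 20% relative lift
```

With these numbers, detecting the 1% relative lift takes on the order of millions of users per arm, while the 20% lift needs only a few thousand — which is why 'shallow' tests are effectively reserved for the mega-traffic companies.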
splitforce | 11 years ago | on: 6 A/B Tests That Did Absolutely Nothing for Us
What I’ve found is that a successful approach to A/B testing is really dependent on the type of company you’re operating and product that you’re offering. Small cosmetic changes to the UI or copy often result in equally small changes to click-thru or conversion rates, and so these A/B tests require relatively greater levels of statistical power in order to achieve significance.
For mega-traffic companies like Google or Amazon, these kinds of tests are worth it because a sub-1% lift still contributes substantially to their bottom line. They also have the traffic numbers to properly power tests of smaller changes in a reasonable amount of time.
But for everyone else, ‘shallow’ A/B tests of a button color or call to action will often yield inconclusive results because they don’t have the traffic numbers. For these types of companies, we’ve seen that deeper changes to the product, UI layouts or entire UX workflows are what move the needle. Designing these tests requires more thought and development work up-front – but at least you’ll be making substantial improvements in an experimentally rigorous way instead of just spinning your wheels with some one-off design tweaks.
To avoid these kinds of disappointing tests, another thing to consider is setting a minimum detectable effect. The idea here is that detecting a small improvement requires more statistical power (i.e.: more test subjects) than detecting a large one, and at some point, to justify continuing the test, you’ll want to see some minimum amount of lift. Once you can say with statistical confidence that this desired lift isn’t achievable, you can stop the test early and move on to the next.
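The stopping rule described above can be sketched as a simple futility check: once the upper confidence bound on the observed lift falls below your minimum detectable effect, the lift you care about is no longer plausibly achievable and the test can be stopped. This is a minimal sketch using a normal-approximation confidence interval; the function name and the example numbers are hypothetical.

```python
import math

def futility_check(conv_a, n_a, conv_b, n_b, min_lift, z=1.96):
    """Return True if the test can be stopped for futility: the upper
    95% confidence bound on the absolute lift (B minus A) is already
    below the minimum lift we decided we care about."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    upper_bound = (p_b - p_a) + z * se
    return upper_bound < min_lift

# Hypothetical example: 5% baseline, we need at least +1 point of lift.
# Early on, the interval is wide, so we keep going:
keep_going = not futility_check(50, 1000, 52, 1000, min_lift=0.01)
# After 20,000 users per arm with nearly identical rates, we can stop:
stop_now = futility_check(1000, 20000, 1010, 20000, min_lift=0.01)
```

Note this is a one-shot check; if you run it repeatedly as data arrives, you'd want to widen the bound to account for peeking.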
Most importantly, you should be designing these tests with empathy for your audience. Ask the questions: What changes can I make to my product or website that would motivate my users to take the actions I want them to take? What are they looking for? What do they care about? And why? More often than not, I think you’ll find that the answer is not ‘a different button color’ :-D
In the end, A/B testing is really a very unsophisticated way of answering the question ‘What works better?’ We’ve done a lot of research into better solutions to this problem, and have found that an automated approach using a learning algorithm almost always leads to faster results and higher average conversion rates. You can read more about that here: http://splitforce.com/resources/auto-optimization/
splitforce | 11 years ago | on: Wingify Launches A New A/B Testing Platform For People Without Coding Experience
One thing that I've noticed is that traditional A/B testing is a pretty sub-optimal way of answering the question: 'What works better, A or B?'
In the most basic example of an A/B test, you have a variation A and a variation B each shown to 50% of your user base. By definition, this approach will be sending half of your users to the worse-performing version for the entire duration of the test!
The automated approach is based on a bandit algorithm that dynamically updates the proportion of users shown a given variation. With each new piece of data that you collect on the test variations' conversion rates and confidence, the algorithm adjusts the percentages automatically so that better performing variations are promoted and worse performers are pruned away.
This leads to:
1) faster results, because you’re directing test resources (i.e.: users and their data) to validate what you actually care about (i.e.: confidence in the best variation’s performance)
2) a higher average conversion rate during the test itself, because relatively more users are being sent to the better performing variation automatically, and
3) less time and effort required to actively manage your experiments.
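One common way to implement the dynamic allocation described above is Thompson sampling: draw one sample from each variation's Beta posterior over its conversion rate, and show the user whichever variation's draw is highest. Better performers win the draw more often and so receive more traffic. This is a minimal sketch, not Splitforce's actual algorithm; the tallies are hypothetical.

```python
import random

def thompson_assign(stats):
    """Pick a variation by Thompson sampling: sample each variation's
    Beta(1 + conversions, 1 + non-conversions) posterior and return
    the variation with the highest draw. Strong performers get picked
    more often; weak ones are pruned away automatically."""
    best, best_draw = None, -1.0
    for name, (conversions, impressions) in stats.items():
        draw = random.betavariate(1 + conversions, 1 + impressions - conversions)
        if draw > best_draw:
            best, best_draw = name, draw
    return best

# Hypothetical running tallies: (conversions, impressions)
stats = {"A": (30, 1000), "B": (60, 1000)}
picks = [thompson_assign(stats) for _ in range(2000)]
# By now most traffic flows to B, the stronger variation.
```

Note that the allocation percentages are never set explicitly — they emerge from the posterior uncertainty, so an under-sampled variation still gets occasional traffic until the data rules it out.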
Though the math behind this approach is slightly more complex than a traditional A/B test, it’s a no-brainer for anyone who is really interested in making data-driven decisions, because the results it produces are that much better.
For anyone interested, here’s a post we put together on how it works: http://splitforce.com/resources/auto-optimization/
splitforce | 11 years ago | on: Bandit Algorithms for Recommendation Systems
Have you thought about how to deal with changes in environmental factors over relatively longer periods of time? For example, seasonality or changes in popular taste.
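One standard answer to the non-stationarity question above is to discount old evidence, so the bandit can 'forget' stale data and re-adapt when seasonality or tastes shift. Here's a minimal sketch of a discounted Beta posterior for one arm of a Thompson-sampling bandit; the decay factor is a hypothetical choice, not a recommendation.

```python
import random

DISCOUNT = 0.999  # hypothetical decay; closer to 1 means longer memory

class DiscountedArm:
    """Beta posterior whose evidence decays geometrically over time,
    capping the effective sample size at roughly 1 / (1 - DISCOUNT),
    so recent observations dominate and the arm can track drift."""

    def __init__(self):
        self.successes = 0.0
        self.failures = 0.0

    def update(self, converted):
        # Shrink all old evidence before adding the new observation.
        self.successes *= DISCOUNT
        self.failures *= DISCOUNT
        if converted:
            self.successes += 1
        else:
            self.failures += 1

    def sample(self):
        return random.betavariate(1 + self.successes, 1 + self.failures)
```

With DISCOUNT = 0.999 the arm effectively remembers about the last 1,000 observations, so a seasonal swing in conversion rates shows up in the posterior within roughly that window instead of being drowned out by months of old data.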
splitforce | 11 years ago | on: Show HN: Bandit Algorithm for iOS, Android and Unity
splitforce | 12 years ago | on: On Game Development
A lot of people forget how important two particular industries have been in terms of pushing the envelope when it comes to computer processing and Internet bandwidth technology: Pornography and gaming.
While porn and games are certainly among the more hedonistic (and less virtuous) of products, the fact that people care so much about them is in large part why we have more powerful CPUs and GPUs, for example, or faster connection speeds. (I guess you can thank U.S. military investments for some of this stuff as well.)
My point is, gaming is important. Like, really important. It might not have the direct impact on African schoolchildren that Kiva or Doctors Without Borders does, but one could argue that those organizations would not be able to leverage the technology that they rely on so much if others hadn't paved the way. Keep doin' the good work, son! ;-)
One way we've found to limit this risk is by supplementing hypotheses developed through qualitative research with a quantitative approach.
So, ask users what they like/dislike about an experience to formulate an idea of what changes to your app may better the experience, BUT make sure to then TEST those changes using a rigorous method (i.e.: experimentation or A/B testing) to validate that the feedback you're hearing is not just noise...