e10v_me | 1 year ago
Most probably, in your case, higher sensitivity (or power) comes at the cost of a higher type I error rate. And this might be fine. Sometimes making more changes faster matters more than avoiding false positives. In that case, you can just use a higher significance level (a larger p-value threshold) in the NHST framework.
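To make the trade-off concrete, here is a rough sketch of how the power of a two-sided z-test grows as the significance level is raised. It uses only the Python standard library, not tea-tasting. The function name `power_two_sided_z` and the true effect of 2 standard errors are my own assumptions, chosen for illustration only.

```python
from statistics import NormalDist


def power_two_sided_z(effect_in_se: float, alpha: float) -> float:
    """Power of a two-sided z-test.

    effect_in_se: true effect divided by the standard error of its estimate
                  (hypothetical input, chosen by the analyst).
    alpha: significance level, i.e. the p-value threshold and the
           type I error rate under the null hypothesis.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic lands in either rejection region.
    return norm.cdf(effect_in_se - z_crit) + norm.cdf(-effect_in_se - z_crit)


# Illustrative assumption: the true effect equals 2 standard errors.
for alpha in (0.05, 0.10, 0.20):
    power = power_two_sided_z(2.0, alpha)
    print(f"alpha={alpha:.2f}  power={power:.3f}")
```

With a zero true effect the function returns `alpha` itself, which is the type I error rate you accept in exchange for the extra power.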
You might argue that a discrete type I error does not concern you, and that what matters is the potential loss in metric value. This might be true in your setting. But in most real-life scenarios there are additional costs that the proposed solution does not take into account: increased complexity and more time spent on development, implementation, and maintenance.
I suggest reading this old post by David Robinson: https://varianceexplained.org/r/bayesian-ab-testing/
While the approach might fit your setting, I don't believe most other users of tea-tasting would benefit from it. For the moment, I must decline your kind contribution.
But you can still use tea-tasting and perform the calculations described in the whitepaper. See the guide on how to define a custom metric with a statistical test of your choice: https://tea-tasting.e10v.me/custom-metrics/