Why not use logistic regression to estimate likelihood of signup rather than clustering the population into 5 groups? I'd imagine the resulting predictive model would not only have more direct methods for measuring prediction error, but also provide more business insight.
Absent underlying qualitative differences, there is rarely a good reason to break a continuous distribution into discrete groups for modeling purposes or for model performance evaluation[1].
But for human consumption (and I believe this article is an example of that), it can help. It's basically an ocular Hosmer-Lemeshow; not a rigorous or even consistent approach to model performance evaluation, but often interesting to those consuming the model's output. For example, we do it here to give students a sense of what their chances have meant historically: https://www.parchment.com/c/college/college-1404-University-...
[1] See the Hosmer-Lemeshow test, now uncommonly used.
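For readers who haven't met the test in [1], here is a minimal sketch of the Hosmer-Lemeshow statistic, assuming predicted probabilities and 0/1 outcomes arrive as plain Python lists. The function name and the choice of equal-size groups are illustrative assumptions, not anything from the article.

```python
def hosmer_lemeshow(probs, outcomes, groups=10):
    """Return the Hosmer-Lemeshow chi-square statistic over equal-size groups."""
    # Sort cases by predicted probability, then cut into contiguous groups.
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)  # expected number of events
        observed = sum(y for _, y in chunk)  # observed number of events
        variance = expected * (1 - expected / size)
        if variance > 0:  # skip groups predicted as all 0 or all 1
            stat += (observed - expected) ** 2 / variance
    return stat
```

The statistic is conventionally compared against a chi-square distribution with groups - 2 degrees of freedom. The result shifts with the number of groups and where the cut points fall, which is part of why the grouped approach is not very consistent.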
Oh, I'm a big fan of logistic regression. And this model actually has some logistic regression in the mix. But I just used this breakdown into 5 groups to report the model's performance in an easy way. I could also report a Gini, but it is way more visceral to see "this is who I predicted to be my top performing group, this is how big that group was, and this is how they actually performed."
In general your work may be complicated/sophisticated. But your results need to be simple.
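A minimal sketch of that kind of five-group report next to a Gini, assuming model scores and 0/1 signup outcomes as plain Python lists. The function names and dictionary keys are hypothetical; this is not the model or the reporting code behind the article.

```python
def quintile_report(scores, outcomes, groups=5):
    """Rank by predicted score (best first) and summarize each group."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    report = []
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        report.append({
            "group": g + 1,  # 1 = predicted top performers
            "size": len(chunk),
            "mean_score": sum(s for s, _ in chunk) / len(chunk),
            "actual_rate": sum(y for _, y in chunk) / len(chunk),
        })
    return report


def gini(scores, outcomes):
    """Gini coefficient as 2 * AUC - 1, with AUC taken over all positive/negative pairs."""
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative outcome")
    # A pair counts 1 if the positive outranks the negative, 0.5 on a tie.
    wins = sum((p > q) + 0.5 * (p == q) for p in positives for q in negatives)
    return 2 * wins / (len(positives) * len(negatives)) - 1
```

The Gini collapses ranking quality into one number, while the per-group rows show how big each predicted group was and how it actually performed.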
Note that probability estimates from logistic regression are often not well calibrated. This typically doesn't matter if you're using logistic regression to create a classifier. However, if you are making cost-sensitive decisions based on probabilities predicted by logistic regression, you may run into trouble. There are a number of publications on calibrating probabilities that suggest fixes.
All that said, I doubt it makes a difference in this case.
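One of the simpler fixes from that literature is histogram binning: replace each raw probability with the event rate observed for its bin on held-out data. Below is a minimal sketch under that assumption; the function names and the midpoint fallback for empty bins are illustrative choices, and Platt scaling or isotonic regression are common alternatives.

```python
def fit_histogram_binning(probs, outcomes, bins=10):
    """Learn a per-bin observed event rate from held-out predictions."""
    totals = [0] * bins
    events = [0] * bins
    for p, y in zip(probs, outcomes):
        b = min(int(p * bins), bins - 1)  # p == 1.0 falls in the last bin
        totals[b] += 1
        events[b] += y
    # Assumption: an empty bin falls back to its midpoint (no adjustment data).
    return [events[b] / totals[b] if totals[b] else (b + 0.5) / bins
            for b in range(bins)]


def calibrate(p, bin_rates):
    """Map a raw predicted probability to the observed rate of its bin."""
    bins = len(bin_rates)
    return bin_rates[min(int(p * bins), bins - 1)]
```

The bin rates should be fit on data the model was not trained on. Otherwise the correction inherits the same optimism it is meant to remove.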
msellout|14 years ago
carbocation|14 years ago
snoble|14 years ago
noelwelsh|14 years ago