Thanks for this writeup. Whenever people complain about some service removing or making it harder to try out a free tier, I think they don't realize the amount of abuse that needs to be managed by the service providers.
"Why do things suck?" Because parasites ruined it for the rest of us.
> We have to accept a certain amount of abuse. It is a far better use of our time to use it improving Geocodio for legitimate users rather than trying to squash everyone who might create a handful of accounts
Reminds me of Patrick McKenzie's "The optimal amount of fraud is non-zero" [1] (wrt banking systems)
Also, your abuse-scoring system sounds a bit like Bayesian spam filtering, where you have a bunch of signals (Disposable Email, IP from Risky Source, Rate of signup...) that you correlate, no?
Co-Founder of Geocodio here who designed the scoring system :)
I suppose you could call it inspired by Bayesian inference since we're using multiple pieces of independent evidence to calculate a score, though that makes it sound a bit fancier than it is and we aren't using Bayes' theorem. But it's possible I had that in the back of my head from a game theory class I took long ago.
But for the fun of it, let's model it that way:
Probability (Spam | disposable email domain, IP address, etc... ) = [probability(disposable email domain, IP address, etc... | spam) x prior probability(spam rate)] / probability(disposable email domain, IP address, etc...)
Or something like that.
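For anyone who wants to play with the formula above, here is a minimal sketch of how it could be computed if the signals are treated as independent (the "naive" Bayes assumption). This is not how Geocodio's scoring works; the function name and every number below are made up for illustration.

```python
from math import prod

def posterior_spam(prior, likelihoods_spam, likelihoods_legit):
    """Naive Bayes estimate of P(spam | signals), assuming independent signals.

    prior             -- P(spam), the base rate of spam signups
    likelihoods_spam  -- P(signal | spam) for each observed signal
    likelihoods_legit -- P(signal | legitimate) for each observed signal
    """
    # Numerator of Bayes' theorem: P(signals | spam) * P(spam)
    spam_joint = prior * prod(likelihoods_spam)
    # Same quantity for the legitimate case
    legit_joint = (1 - prior) * prod(likelihoods_legit)
    # The denominator P(signals) is the sum of the two joint probabilities
    return spam_joint / (spam_joint + legit_joint)

# Hypothetical numbers: 5% of signups are spam; a disposable email domain shows
# up in 60% of spam signups and 2% of legitimate ones; a risky IP shows up in
# 50% of spam signups and 5% of legitimate ones.
p = posterior_spam(0.05, [0.60, 0.50], [0.02, 0.05])
print(f"P(spam | both signals) = {p:.2f}")
```

With these invented inputs the posterior comes out at roughly 0.94, which shows how two individually weak signals can combine into a strong one even when the base rate is low.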
Also — it's a delight to have one of Patrick's articles mentioned in connection with this!
> "The optimal amount of fraud is non-zero" [1] (wrt banking systems)
It's a bit like how each 9 of uptime is an order of magnitude (ish) more expensive to achieve, and most use cases don't care if it's 99.999% or 99.9999%.
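To put rough numbers on that, here is a small sketch of the yearly downtime budget implied by each additional nine of availability. It only does the arithmetic; the cost side of the argument is not modeled, and the function name is just for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years

def allowed_downtime_minutes(nines):
    """Yearly downtime budget for an availability with `nines` nines (3 -> 99.9%)."""
    unavailability = 10 ** -nines  # e.g. 3 nines -> 0.001
    return MINUTES_PER_YEAR * unavailability

for n in range(2, 7):
    print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes/year")
```

Going from five nines to six shrinks the budget from about 5.3 minutes a year to about half a minute, a difference most users would never notice.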
Free tier and free trial abuse is a huge problem, but also a huge opportunity.
We have seen customers where free tier abusers created 80k+ accounts in a day and cost millions of dollars. We have also seen businesses like Oddsjam add significant revenue by prompting abusers to pay.
The psychology of abuse is also quite interesting: even what appear to be serious abusers (think fake credit cards, new email accounts etc.) will refuse a discount and pay full price if they feel they 'got caught'.
I’d love to hear more about the idea that somebody making a fraudulent signup with a stolen credit card is potentially going to pay full price if they “get caught”
Great writeup. Simple heuristics very often work wonders. Fraudsters are out there, trying to poke holes in your shield.
Some time ago we were running a mobile service provider and had some issues with fraudulent postpaid subscribers. However, the cost of using background-checking services was substantial. We solved it quite effectively by turning the background checks on whenever the level of fraud went over a certain threshold, which made the fraudsters go away for some weeks. We kept up this on-and-off pattern for a very long time with great success, as it lowered the friction to sign up significantly when the checks were turned off…
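The on-and-off pattern described above can be sketched as a simple threshold switch. The comment only mentions a threshold for turning checks on; the separate, lower threshold for turning them off again is an assumption added here so the switch does not flap, and all names and numbers are hypothetical.

```python
def update_checks_enabled(enabled, fraud_rate, on_threshold=0.05, off_threshold=0.01):
    """Decide whether paid background checks should be active for the next period.

    on_threshold / off_threshold are hypothetical fraud rates (fraction of signups).
    """
    if not enabled and fraud_rate > on_threshold:
        return True   # fraud is spiking: pay for background checks
    if enabled and fraud_rate < off_threshold:
        return False  # fraud has died down: remove the signup friction again
    return enabled    # otherwise keep the current state

# Made-up weekly fraud rates
weekly_fraud_rates = [0.01, 0.06, 0.03, 0.005, 0.02]
enabled = False
states = []
for rate in weekly_fraud_rates:
    enabled = update_checks_enabled(enabled, rate)
    states.append(enabled)
print(states)
```

In this made-up run the checks switch on in week two, stay on through week three, and switch off again in week four.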
When sites use an AI generated image like this and don't bother to spend 10 seconds looking to make sure it looks okay (UIGN SIGN UPP? AISK ANACIS?) it makes me question whether that same level of care was put into writing the article.
Isn't it nice to have just a little bit of an illustration instead of just text? Obviously an AI-generated image is going to spit out some nonsense text as part of the graphic, but we're not really trying to hide that it's AI generated.
I get why they don't want to share their detection mechanics for potential fraudulent signups, but that is a very interesting topic to learn and discuss.
Apple's mail privacy protection creates disposable addresses with host icloud.com. It's not as hassle-free and can't be automated, but this could definitely be used to create a lot of free accounts. But I don't see them banning this domain, I guess?
We are mainly B2B so we don't really see signups using Apple's email relay. That said, it could be something we might have to consider blocking in the future if it becomes a problem.
For paying customers, it probably doesn't make a lot of sense to use an anonymous email address, since we ask for your name and billing address either way (have to stay compliant with sales taxes!)
The other versions of recaptcha show the annoying captchas, but v3 just monitors various signals and gives a score indicating the likelihood that it's a bot.
We use this to reduce spam in some parts of our app, and I think there's an opportunity to make a better version, but it'd be tough for it to be better enough that people would pay for it since Google's solution is decent and free.
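As a rough illustration of acting on a v3 score server-side: the sketch below assumes you have already sent the client token to Google's siteverify endpoint and parsed the JSON reply into a dict. The `success`, `score`, and `action` fields are part of that reply; the function name, the 0.5 cutoff, and the sample replies are assumptions for the example. Note that in v3 a score near 1.0 means "likely human" and a score near 0.0 means "likely bot".

```python
def is_probably_human(verify_response, expected_action, min_score=0.5):
    """Interpret an already-parsed reCAPTCHA v3 siteverify reply.

    min_score is a hypothetical cutoff; tune it against your own traffic.
    """
    if not verify_response.get("success", False):
        return False  # token was invalid, expired, or already used
    if verify_response.get("action") != expected_action:
        return False  # token was issued for a different page action
    return verify_response.get("score", 0.0) >= min_score

# Hand-written sample replies, not real API output
likely_human = {"success": True, "score": 0.9, "action": "signup"}
likely_bot = {"success": True, "score": 0.1, "action": "signup"}
print(is_probably_human(likely_human, "signup"))
print(is_probably_human(likely_bot, "signup"))
```

Rather than hard-blocking on a low score, the result could also feed into a wider abuse score as one signal among several.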
Very cool, I wasn't expecting to find this so interesting. Yesterday, for the first time, I thought about the "abuse the free tier" actors. I was trying to use a batching job service which limited free-tier batch sizes to 5, which was so low that it defeated the point of using the automated job in the first place. I think the little info box explained that they keep the limit low to prevent abuse, and I started thinking about other ways they could prevent that abuse. Your post was very topical. Thanks for sharing!
Neither is a viable option, otherwise all the big players would've done this a long time ago. Nothing is stopping you from creating a throwaway account on Gmail while someone using a custom domain might be your new B2B lead. There's no realistic way to tell which it is simply from the domain.
AceJohnny2|1 year ago
[1] https://www.bitsaboutmoney.com/archive/optimal-amount-of-fra...
mjwhansen|1 year ago
dehrmann|1 year ago
caydenm|1 year ago
akerl_|1 year ago
oger|1 year ago
benabbott|1 year ago
thecodemonkey|1 year ago
prteja11|1 year ago
thecodemonkey|1 year ago
manmal|1 year ago
thecodemonkey|1 year ago
polishdude20|1 year ago
kylecazar|1 year ago
https://www.geocod.io/code-and-coordinates/2025-01-13-how-ge...
gwbas1c|1 year ago
I.e., send the email, IP, browser agent, and perhaps a few other data points to a service, and then get a "fraudulent" rating?
the_bear|1 year ago
miki123211|1 year ago
hn_user82179|1 year ago
thecodemonkey|1 year ago
EGreg|1 year ago
or perhaps a really big whitelist of good ones? that would be extremely helpful!
Etheryte|1 year ago
thecodemonkey|1 year ago
I would probably not recommend implementing a whitelist for blocking purposes. But perhaps domains on a whitelist could get a slight scoring bump.
[1] https://github.com/disposable-email-domains/disposable-email...
[2] https://github.com/disposable/disposable
[3] https://github.com/unkn0w/disposable-email-domain-list
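The "slight scoring bump" idea above could look something like the sketch below: a disposable-domain list raises the score, while an allowlist only lowers it a little and never overrides the other signals. The domain sets, weights, and function name are all hypothetical and are not Geocodio's actual scoring.

```python
# Hypothetical lists; in practice these might be loaded from public domain lists
DISPOSABLE_DOMAINS = {"mailinator.com"}
TRUSTED_DOMAINS = {"example-university.edu"}

def signup_risk_score(email, ip_is_risky):
    """Additive risk score for a signup; higher means more suspicious."""
    domain = email.rsplit("@", 1)[-1].lower()
    score = 0
    if domain in DISPOSABLE_DOMAINS:
        score += 50  # hypothetical weight for a known disposable domain
    if ip_is_risky:
        score += 30  # hypothetical weight for an IP from a risky source
    if domain in TRUSTED_DOMAINS:
        score -= 10  # slight bump only: an allowlisted domain is not a free pass
    return max(score, 0)

print(signup_risk_score("someone@mailinator.com", ip_is_risky=False))
print(signup_risk_score("someone@example-university.edu", ip_is_risky=True))
```

Because the allowlist only subtracts a small amount, a signup from an allowlisted domain with a risky IP still ends up with a non-zero score.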
pigeons|1 year ago
AutistiCoder|1 year ago
thecodemonkey|1 year ago