blister | 2 years ago
But they're sitting behind Cloudflare and aggressively blocking attempts to fetch data programmatically, which is a huge problem for us: we have 6000+ players' worth of data to fetch multiple times every 3 months.
So... I built a Chrome extension to grab the data at a rate that usually stays under their detection threshold. Basically, I created a distributed scraper and passed it out to as many people in the league as I could.
For big jobs, when we want to do giant batches, it's a simple matter of running the pulls and, when we start getting 429 errors (the rate-limit status code they return), switching to a new IP on the VPN.
The only way they can block us now is if they stop having a website.
Give one of the commercial VPN providers a try. They're usually pretty cheap and have tons of IPs all over the place. Adding a "VPN Disconnect / Reconnect" step to the process adds only about 10 seconds to a request every so often.
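The "pull until you hit a 429, then switch and carry on" loop described above can be sketched roughly as follows. This is only an illustration, not the actual extension code: `fetch_batch`, `fetch`, and `on_rate_limited` are hypothetical names, and the caller is assumed to supply both the function that performs the request and whatever should happen on a 429 (for example, a VPN reconnect step).

```python
import time
from typing import Callable, Dict, Iterable, Tuple


def fetch_batch(
    ids: Iterable[int],
    fetch: Callable[[int], Tuple[int, str]],   # hypothetical: returns (status_code, body)
    on_rate_limited: Callable[[], None],       # hypothetical hook, e.g. VPN reconnect
    delay_s: float = 1.0,                      # assumed pause between items
    max_retries: int = 3,                      # assumed retry cap per item
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[int, str]:
    """Fetch every id, calling on_rate_limited() and retrying on HTTP 429."""
    results: Dict[int, str] = {}
    for item_id in ids:
        for _ in range(max_retries + 1):
            status, body = fetch(item_id)
            if status == 429:
                # Rate limited: run the caller's hook, then retry this item.
                on_rate_limited()
                continue
            if status == 200:
                results[item_id] = body
            # Any other status is skipped rather than retried.
            break
        # Throttle between items to keep the request rate low.
        sleep(delay_s)
    return results
```

Passing `fetch` and `sleep` in as parameters keeps the loop independent of any particular HTTP client and makes it easy to try out with a fake fetcher before pointing it at a real site.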
kbenson|2 years ago
sublinear|2 years ago
You'd have to disable CSP manually in your browser config to make it work, but that leaves you with an insecure browser and a lot of friction for casual users. Not sure if you can tie about:config options to a user profile for this use case. Distributing a working extension/script is getting harder all the time.
throwawayadvsec|2 years ago
The best of the best are rotating 4G proxies.
The fingerprint needs to change too.