top | item 31856850

mrgalaxy | 3 years ago

I think I slightly misunderstood the point of this site; people probably do want fast updates. But I still think the price should be brought down. You could probably save costs by pooling the crawling: e.g., if multiple subscribers request the same page, do that crawl once for all of them. You may already be doing that.
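
To make the pooling idea concrete, here is a minimal sketch, not the site's actual code. It assumes subscriptions arrive as (subscriber, URL) pairs and that the caller supplies a `fetch` function; `pooled_crawl` and both of those names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def pooled_crawl(
    subscriptions: Iterable[Tuple[str, str]],
    fetch: Callable[[str], str],
) -> Dict[str, Dict[str, str]]:
    """Fetch each distinct URL once and share the result with every subscriber.

    `subscriptions` is an iterable of (subscriber, url) pairs and `fetch`
    is whatever function downloads a page; both are assumptions here.
    """
    # Group subscribers by the URL they asked for.
    subscribers_by_url: Dict[str, List[str]] = defaultdict(list)
    for subscriber, url in subscriptions:
        subscribers_by_url[url].append(subscriber)

    results: Dict[str, Dict[str, str]] = defaultdict(dict)
    for url, subscribers in subscribers_by_url.items():
        body = fetch(url)  # one request, however many subscribers want it
        for subscriber in subscribers:
            results[subscriber][url] = body
    return dict(results)
```

With this shape, the number of requests grows with the number of distinct pages rather than the number of subscribers, which is where the cost saving would come from.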

chptung|3 years ago

RE: pooling - yup, pooling for the win! As a self-taught programmer, this was one block of code I was particularly proud of :)

zasdffaa|3 years ago

Crawling too fast from the same IP address can get that IP blocked. Once per 10 minutes is probably OK; once per minute may be too much, depending on the volume being downloaded and how sensitive the content is (you mention saleable stuff in another comment, so this might well make a difference). FYI, I've done a reasonable amount of scraping myself.

(Err, crawling = scraping? Or maybe they're different things.)

Edit: tidy + clarify
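
One common way to avoid hammering a site is to enforce a minimum gap between requests to the same host. The sketch below is only an illustration under that assumption: `HostThrottle` is a hypothetical name, and the right interval (600 seconds would match the "per 10 mins" guess above) depends entirely on the target site.

```python
import time
from typing import Callable, Dict
from urllib.parse import urlsplit


class HostThrottle:
    """Enforce a minimum interval between requests to the same host."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_request: Dict[str, float] = {}

    def wait(self, url: str) -> float:
        """Block until `url`'s host may be hit again; return seconds waited."""
        host = urlsplit(url).hostname or ""
        now = self._clock()
        last = self._last_request.get(host)
        delay = 0.0
        if last is not None:
            # Time still remaining before the interval has elapsed.
            delay = max(0.0, last + self.min_interval - now)
        if delay > 0:
            self._sleep(delay)
        # Record when this request is actually allowed to go out.
        self._last_request[host] = now + delay
        return delay
```

Calling `throttle.wait(url)` before each fetch keeps the per-host rate bounded while leaving requests to other hosts unaffected.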