jancurn | 1 year ago

The main advantage (for now) is that the library has a single interface for both HTTP and headless browsers, plus bundled autoscaling. You can write your crawlers on the same base abstraction, and the framework takes care of the heavy lifting. Developers of scrapers shouldn't need to reinvent the wheel; they should be able to focus on the "business" logic of their scrapers. Having said that, if you've already written your own crawling library, the motivation to use Crawlee might be lower, and that's fair enough.

Please note that this is the first release, and we'll keep adding many more features as we go, including anti-blocking, adaptive crawling, etc. To see where this might go, check https://github.com/apify/crawlee
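To make the "same base abstraction" idea concrete, here is a rough sketch. It is not Crawlee's actual API: `BaseCrawler`, `FakeHttpCrawler`, `FakeBrowserCrawler`, and `collect` are hypothetical names, and the two backends return canned strings instead of doing real HTTP or browser work. The point is only that the handler (the "business" logic) stays identical while the fetching backend is swapped.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Type

# Hypothetical handler signature: receives the URL and the fetched HTML.
Handler = Callable[[str, str], None]


class BaseCrawler(ABC):
    """Hypothetical shared base: the run loop lives here, not in user code."""

    def __init__(self, handler: Handler) -> None:
        self.handler = handler

    @abstractmethod
    def fetch(self, url: str) -> str:
        """Return the page content for a URL; each backend implements this."""

    def run(self, urls: List[str]) -> None:
        for url in urls:
            self.handler(url, self.fetch(url))


class FakeHttpCrawler(BaseCrawler):
    # Stand-in for a plain HTTP client (assumption: no real request is made).
    def fetch(self, url: str) -> str:
        return f"<html>http:{url}</html>"


class FakeBrowserCrawler(BaseCrawler):
    # Stand-in for a headless browser (assumption: no real rendering).
    def fetch(self, url: str) -> str:
        return f"<html>rendered:{url}</html>"


def collect(crawler_cls: Type[BaseCrawler], urls: List[str]) -> Dict[str, str]:
    """Run the same handler against whichever backend is passed in."""
    results: Dict[str, str] = {}

    def handler(url: str, html: str) -> None:
        results[url] = html  # the "business" logic would go here

    crawler_cls(handler).run(urls)
    return results
```

Calling `collect(FakeHttpCrawler, urls)` and `collect(FakeBrowserCrawler, urls)` runs the same handler; only the class passed in changes.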

robertlagrant|1 year ago

Can I ask - what is anti-blocking?

fullspectrumdev|1 year ago

Usually refers to “evading bot detection”.

In practice, that means detecting when you've been blocked and switching to a different proxy or “browser fingerprint”.
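As a rough illustration of "detect the block, then switch proxy", here is a minimal sketch. It is not how Crawlee or any particular library does it: `is_blocked`, `fetch_with_rotation`, the `fetch` callable, and the choice of status codes 403/429 and the "captcha" keyword are all assumptions made for the example. Real bot detection is far more varied, and fingerprint switching is left out entirely.

```python
from itertools import cycle
from typing import Callable, Optional, Sequence, Tuple

# Assumption: these statuses are treated as "blocked"; real sites vary widely.
BLOCK_STATUSES = {403, 429}

# Hypothetical fetch signature: (url, proxy) -> (status_code, body).
Fetch = Callable[[str, str], Tuple[int, str]]


def is_blocked(status: int, body: str) -> bool:
    """Crude heuristic: a blocking status code or a CAPTCHA page."""
    return status in BLOCK_STATUSES or "captcha" in body.lower()


def fetch_with_rotation(
    url: str,
    fetch: Fetch,
    proxies: Sequence[str],
    max_attempts: int = 3,
) -> Optional[str]:
    """Try the URL through successive proxies until one is not blocked."""
    if not proxies:
        return None
    pool = cycle(proxies)  # round-robin over the proxy list
    for _ in range(max_attempts):
        proxy = next(pool)
        status, body = fetch(url, proxy)
        if not is_blocked(status, body):
            return body
    return None  # every attempt looked blocked
```

The `fetch` argument is injected so the rotation logic can be exercised with a fake function before wiring in a real HTTP client.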