(no title)
xarball|7 years ago
It looks like they're just trying to use selectors, but these directions seem to completely miss that functionality in Selenium's API. Just search the WebDriver documentation for 'find_element_by_':
https://selenium-python.readthedocs.io/api.html
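For anyone who wants to see what that looks like, here is a rough sketch of pulling links out of a page with a CSS selector through WebDriver. The `find_element_by_*` helpers were removed in Selenium 4, so this uses the `find_elements(By..., value)` form instead. The names `link_targets` and `demo`, the `a[href]` selector, and the choice of Firefox are my own assumptions for illustration, not anything from the linked directions.

```python
try:
    from selenium.webdriver.common.by import By
    CSS = By.CSS_SELECTOR
except ImportError:
    # Fallback so the helper can be exercised with a stub driver;
    # this is the string value that By.CSS_SELECTOR stands for.
    CSS = "css selector"


def link_targets(driver, selector="a[href]"):
    """Return the href of every element matching a CSS selector."""
    elements = driver.find_elements(CSS, selector)
    hrefs = [el.get_attribute("href") for el in elements]
    # Drop elements whose href is missing or empty.
    return [h for h in hrefs if h]


def demo(url):
    """Hypothetical driver setup: needs Selenium, Firefox and geckodriver."""
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        return link_targets(driver)
    finally:
        driver.quit()
```

Calling `demo("https://example.com/")` starts a real browser, so the selector runs against the page after its JavaScript has executed, which is the whole point of going through WebDriver rather than a bare HTTP client.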
I use Selenium for all my web crawling, precisely because I would rather have one crawler backed by a full modern web browser than discover, halfway through implementing a bot, that I'm missing something as crucial as a JavaScript parser. The bot is hooking into what's basically an end-user interface sitting on top of all that machinery.
The most obvious benefit of Selenium to me is that, with all of that available, my interactions with a web server look more like a real user's and fly under the radar a little more. Treating websites as a whole package tends to mean less work on my part (though more RAM, yes!).
chsasank|7 years ago
xarball|7 years ago
I've recently started using Selenium with the Privoxy proxy, precisely because browser headless modes are still fairly new tech. They don't all support the standard profile features (add-ons, settings, etc.), and they don't all behave the same way. It's really neat seeing where they're going, but they sometimes need a bit of help MITM-ing traffic, so that's where a good filter comes in handy.
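Roughly, the wiring can look like the sketch below. It assumes Privoxy is listening on its default address, 127.0.0.1:8118, and that the browser is Firefox; the function names `privoxy_prefs`, `apply_prefs` and `make_driver` are made up for this example, and my actual setup may differ in the details.

```python
def privoxy_prefs(host="127.0.0.1", port=8118):
    """Firefox preferences that route HTTP and HTTPS traffic through a proxy."""
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": port,
        "network.proxy.ssl": host,
        "network.proxy.ssl_port": port,
    }


def apply_prefs(options, prefs):
    """Copy each preference onto a Firefox Options-like object."""
    for name, value in prefs.items():
        options.set_preference(name, value)
    return options


def make_driver():
    """Hypothetical setup: needs Selenium, Firefox, geckodriver and a running Privoxy."""
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = apply_prefs(Options(), privoxy_prefs())
    return webdriver.Firefox(options=options)
```

Keeping the preferences in a plain dictionary means the filtering lives in the proxy rather than in browser add-ons, so the same settings apply whether or not the browser runs headless.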
In the user-facing web world, 'slow' is a relative term. Even with a barebones system, you're nearly always going faster than most servers will put out. I take my chances on bringing in bigger tools, because keeping an under-equipped tool up to date as your target site evolves usually wastes more of my time than waiting for variably optimized background work to finish.