I've been working with scrapers quite a lot. I started with Python requests, then moved to Scrapy, then Selenium, then Selenium via undetected_chromedriver, and once that started being detected after a Chrome update about a year ago, I switched over to SeleniumBase. It got by undetected, but to get it working with pre-downloaded drivers, I had to look into the code. I have never, and I mean never, in all my Python years, seen such a horrible mess of code. We are talking 1000-line methods with 20-30 different flags and branches. Just horrible. I have since switched to Playwright, which also seems to go undetected and offers a much saner interface.
seleniumbase|1 year ago
cyanmagenta|1 year ago
That said, I just spent a few minutes browsing the SeleniumBase repo, and honestly it didn’t seem that unusual to me. I'd be interested in seeing a specific example of what the commenter had in mind.
mdaniel|1 year ago
harrall|1 year ago
seleniumbase|1 year ago
bryanrasmussen|1 year ago
That's actually why I've been scrapping my Playwright automation (I expect I will encounter problems even if it hasn't happened yet; I'm cynical and paranoid) and moving towards writing a browser extension to automate Firefox.
Basically, my use case is automating tedious things for myself, not running bots at scale. That's why it is imperative not to get caught being "not human", because then I risk account problems.
robertlagrant|1 year ago
pryelluw|1 year ago
seleniumbase|1 year ago
edm0nd|1 year ago
I guess my point is, you don't always have to be undetected or write 1000 lines of code to scrape or do whatever you need to do. It saved me a ton of headaches and time when captchas are involved.
mintzworld|1 year ago