Would you run the LLM extractor on every page? For larger-scale projects, such as scraping entire product catalogues, that sounds very expensive. Maybe you could use the AI to generate selectors from a few example pages and then apply those selectors to all other pages with the same structure?
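To make the idea concrete, here is a rough sketch of what I have in mind. Everything in it is an assumption for illustration: `generate_selectors` is a hypothetical stand-in for a single LLM call (hard-coded here), the field names and paths are made up, and the pages are assumed to be well-formed enough for the standard-library XML parser. Real-world HTML would likely need a more forgiving parser.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[str]]


def generate_selectors(example_html: str) -> Dict[str, str]:
    """Hypothetical stand-in for ONE LLM call on an example page.

    A real version would send example_html to a model and parse the
    returned field -> path mapping. Hard-coded here for illustration.
    """
    return {
        "title": ".//h1[@class='title']",
        "price": ".//span[@class='price']",
    }


def extract(page_html: str, selectors: Dict[str, str]) -> Record:
    """Apply the cached selectors to one page without calling the LLM."""
    root = ET.fromstring(page_html)
    record: Record = {}
    for field, path in selectors.items():
        node = root.find(path)
        has_text = node is not None and node.text
        record[field] = node.text.strip() if has_text else None
    return record


def scrape(
    pages: List[str],
    llm_fallback: Optional[Callable[[str], Record]] = None,
) -> List[Record]:
    """Generate selectors once, then reuse them for every page.

    If a field comes back empty, the page layout may differ, so the
    (expensive) fallback is used for that page only.
    """
    if not pages:
        return []
    selectors = generate_selectors(pages[0])
    results = []
    for page in pages:
        record = extract(page, selectors)
        if llm_fallback is not None and None in record.values():
            record = llm_fallback(page)
        results.append(record)
    return results
```

That way the model would be called once per page template, plus once for each page where a selector fails to match, instead of once per page.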