
Show HN: Screen scraping for dummies app

4 points| combiclickwise | 14 years ago | reply

http://netreputation.co.uk/extractor/

I am embarrassed about this version but I thought I would develop it based on feedback.

The idea was to create something that anyone, not necessarily a programmer, could start using very quickly.

13 comments

[+] wrath|14 years ago|reply
It would be nice to be able to use XPath expressions to get to the data I want to extract. Having built quite a few extractors in the past, I find XPath to be the easiest way to find the correct data. Just a thought.
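For readers unfamiliar with the idea, here is a minimal sketch of XPath-style extraction using Python's standard library. The markup, the `result` class name, and the `extract_links` helper are all made up for illustration; they are not taken from the extractor app. Note that `xml.etree.ElementTree` supports only a limited XPath subset and needs well-formed markup, so real-world HTML would likely call for a more forgiving parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for a scraped page.
PAGE = """
<html>
  <body>
    <div class="result"><a href="http://example.com/a">First</a></div>
    <div class="result"><a href="http://example.com/b">Second</a></div>
    <div class="ad"><a href="http://example.com/ad">Sponsored</a></div>
  </body>
</html>
"""


def extract_links(markup):
    """Return (text, href) pairs for links inside div elements with class 'result'."""
    root = ET.fromstring(markup)
    # The path picks <a> children of any <div class="result">, skipping the ad block.
    anchors = root.findall(".//div[@class='result']/a")
    return [(a.text, a.get("href")) for a in anchors]


links = extract_links(PAGE)
for text, href in links:
    print(text, href)
```

One path expression replaces what would otherwise be a hand-written pattern per field, which is probably why it appeals to people who have written many extractors.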
[+] combiclickwise|14 years ago|reply
XPath might be especially useful for programmers, yes. I'm thinking I will add support for it. Thanks :-)
[+] davewasthere|14 years ago|reply
It's kind of like a basic Dapper? http://open.dapper.net/
[+] combiclickwise|14 years ago|reply
I haven't used it much. I wanted a pattern engine so that the intelligence ultimately rests with the user, without the user needing to know regex or even programming.
[+] justliving|14 years ago|reply
Nice! What are your future plans for it? E.g. which features do you plan to add for my grandmother :-) ?
[+] combiclickwise|14 years ago|reply
Thanks. A couple of things I have in mind:

1. more output formats like JSON, HTML, Doc, RSS, ATOM

2. allow programming in loops. For example, if you want to scrape many pages of data for a Google search result, you need a hack in the URL itself. This is difficult to accomplish, but I think I can add support for some of the pagination patterns to start with

3. give more control over the HTML, maybe a panel that helps extract the relevant part of the target page
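The pagination idea in item 2 can be sketched roughly as below. This assumes the simplest pattern, where the page offset is a query-string parameter; the `paginated_urls` helper, the `start` parameter name, the page size of 10, and the example.com URL are all assumptions for illustration, not details of the extractor app or of any particular search engine.

```python
from urllib.parse import urlencode


def paginated_urls(base_url, query, pages, page_size=10, offset_param="start"):
    """Build one URL per results page by stepping an offset parameter.

    offset_param and page_size are assumed values; a real site may use a
    page number, a cursor token, or a path segment instead.
    """
    urls = []
    for page in range(pages):
        params = {"q": query, offset_param: page * page_size}
        urls.append(base_url + "?" + urlencode(params))
    return urls


urls = paginated_urls("http://example.com/search", "screen scraping", pages=3)
for url in urls:
    print(url)
```

Each generated URL could then be fed to the same extraction pattern, so the user defines the pattern once and the loop covers the remaining pages.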