Ask HN: An ideea about a link checker

11 points | sirrocco | 16 years ago

I've been thinking lately about building an application that takes in a URL and basically crawls the entire site.

It would be an easy way of finding problems with a site (like 404s, 500s, etc.).

I tried Google's webmaster tools, but I know I have a 404 link on my site, yet it didn't find it. I also tried some other sites that crawl a website, but they had a limit of 2-300 links and then would stop. Past that you'd have to buy an app that was a bit expensive.

I'm thinking a web app where you can point it at your site and just receive a report when it's done.
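The crawl-and-report idea above fits in a short script. Here's a minimal stdlib-only Python sketch (the function names and the max_pages cap are illustrative assumptions, not an existing implementation):

```python
# Minimal same-site link checker sketch: crawl pages under a start URL
# breadth-first and record the HTTP status of every page visited.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

class LinkExtractor(HTMLParser):
    """Collects href targets from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, max_pages=200):
    """Return {url: status} for same-host pages reachable from start_url."""
    host = urlparse(start_url).netloc
    seen, queue, report = set(), [start_url], {}
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            with urlopen(url, timeout=10) as resp:
                report[url] = resp.status
                html = resp.read().decode("utf-8", errors="replace")
        except HTTPError as e:
            report[url] = e.code      # the 404s and 500s we're hunting for
            continue
        except URLError:
            report[url] = None        # DNS failure, timeout, ...
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link).split("#")[0]
            if urlparse(absolute).netloc == host:  # stay on this site
                queue.append(absolute)
    return report
```

A real service would also need politeness delays, robots.txt handling, and parallel fetching, but the core loop is roughly this.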

Would anyone need something like this?

12 comments

[+] Tichy|16 years ago|reply
Something with a spell checker would be nice.

Edit: actually, no pun intended, even though there is a spelling error in the title.

[+] sirrocco|16 years ago|reply
Yeah, my bad about the title. But I don't really see a spell checker as something you'd want. If you have, say, 1000 pages and I find an error in 10% of them, it would be a lot of work to correct them all. Not sure anyone would actually go and fix them.

But it could be a premium service, I guess.

[+] timanglade|16 years ago|reply
I do get the added value of having it as a webapp for some, but as a developer I'd rather use Tarantula in my test suite. It hunts down 404s and 500s, but can also do HTML validation and check against common attacks (CSRF, XSS, etc.).

http://github.com/relevance/tarantula

[+] simplegeek|16 years ago|reply
Nice, thanks for sharing. Do you know of any similar Python software?
[+] sirrocco|16 years ago|reply
Nice, didn't know about that one.
[+] edo|16 years ago|reply
Hi everybody. I'm from Linkvive and we've actually been building such a service for the last few months. We're very excited to see interest in it on HN and look forward to sharing our service with you guys soon. Signup at our form (http://linkvive.com) and we'll send you a single e-mail when it's live in the next few months. Cheers!
[+] daleharvey|16 years ago|reply
This is on the list of things I've wanted to do at some point.

I believe a wget --spider run should help you find any 404s, but I wanted to have each link validated as well, and it'd be nice to have as a simple web service.

[+] sirrocco|16 years ago|reply
Yeah, it could validate the HTML on the page and show some statistics, like link count and the time it took to download the HTML (it could then check the links of the img tags as well).

I was thinking there could be a plan where you get one scan a month, to see if any problems appeared in the meantime.
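A rough sketch of that per-page report (status code, download time, and the img links to check next), again Python stdlib only with illustrative names:

```python
# Sketch of a per-page report: HTTP status, fetch time in seconds, and the
# image URLs on the page that a crawler would also want to verify.
import time
from html.parser import HTMLParser
from urllib.request import urlopen

class ImgExtractor(HTMLParser):
    """Collects src targets from <img> tags."""
    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.images.append(value)

def page_stats(url):
    """Return (status, seconds to download, image URLs) for one page."""
    start = time.monotonic()
    with urlopen(url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
        status = resp.status
    elapsed = time.monotonic() - start
    parser = ImgExtractor()
    parser.feed(html)
    return status, elapsed, parser.images
```

Each image URL returned could then be fed back into the same checking loop as the regular links.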

[+] maxklein|16 years ago|reply
Do your research well. There are quite a number of such apps out there already.
[+] sirrocco|16 years ago|reply
It would be great if you could give a couple of links. I did search, but didn't find something that made me think: OK, they are doing this so well that it's next to pointless to even start.

The ones that I found, I didn't really like, which is why I'm even asking here. If I can't find others doing this, maybe nobody wants something like this.