pablohoffman | 10 years ago | on: Web scraping at scale – a conversation with Pablo Hoffman from ScrapingHub
pablohoffman's comments
pablohoffman | 10 years ago | on: Ask HN: Examples of companies with cofounders starting together at distance
pablohoffman | 10 years ago | on: Show HN: CloudScrape – Cloud-based web scraping platform
pablohoffman | 11 years ago | on: Show HN: Scrape.it – Change-Resilient Web Scraper
It's been getting quite a bit of traction and we're currently working on the integration with the Scrapinghub platform (disclaimer: I work there) for those who prefer a hosted version.
pablohoffman | 12 years ago | on: Portia, an open-source visual web scraper
pablohoffman | 12 years ago | on: Portia, an open-source visual web scraper
pablohoffman | 13 years ago | on: Why MongoDB is a bad choice for storing our scraped data
Cloudera has in fact been an inspiration for us; you guys have really struck the right balance between open source and commercial support. We follow the same philosophy with Scrapy (an open source web crawling framework) as you do with Hadoop and its ecosystem.
pablohoffman | 13 years ago | on: Why MongoDB is a bad choice for storing our scraped data
FWIW, we still use Mongo in other internal applications; it's just not the right choice for our crawl data storage backend.
pablohoffman | 13 years ago | on: Why MongoDB is a bad choice for storing our scraped data
pablohoffman | 14 years ago
pablohoffman | 14 years ago | on: Free 5 Billion Page Web Index Now Available from Common Crawl Foundation
I now regret it, since this one got much more attention. I was under the impression that linking to the original post was more welcome here on HN, but it seems this is not always the case.
pablohoffman | 14 years ago | on: One-time Secret: Share passwords etc with URIs that work only once
pablohoffman | 14 years ago | on: Google Please Hire Me
pablohoffman | 14 years ago | on: Linus Torvalds dumps Gnome3 for XFCE (G+ discussion)
pablohoffman | 15 years ago | on: Start registeri.ng, .NG domains go on sale
Example: http://www.webdomains.com.ng/myaccount/whois.php?step=1
It's a shame they're missing the opportunity to sell such a nice top-level domain - or perhaps it's done on purpose?
pablohoffman | 15 years ago | on: Ask HN: What would it take to scrape & index delicious?
pablohoffman | 15 years ago | on: Mark Zuckerberg Named TIME’s 2010 Person of the Year
pablohoffman | 15 years ago | on: Is OpenTable Worth it?
You've found market price when buyers complain but still pay.
http://twitter.com/#!/paulg/status/22576762202
pablohoffman | 15 years ago | on: Facebook acquires file-sharing service Drop.io
pablohoffman | 15 years ago | on: Why Mongrel2 Doesn't Use INI Files (Zed Shaw)