top | item 46185128

mirandrom | 2 months ago

I went down a rabbit hole and found most of the missing lists on Common Crawl: https://mirandrom.github.io/bourdain-lists/

Unfortunately, AFAICT, the embedded image data were not included in the Common Crawl scrapes, and the few image URLs I tried don't seem to be indexed by Common Crawl either. I only just started playing around with these tools, so I might've missed something.

ccgreg | 2 months ago

Common Crawl is a text-only crawl.