top | item 43690074

ilyabez | 10 months ago

Hi, I'm the author of the blog (though I didn't post it on HN).

I've addressed this topic in another comment above and will copy it here.

I'd encourage you to read up on how the podcast ecosystem works.

Podcasts are distributed via RSS feeds hosted all over the internet, but mostly on specialized hosting providers like Transistor, Megaphone, Omny Studio, etc. that are designed to handle huge amounts of traffic.

All podcast apps (literally, all of them) like Apple Podcasts, Spotify, YouTube Music, Overcast, Pocket Casts, etc. constantly crawl and download RSS feeds, artwork images and mp3s from podcast hosts.

This is how podcasts have been distributed since they were introduced by Apple in the early 2000s. This is why podcasting remains an open, decentralized ecosystem.

randunel|10 months ago

Replace "podcasts" with "search results" in your comment, and "RSS feed" with "LLM output" and you've got yourself the exact same argument for what's going on today. The company names are different, of course, but not by much because some of the players stayed the same.

Your lack of reply to "do you observe robots.txt when you download content such as images" is basically a "no".

cratermoon|10 months ago

If they are well-coded, they don't constantly crawl. They use and pay attention to headers like ETag, If-Modified-Since and/or If-None-Match and support conditional requests.

Badly behaving RSS readers on the other hand....

https://rachelbythebay.com/w/2024/05/27/feed/
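
For illustration, the conditional-request behavior described in the comment above could look roughly like the sketch below. It is a minimal example built only on Python's standard library, not the code of any particular podcast app; the function names `conditional_headers` and `fetch_feed`, the returned dictionary shape, and the `opener` parameter (there only so the network call can be swapped out) are all assumptions made for this example.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build request headers from validators saved on a previous fetch."""
    headers = {}
    if etag:
        # Ask the server to skip the body if the ETag still matches.
        headers["If-None-Match"] = etag
    if last_modified:
        # Ask the server to skip the body if nothing changed since then.
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_feed(url, etag=None, last_modified=None, opener=urllib.request.urlopen):
    """Fetch a feed, downloading the body only if it changed.

    `opener` is a hypothetical hook (defaulting to urlopen) so the
    function can be exercised without network access.
    """
    request = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with opener(request) as response:
            # 200 OK: new content, plus fresh validators to store.
            return {
                "changed": True,
                "body": response.read(),
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
            }
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # 304 Not Modified: keep the cached copy and old validators.
            return {
                "changed": False,
                "body": None,
                "etag": etag,
                "last_modified": last_modified,
            }
        raise
```

A client would store the returned `etag` and `last_modified` values next to its cached copy of the feed and pass them back in on the next poll, so an unchanged feed costs the host a short 304 response instead of a full download.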