top | item 41473135

wis | 1 year ago

Why do LLMs understand Markdown so well (besides its simple, terse, and readable syntax)?

People say "LLMs are trained on the web." Are the web pages converted from HTML into Markdown before being fed into training?
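
To make the question concrete, here is a rough sketch of what an HTML-to-Markdown conversion step could look like, using only Python's standard html.parser module. The class name SimpleMarkdownConverter and the function html_to_markdown are made up for illustration, the tag coverage is deliberately tiny, and nothing here is meant to describe how any actual training pipeline works.

```python
from html.parser import HTMLParser


class SimpleMarkdownConverter(HTMLParser):
    """Hypothetical toy converter: handles only h1-h3, p, a, strong/b, em/i, li."""

    def __init__(self):
        super().__init__()
        self.parts = []   # output fragments, joined at the end
        self.href = None  # target of the link currently being read

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            # <h2> becomes "## ", and so on
            self.parts.append("#" * int(tag[1]) + " ")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "li":
            self.parts.append("- ")
        elif tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3", "p"):
            self.parts.append("\n\n")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "li":
            self.parts.append("\n")
        elif tag == "a":
            self.parts.append(f"]({self.href})")
            self.href = None

    def handle_data(self, data):
        # Text content is passed through unchanged
        self.parts.append(data)


def html_to_markdown(html: str) -> str:
    """Convert a small HTML fragment to Markdown (assumed, simplified rules)."""
    converter = SimpleMarkdownConverter()
    converter.feed(html)
    converter.close()
    return "".join(converter.parts).strip()


print(html_to_markdown(
    '<h1>Title</h1><p>Some <b>bold</b> text and a '
    '<a href="https://example.com">link</a>.</p>'
))
```

A real converter would also need to handle nested lists, tables, code blocks, and escaping, none of which this sketch attempts.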

nprateem|1 year ago

I think the Anthropic docs say they use Markdown internally (I assume that means the models were trained on it to a significant extent).

cpursley|1 year ago

I think Anthropic actually uses XML and OpenAI uses Markdown.

ascorbic|1 year ago

They're trained on lots of code, and pretty much every public repo has markdown in it, even if it's just the README.