top | item 47151174

michaeld123 | 6 days ago

I'm the author. Updated the article based on this thread — thanks to everyone who pushed back. Changes: reframed "boiling water" as deliberately on-the-fence rather than asserting it's a word; added a note that the obscure end of the slider is noise; acknowledged collocations as the established term; added a German/Dutch/Norwegian section on how other languages handle the space problem; softened "wasn't possible before LLMs" to "wasn't practical"; and threaded the concept of "loaded" throughout as the key distinction. Many of the specific examples came directly from commenters here — credited below.


s1mon|6 days ago

I'd really love to see the prompt(s) you used with Claude. From the way the article was written, I mistakenly thought you would expand on that in a footnote or sidebar.

michaeld123|6 days ago

It's not so much the prompt as the volume. This overall project has involved >100M LLM inferences, spread across 1.9M headwords. The building block is "what words or short terms are related to X?", but scaled out, plus a lot of filtering. So it's mostly a reflection of English, and also a reflection of what ChatGPT and Claude report back as a significant collocation.
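For anyone who wants a feel for the shape of that loop, here is a rough sketch, not the actual pipeline. Only the question text comes from the comment above. The names `PROMPT`, `parse_terms`, `collect_related`, and `min_votes` are hypothetical, each model is stood in for by a plain callable that takes a prompt string and returns a reply string, and "keep a term only if enough models report it" is just one assumed example of what the filtering could look like.

```python
from collections import Counter
from typing import Callable, Iterable

# The building-block question quoted in the thread; {headword} is filled in per word.
PROMPT = "What words or short terms are related to {headword}?"


def parse_terms(reply: str) -> list[str]:
    """Split a comma- or newline-separated reply into lowercase, trimmed terms."""
    parts = reply.replace("\n", ",").split(",")
    return [p.strip().lower() for p in parts if p.strip()]


def collect_related(
    headwords: Iterable[str],
    models: Iterable[Callable[[str], str]],
    min_votes: int = 2,
) -> dict[str, list[str]]:
    """Ask every model about every headword and keep terms with enough agreement.

    `models` are hypothetical stand-ins for real LLM calls. The vote threshold
    is an assumed filter, not the one used in the project.
    """
    models = list(models)
    results: dict[str, list[str]] = {}
    for headword in headwords:
        votes: Counter[str] = Counter()
        for ask in models:
            reply = ask(PROMPT.format(headword=headword))
            # set() so a model that repeats a term still casts only one vote
            votes.update(set(parse_terms(reply)))
        results[headword] = sorted(
            term
            for term, count in votes.items()
            if count >= min_votes and term != headword.lower()
        )
    return results
```

In practice each callable would wrap a real API client, and at 1.9M headwords the loop would need batching, retries, and caching, none of which is shown here.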