Terr_|5 months ago
Compare: "Can you prove that alien explorers cannot make contact with us?"
Nobody has the tools to begin proving a negative [0] in either of those cases, and it's possible they'll eventually occur... But so what?
Just because it could happen someday does not mean it's happening now. Instead, we have decades of watching humans excite themselves into perceiving semantics that aren't present [1], and nobody has provided a compelling reason to believe that this time things are finally different.
[0] https://en.wikipedia.org/wiki/Burden_of_proof_(philosophy)
[1] https://en.wikipedia.org/wiki/ELIZA_effect

solardev|5 months ago
I don't think this is the unprovable you think it is.
If LLMs and statistics can't encode semantics, how do chatbots perform long-form translations with appropriate context? How do codebreakers use statistics to break an adversary's communications?
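The codebreaking point is concrete: classical frequency analysis recovers a Caesar shift from letter statistics alone, with no prior access to the message's meaning. A minimal sketch (the message and the simple shift cipher are invented for illustration):

```python
from collections import Counter
import string

def encrypt(plaintext: str, shift: int) -> str:
    """Shift each letter forward by `shift` positions (a Caesar cipher)."""
    out = []
    for ch in plaintext.lower():
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)

def crack(ciphertext: str) -> int:
    """Guess the shift from statistics alone: assume the most frequent
    ciphertext letter corresponds to plaintext 'e'."""
    counts = Counter(c for c in ciphertext if c in string.ascii_lowercase)
    most_common = counts.most_common(1)[0][0]
    return (ord(most_common) - ord("e")) % 26

message = "statistics alone can recover structure the sender never labeled"
ciphertext = encrypt(message, 7)
shift = crack(ciphertext)
print(shift, encrypt(ciphertext, -shift) == message)
```

The cracker never "understands" the text; the statistical regularities of the language carry enough structure to invert the cipher anyway.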
Sometimes the statistics are semantic, like when "orange," "arancia," and the picture of that fruit all mean the same thing, while Orange the wireless carrier and orange the color are different. Those are connections/probabilities humans also learn via repeated exposure in different contexts.
I'm not arguing that LLMs are synthesizing new ideas (or old ones), but that they ARE capable of deriving semantic meaning from statistics. Rather than:
> language, based solely on statistical data, shorn of semantics
Isn't it more like:
> language, based solely on statistical data, with meanings emerging from clusters in the data
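The "meanings emerging from clusters" claim can be sketched with bag-of-words co-occurrence vectors (a stand-in for real embeddings; the tiny mixed-language corpus below is invented for illustration). Words used in similar contexts end up with similar vectors:

```python
from collections import Counter
from math import sqrt

# Invented toy corpus: "orange" and "arancia" occur in the same kinds of
# sentences, while "magenta" occurs in color contexts.
corpus = [
    "peel the orange and eat the sweet citrus fruit",
    "peel the arancia and eat the sweet citrus fruit",
    "drink the fresh orange juice",
    "drink the fresh arancia juice",
    "paint the wall a bright magenta color",
    "the sunset glowed a bright magenta color",
]

def context_vector(word: str) -> Counter:
    """Count every word that co-occurs with `word` in a sentence."""
    vec = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        if word in tokens:
            vec.update(t for t in tokens if t != word)
    return vec

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm

print(cosine(context_vector("orange"), context_vector("arancia")))   # high
print(cosine(context_vector("arancia"), context_vector("magenta")))  # low
```

Nothing here was told that "arancia" is Italian for "orange"; the similarity falls out of the distributional statistics, which is the sense in which the clusters carry meaning.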