
timjver | 1 year ago

> It would be similar to if I claimed that an LLM is an expert doctor, but in my data I've filtered out all of the times it gave incorrect medical advice.

Computationally it's trivial to detect illegal moves, so it's nothing like filtering out incorrect medical advice.
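For concreteness, here is a minimal sketch of what "detect illegal moves" looks like in practice, assuming the python-chess library is available; the move list is made up for illustration:

    import chess

    board = chess.Board()
    for san in ["e4", "e5", "Ke3"]:    # "Ke3" is illegal here and gets caught
        try:
            board.push_san(san)        # raises ValueError if the move is not legal
        except ValueError:
            print(f"illegal move: {san}")
            break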

KK7NIL|1 year ago

> Computationally it's trivial to detect illegal moves

You're strictly correct, but the rules for chess are infamously hard to implement (as anyone who's tried to write a chess program will know), leading to minor bugs in a lot of chess programs.

For example, there's an old myth that vertical castling was once allowed due to an ambiguity in the ruleset: https://www.futilitycloset.com/2009/12/11/outside-the-box/ (probably not historically accurate).

If you move beyond legal moves into who wins when one side flags (runs out of time), the rules award the other side the victory only if checkmate was still possible by some legal sequence of moves. This is so hard to check that no chess program tries to implement it; instead they use simpler rules that give a very similar but slightly more conservative result.
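As a hedged sketch of that simpler, conservative substitute (python-chess assumed available, the function is illustrative, not from any real program): instead of searching for a "helpmate" sequence, only award the win if the side still on the clock has any mating material left.

    import chess

    def result_on_flag(board: chess.Board, flagged_side: chess.Color) -> str:
        winner = not flagged_side  # the side that did not run out of time
        if board.has_insufficient_material(winner):
            return "1/2-1/2"       # bare king etc.: conservative draw
        return "1-0" if winner == chess.WHITE else "0-1"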

adelineJoOs|1 year ago

That link was new to me, thanks! However: I wrote a chess program myself (nothing big, hobby level) and I would not call the rules hard to implement. Just harder than someone might assume at first. In the end it was one of the simpler simulations/algorithms I have done. It is just the state of the board, the state of the game (move count, castling rights, past positions for the repetition rule, ...), and picking one rule set if one really wants to be exact (roughly the shape sketched below).

(thinking about which rule set is correct would not be meaningful in my opinion - chess is a social construct, with only parts of it being well defined. I would not bother about the rest, at least not when implementing it)
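For what it's worth, a rough sketch of that game state in Python; field names are illustrative, not taken from any particular engine:

    from dataclasses import dataclass, field
    from typing import Dict, List, Optional

    @dataclass
    class GameState:
        board: List[List[Optional[str]]]   # 8x8 piece placement, None for empty squares
        side_to_move: str                  # "white" or "black"
        castling_rights: str               # e.g. "KQkq"
        en_passant_square: Optional[str]   # e.g. "e3", or None
        halfmove_clock: int                # for the fifty-move rule
        fullmove_number: int               # how many turns have been played
        past_positions: Dict[str, int] = field(default_factory=dict)  # for threefold repetition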

By the way: I read "Computationally it's trivial" as more along the lines of "it has been done before, it is efficient to compute, one just has to do it" versus "this is new territory, one needs to come up with how to wire up the LLM output with an SMT solver, and we do not even know if/how it will work."

admax88qqq|1 year ago

> You're strictly correct, but the rules for chess are infamously hard to implement

Come on. Yeah, they're not trivial, but they've been implemented numerous times. There have been chess programs for almost as long as there have been computers. Checking legal moves is a _solved problem_.

Detecting valid medical advice is not. The two are not even remotely comparable.

elif|1 year ago

We implemented a whole chess engine in Lisp during third year; implementing the legal move/state checking was actually really trivial.

rco8786|1 year ago

I got a kick out of that link. Had certainly never heard of "vertical castling" previously.

wavemode|1 year ago

As I wrote in another comment - you can write scripts that correct bad math, too. But we don't use that to claim that LLMs have a good understanding of math.
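(For the record, "a script that corrects bad math" can be as small as the hypothetical sketch below; the regex and example sentence are made up, not from any real system.)

    import re

    def correct_arithmetic(text: str) -> str:
        # Re-evaluate any "a op b = c" claims found in model output.
        def fix(match: re.Match) -> str:
            a, op, b = int(match.group(1)), match.group(2), int(match.group(3))
            true_value = {"+": a + b, "-": a - b, "*": a * b}[op]
            return f"{a} {op} {b} = {true_value}"
        return re.sub(r"(\d+)\s*([+\-*])\s*(\d+)\s*=\s*(\d+)", fix, text)

    print(correct_arithmetic("So 17 * 24 = 428."))  # -> "So 17 * 24 = 408."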

ben_w|1 year ago

I'd say that's because we don't understand what we mean by "understand".

Hardware that accurately performs maths faster than all of humanity combined is so cheap as to be disposable, but I've yet to see anyone claim that a Pi Zero has "understanding" of anything.

An LLM can display the viva voce approach that Turing suggested[0], and do it well. Ironically for all those now talking about "stochastic parrots", the passage reads:

"""… The game (with the player B omitted) is frequently used in practice under the name of viva voce to discover whether some one really understands something or has ‘learnt it parrot fashion’. …"

Showing that not much has changed on the philosophy of this topic since it was invented.

[0] https://academic.oup.com/mind/article/LIX/236/433/986238

SpaceManNabs|1 year ago

I don't know. I have talked to a few math professors, and they think LLMs are as good as a lot of their peers when it comes to hallucinations and being able to discuss ideas on very niche topics, as long as the context is fed in. If Tao is calling some models "a mediocre, but not completely incompetent [...] graduate student", then they seem to me to understand math to some degree.

fijiaarone|1 year ago

Actually, LLMs do call scripts that correct bad math, and have gotten progressively better because of it. It's another special case example.