(no title)
devit | 11 months ago
Theorems of the form "model X always does this desirable thing" are almost always false (because the model is imprecise), and theorems of the form "model X does this desirable thing Y% of the time" seem incredibly hard to prove, probably impossible unless it's feasible to run the model on all possible inputs.
Even formulating the theorem is often extremely hard or impossible; consider something like "this LLM does not output false statements".
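To make the "all possible inputs" point concrete, here is a minimal sketch. Everything in it is hypothetical: `toy_model` stands in for "model X", `desired` stands in for the desirable behaviour, and the input space is deliberately tiny. The exact success rate can only be stated because every input can be enumerated; with 2**n inputs this stops being feasible very quickly.

```python
from itertools import product


def toy_model(bits):
    # Hypothetical stand-in for "model X": guesses parity from the first 3 bits only.
    return sum(bits[:3]) % 2


def desired(bits):
    # Hypothetical "desirable thing": the true parity of all the bits.
    return sum(bits) % 2


def exact_success_rate(n_bits):
    # Exhaustive enumeration over all 2**n_bits inputs.
    # This is what turns "Y% of the time" into an exact, checkable claim.
    inputs = list(product((0, 1), repeat=n_bits))
    hits = sum(toy_model(x) == desired(x) for x in inputs)
    return hits / len(inputs)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, exact_success_rate(n))
```

For a real model the input space cannot be enumerated this way, so sampling only gives a statistical estimate rather than a proof, and a property like "does not output false statements" has no `desired` function to write down in the first place.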
mentalgear | 11 months ago
[0] https://mateopetel.xyz/
Onavo | 11 months ago
Tainnor | 11 months ago