top | item 44263553

spion | 8 months ago

What you think is an absurd question may not be as absurd as it seems, given the trillions of tokens of data on the internet, including its darkest corners.

In my experience, it's better to simply try using LLMs in areas where they don't have a lot of training data (e.g. reasoning about the behaviour of Terraform plans). It's not a hard cutoff where they're _only_ able to reason about already-solved things, but as a first approximation it's not too far off.

The researchers took existing known problems and parameterised their difficulty [1]. While most of these are by no means easy for humans, the interesting observation to me was that the step count N at which failure occurred was not proportional to the complexity of the problem, but correlated more with how commonly solution "printouts" for that size of problem appear in the training data. For example, Towers of Hanoi, which has published solution listings for a variety of sizes, succeeded up to a very large number of steps N, while the river crossing, which is almost entirely absent from the training data for N larger than 3, failed above pretty much that exact number.

[1]: https://machinelearning.apple.com/research/illusion-of-think...
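To make the "solution printouts" point concrete, here's a minimal sketch of why Hanoi move listings are so abundant online: the full solution is a three-line recursion, so listings for many disk counts are trivial to generate and post. (This is an illustration of the observation above, not code from the paper.)

```python
def hanoi(n, src="A", aux="B", dst="C", moves=None):
    # Classic recursive Towers of Hanoi: solving n disks takes 2**n - 1 moves.
    if moves is None:
        moves = []
    if n == 0:
        return moves
    hanoi(n - 1, src, dst, aux, moves)   # move n-1 disks out of the way
    moves.append((src, dst))             # move the largest disk
    hanoi(n - 1, aux, src, dst, moves)   # move n-1 disks back on top
    return moves

# Full move listings like these are easy to generate for any n, which is
# one reason solutions for many sizes plausibly appear in training data.
for n in (3, 5, 10):
    print(n, len(hanoi(n)))
```

No comparably mechanical, widely posted listing exists for river-crossing variants at larger sizes, which fits the failure pattern described above.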


CSSer | 8 months ago

It doesn't help that, thanks to RLHF, every time a good example of this gains popularity, e.g. "How many Rs are in 'strawberry'?", it's often snuffed out quickly. If I worked at a company with an LLM product, I'd build tooling to look for these kinds of examples in social media, or directly in usage data, so they could be prioritized for fixes. I don't know how to feel about this.

On the one hand, it's sort of like red teaming. On the other hand, it clearly gives consumers a false sense of ability.

spion | 8 months ago

Indeed. Which is why I think the only way to really evaluate the progress of LLMs is to curate your own personal set of example failures that you don't share with anyone else, and to use it only via APIs that provide some sort of no-data-retention and no-training guarantees.
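A private eval set like this can be as simple as a list of prompts with checkers, run through whatever API client you trust. A minimal sketch, where `call_llm` is a hypothetical placeholder for a client configured with no-retention guarantees, and the eval cases are examples (the "strawberry" one is from the thread above):

```python
# Hypothetical private eval harness: prompts plus expected substrings,
# kept out of any shared corpus so they can't be RLHF'd away.
PRIVATE_EVALS = [
    {"prompt": "How many Rs are in 'strawberry'?", "expect": "3"},
    # ... your own unpublished failure cases ...
]

def run_evals(call_llm):
    # call_llm is any callable: prompt string -> answer string.
    results = []
    for case in PRIVATE_EVALS:
        answer = call_llm(case["prompt"])
        results.append({
            "prompt": case["prompt"],
            "pass": case["expect"] in answer,
        })
    return results

# Demo with a stub "model" that always answers "2":
print(run_evals(lambda prompt: "2"))
```

Substring matching is crude; in practice you'd want per-case checker functions, but the point is the set itself stays private.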