merrywhether | 1 year ago

Users can be adversarial to the “truth” (to the extent it exists) without being adversarial in intent.

Dinosaur bones are either 65 million year old remnants of ancient creatures or decoys planted by a God during a 7 day creation, and a large proportion of humans earnestly believe either take. Choosing which of these to believe involves a higher level decision about fundamental worldviews. This is an extreme example, but incorporating “honest” human feedback on vaccines, dark matter, and countless other topics won’t lead to de facto improvements.

I guess to put it another way: experts don’t learn from the masses. The average human isn’t an expert in anything, so incorporating the average feedback will pull a model away from expertise (imagine asking 100 people to give you grammar advice). You’d instead want to identify expert advice, but that’s impossible to do from looking at the advice itself without giving in to a confirmation-bias spiral. Humans use meta-signals like credentialing to augment their perception of received information, yet I doubt we’ll be having people upload their CV during signup to a chat service.
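
To put rough, made-up numbers on the averaging problem: if each non-expert answers a specialized question correctly with probability 0.4, taking a majority vote over a bigger crowd makes the answer less reliable, not more.

    # Hypothetical illustration with made-up numbers: a majority vote among
    # below-chance respondents (p = 0.4) gets worse as the crowd grows.
    from math import comb

    def majority_correct(n, p):
        # probability that more than half of n respondents are right
        return sum(comb(n, k) * p**k * (1 - p)**(n - k)
                   for k in range(n // 2 + 1, n + 1))

    for n in (1, 11, 101):
        print(n, round(majority_correct(n, 0.4), 3))
    # 1 -> 0.4, 11 -> ~0.25, 101 -> ~0.02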

And at the cutting edge level of expertise, the only real “knowledgeable” counterparties are the physical systems of reality themselves. I’m curious how takeoff is possible for a brain in a bottle that can’t test and verify any of its own conjectures. It can continually extrapolate down chains of thought, but that’s most likely to just carry and amplify errors.
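
A rough, made-up number on that error-amplification point: even a high per-step reliability collapses over a long chain of unverified reasoning.

    # Hypothetical arithmetic: a chain of unverified steps, each right
    # with probability 0.95, is mostly wrong by step 20.
    per_step = 0.95
    for steps in (5, 20, 50):
        print(steps, round(per_step ** steps, 3))
    # 5 -> 0.774, 20 -> 0.358, 50 -> 0.077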

benbosco | 1 year ago

Dirac’s prediction of antimatter came from purely mathematical reasoning, before any experimental evidence existed. Testing and verifying conjectures requires the ability to extrapolate beyond known data rather than from it, and the ability to discard false leads based on theoretical reasoning rather than statistical confidence.

All of this is possible in a bottle, but laughably far beyond our current capabilities.

sirsinsalot | 1 year ago

This is a good take. What models seem to be poor at is undoing their own thinking once they’ve gone down a path, even when they can test.

If you let a model write code, test it, identify bugs, and fix them, you get an increasingly obtuse and complex code base where errors happen more often. The more it iterates, the worse it gets.
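
Roughly the loop I mean, as a sketch (llm() here is a hypothetical stand-in for whatever model call you use, not a real API): nothing in it ever rewards simplification, only getting the next test run to pass.

    # Hypothetical sketch of a naive write-test-fix loop; llm() is a
    # placeholder for an arbitrary model call, not a real library API.
    import pathlib
    import subprocess

    def llm(prompt: str) -> str:
        raise NotImplementedError  # stand-in for a model call

    def iterate(spec: str, rounds: int = 5) -> str:
        code = llm(f"Write a module that does: {spec}")
        for _ in range(rounds):
            pathlib.Path("solution.py").write_text(code)
            result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
            if result.returncode == 0:
                break
            # each round patches on top of the last; nothing prunes or simplifies
            code = llm(f"Tests failed:\n{result.stdout}\n\nFix this code:\n{code}")
        return code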

At the end of the day, written human language is a poor way of describing software. Even to a model. The code is the description.

At the moment we describe the solutions we want to the models, and they aren't that smart about translating that into an unambiguous form.

We are a long way off describing the problems and asking for a solution. Even when the model can test and iterate.

econ | 1 year ago

Same way corporations do it: they hire humans and other companies to do things. Organisations already have a mind of their own, with more drive to survive than an LLM.