wavemode|4 days ago
Similarly, if there are millions of academic papers and thousands of peer reviews in the training data, a review of this exact paper doesn't need to be in there for the LLM to write something convincing. (I say "convincing" rather than "correct", since the author himself admits that he doesn't agree with all of the LLM's comments.)
I tend to recommend people learn these things from first principles (e.g. build a small neural network, explore deep learning, build a language model) to gain a better intuition. There's really no "magic" at work here.
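A minimal sketch of the kind of first-principles exercise meant here: a two-layer network with hand-written backpropagation learning XOR, the classic function a single linear layer cannot represent. This assumes NumPy; the architecture and hyperparameters are illustrative, not from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR truth table: not linearly separable, so a hidden layer is required
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# one hidden layer of 8 units, weights drawn at random
W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

lr = 0.5
for _ in range(20000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # backward pass: (p - y) is the gradient of binary cross-entropy
    # at the output logit; chain rule carries it into each layer
    dp = p - y
    dh = (dp @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ dp; b2 -= lr * dp.sum(axis=0)
    W1 -= lr * X.T @ dh; b1 -= lr * dh.sum(axis=0)

print((p > 0.5).astype(int).ravel())
```

Thirty-odd lines, no framework, and the "magic" reduces to matrix multiplies and the chain rule.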
kristiandupont|4 days ago
Claude figured out how the language worked and debugged segfaults until the compiler compiled, and then until the program did. That might not be magic, but it shows a level of sophistication where referring to “statistics” is about as meaningful as describing a person as the statistics of electrical impulses between neurons.
c22|4 days ago
This is an interesting claim to me. Are there any existing models that were trained with a (single-digit) number omitted from the training data?
If such a model does exist, how does it represent the answer? (What symbol does it use for the '7'?)
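No such ablation study is cited in the thread, but purely as an illustration of the setup being asked about, the preprocessing step is simple: drop every training document containing the chosen digit so the model never sees that symbol. The toy corpus below is invented for the example.

```python
# Hypothetical ablation: filter the digit '7' out of a training corpus.
corpus = [
    "2 + 2 = 4",
    "3 + 4 = 7",   # contains '7' -- filtered out
    "the year 1984",
    "chapter 17",  # contains '7' -- filtered out
]

filtered = [doc for doc in corpus if "7" not in doc]
print(filtered)  # ['2 + 2 = 4', 'the year 1984']
```

The interesting question is the one above: what the trained model would emit in place of the symbol it has never seen.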
Kim_Bruning|4 days ago
Took me a bit of messing around, but try to write out each state sequentially, with a check step between each.
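The write-then-check loop described above could be sketched as follows. `llm` is a placeholder for whatever completion API you use (not a real library call), and the prompt wording is invented:

```python
def solve_stepwise(llm, problem, max_steps=10):
    """Ask the model for one state at a time, with a separate
    check prompt between each step, as described above."""
    transcript = f"Problem: {problem}\n"
    for step in range(1, max_steps + 1):
        state = llm(transcript + f"Write state {step}:")
        check = llm(
            transcript
            + f"Proposed state {step}: {state}\n"
            + "Is this consistent with the rules so far? Answer yes/no:"
        )
        if check.strip().lower().startswith("yes"):
            transcript += f"State {step}: {state}\n"
        else:
            transcript += f"State {step} rejected; retrying.\n"
    return transcript
```

Separating generation from verification this way tends to catch the errors that a single long completion silently compounds.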
ainch|4 days ago
One of the surprises of deep learning is that it can, sometimes, defy prior statistical learning theory to generalise, but this is still poorly understood. Concepts like grokking, double descent, and the implicit bias of gradient descent are driving a lot of new research into the underlying dynamics of deep learning. But I'd say it is pretty ahistorical to claim that this is obvious or trivial - decades of work studied "overfitting" and related problems where statistical models fail to generalise or even interpolate within the support of their training data.
arkh|4 days ago
I think they should be the perfect tool for finding methods or results in one field that look like they could be used in another.
selridge|4 days ago
"I don't know how you get here from "predict the next word"" is not really so much a statement of ignorance where someone needs you to step in but a reflection that perhaps the tech is not so easily explained as that. No magic needs to be present for that to be the case.
wavemode|4 days ago
I engaged. You just don't like what I wrote. That's okay.