It's still strange to me to work in a field of computer science where we say things like "we're not exactly sure how these numbers (hyperparameters) affect the result, so just try a bunch of different values and see which one works best."
> "we're not exactly sure how these numbers (hyper parameters) affect the result, so just try a bunch of different values and see which one works best."
Isn't it the same for anything that uses a Monte Carlo simulation to find a value? At times you'll end up at a local maximum (instead of the best/correct answer), but it works.
We can't solve something with a closed-form formula, so we just do a billion (or whatever) random samplings and find what we're after.
I'm not saying it's the same for LLMs, but "trying a bunch of different values and seeing which one works best" is something we do a lot.
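The "billion (or whatever) random samplings" idea can be sketched in a few lines. A minimal random-search example — the objective function, range, and sample count here are made up purely for illustration:

```python
import math
import random

# Hypothetical bumpy objective standing in for "model quality as a
# function of one hyperparameter"; it has several local maxima on [0, 2].
def score(x):
    return math.sin(5 * x) + 0.5 * x

random.seed(0)  # reproducible run
best_x, best_score = None, float("-inf")
for _ in range(1000):                # "try a bunch of different values"
    x = random.uniform(0.0, 2.0)     # random sample of the search space
    s = score(x)
    if s > best_score:
        best_x, best_score = x, s

print(best_x, best_score)
```

With enough samples this lands near the global maximum; with too few, you can easily walk away with a local maximum instead — exactly the caveat above.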
LLMs were very much engineered... the exact results they yield are hard to determine since they're large statistical models, but I don't think that makes the LLMs themselves a 'discovery' (like, say, penicillin).
I understand your distinction, I think, but I would say it is more engineering than ever. It's like the early days of the steam engine or firearms development. It's not a hard science, not formal analysis, it's engineering: tinkering, testing, experimenting, iterating.
I believe, from what I saw in mathematics, this is a matter of taste. "Discovered" and "invented" are two perspectives. Some people prefer to think of light reaching into previously dark corners of knowledge, waiting to be discovered. Others prefer to think that by force of genius they brought the thing into the world.
To me, personally, these are two sides of the same coin, with neither having more proof than the other.
This can be laid at the feet of Minsky and others who dismissed perceptrons because they couldn't model nonlinear functions. LLMs were never going to happen until modern CPUs and GPUs came along, but that doesn't mean we couldn't have a better theoretical foundation in place. We are years behind where we should be.
When I worked in the games industry in the 1990s, it was "common knowledge" that neural nets were a dead end at best and a con job at worst. Really a shame to lose so much time because a few senior authority figures warned everyone off. We need to make sure that doesn't happen this time.
We have no theories of intelligence. We're like people in the 1500s trying to figure out why and how people get sick, with no concept of bacteria, germs, transmission, etc.
I haven't seen this key/buzzword mentioned yet, so I think part of it is the fact that we're now working on complex systems. This was already true (a social network is a complex system), but now we have the impenetrability of a complex system within the scope of a single process. It's hard to figure out generalizable principles about this kind of thing!
I mean, it’s kind of in the name isn’t it? Computer science. Science is empirical, often poorly understood and even the best theories don’t fully explain all observations, especially when a field gets new tools to observe phenomena. It takes a while for a good theory to come along and make sense of everything in science and that seems like more or less exactly where we are today.
Welcome to engineering. We don't sketch our controlled systems, and we forget all about systems theory. Instead we just fiddle with our controllers until the result is acceptable.
r3trohack3r|2 years ago
I feel like most of our industry up until now has been engineered.
LLMs were discovered.
SkyMarshal|2 years ago
Ideally we want theoretical foundations, but sometimes random explorations are necessary to tease out enough data to construct or validate theory.