I do understand it, but I also think that current LLMs are the first step toward it.

GPT-3 kicked off serious investment in this topic. There was not enough research being done in this direction before, and now there is. People like Yann LeCun are already analysing different approaches and architectures, but they still use the same infrastructure as LLMs (ML/GPUs) and potentially the same data.

I never said that LLMs are the breakthrough in consciousness.

But you can also ask an LLM for thinking strategies. It can tell you a lot of things. We will see whether LLMs become a fundamental part of AGI or not, but GPUs/ML probably will.

I also think that the compression mechanism in LLMs leads to concepts through optimization. You can see from the Anthropic paper that an LLM doesn't work in normal language space but in a high-dimensional one, and then 'expresses' the output in whatever language you like.

We also see that truly multimodal models are better at a lot of tasks because they have much more context available, for example when estimating what someone said from context.

The necessary infrastructure and power requirements are something I accept too. We can assume, and I do, that further progress on a lot of topics will require this type of compute, and it also addresses our data bottleneck: normal CPU architecture is limited by the memory bus.

Also, compared with a lot of other companies, if the richest companies in the world invest in nuclear, I think that is a lot better than anyone else doing it. They have much higher margins and more knowledge. CO2 is a market differentiator for them too.

I also expect this amount of compute to be the base for fixing real issues we all face, like curing cancer or improving the detection of cancer and other illnesses. We need to make medicine a lot cheaper, and if someone in Africa can do a cheap X-ray and send it to the cloud to get feedback, that could help a lot of people.

Doing complex and massive protein analysis or mRNA research in virtual space also requires GPUs.

All of this happened in a timespan of only a few years. I have not seen anything progress as fast as AI/ML currently does and, unfortunate as it is, this needs compute.

Even my small in-house image recognition fine-tuning explodes in compute cost when you do a handful of parameter optimizations, but the quality is a lot better than what we had before.

And enabling people to have a real natural-language UI is HUGE. It makes so much more accessible, and not just for people with a disability.

Things like 'do an ELI5 on topic x' or 'explain this concept to me'. I would have loved that when I was trying to get through the university math curriculum.

All of that was already crazy and still is. But in parallel, what Nvidia and others are currently doing with ML and robotics also requires all of that compute. And the progress is again breathtaking. The current flood of basic robots standing and walking around is due to ML.

th0ma5|9 months ago

I mean, you're not even wrong! Almost all of these large models are based on the idea that if you put every representation of the world that we can into a big pile, you can tease out some kind of meaning. There's not really even a cohesive theory behind that, and surely no testable way to prove that it's true. It certainly seems like you can make a system that behaves as if it could be like that, and I think that's what you're picking up on. But it's actually probably something else, and something far short of that.

hnaccount_rng|9 months ago

There is an interesting analogy that my Analysis I professor once gave: the intersection of all valid examples is also a definition of an object. In many ways this is, at least in my current understanding, how ML systems "think". So yeah, it will take some superposition of examples and kind of try to interpolate between those. But fundamentally it is, at least so far, always an interpolation, not an extrapolation.

Whether we consider that "just regurgitating Stackoverflow" or "it thought up the solution to my problem" mostly comes down to semantics.