In some domains (math and code), progress is still very fast. In others it has slowed or arguably stopped.

We see little progress in "soft" skills like creative writing. EQBench is a benchmark that tests LLM ability to write stories, narratives, and poems. The winning models are mostly tiny Gemma finetunes with parameter counts in the single-digit billions. Huge foundation models with hundreds of billions of parameters (Claude 3 Opus, Llama 3.1 405B, GPT-4) are nowhere near the top. (Yes, I know Gemma is a pruned Gemini.) Fine-tuning > model size, which implies we don't have a path to "superhuman" creative writing (if that even exists). Unlike model size, fine-tuning can't be scaled indefinitely: once you've squeezed all the juice out of a model, what then?

OpenAI's new o1 model exhibits amazing progress in reasoning, math, and coding. Yet its writing is worse than GPT-4o's (as backed by EQBench and OpenAI's own research).

I'd also mention political persuasion (since people seem concerned about LLM-generated propaganda). In June, some researchers tested LLM ability to change the minds of human subjects on issues like privatization and assisted suicide. Tiny models are unpersuasive, as expected. But once a model is large enough to generate coherent sentences, persuasiveness kinda...stops. All large models are about equally persuasive. No runaway scaling laws are evident here.

This picture is uncertain due to instruction tuning. We don't really know what abilities LLMs "truly" possess, because they've been crippled to act as harmless, helpful chatbots. But we now have an open-source GPT-4-sized pretrained model to play with (Llama-3.1 405B base). People are doing interesting things with it, but it's not setting the world on fire.

throw987987123|1 year ago

It feels ironic if the only thing that the current wave of AI enables (other than novelty cases) is a cutdown of software/coding jobs. I don't see it replacing math professionals too soon, for a variety of reasons. From an outsider's perspective on the software industry, it is like its practitioners voted to make themselves redundant. That seems to be the main takeaway about AI among the normal non-tech people I've chatted with.

Anecdotally, many people, when I tell them what I do for a living, have told me that any other profession would have the common sense/street smarts not to make its scarce skill redundant. It goes further than that; many professions have license requirements, unions, professional bodies, etc. to enforce this scarcity on behalf of their members. After all, a scarce career in most economies is one not just of wealth but of higher social standing.

If all it does is allow us to churn out more high-level software, which let's be honest is demand-inelastic due to the mostly large margins on software products (i.e. they would have paid a person anyway due to ROI), it doesn't seem it will add much to society other than shifting profit in tech from labor to capital/owners. It may replace call centre jobs too, I guess, and some low-level writing/marketing jobs. I haven't seen any real new use cases that change my life positively yet, other than the odd picture/AI app, fake social posts, annoying AI assistants in apps, and maybe some teaching resources that would have been made or been easy to acquire by other means anyway. I could easily live without these things.

If this is all, or most, of what AI will do, it seems like a bit of a disappointment, especially given the massive amount of money going into it.

Viliam1234|1 year ago

> many professions have license requirements, unions, professional bodies, etc. to enforce this scarcity on behalf of their members. After all, a scarce career in most economies is one not just of wealth but of higher social standing.

Well, that's good for them, but bad for humanity in general.

If we had a choice between a system where doctors get a high salary and a lot of social status, or a system where everyone can get perfect health by using a cheap device, and someone chose the former, it would make perfect sense to me to call such a person evil. The financial needs of doctors should not outweigh the health needs of humanity.

On a smarter planet we would have a nice system to compensate people for losing their privilege, so that they wouldn't oppose progress. For example, every doctor would get a generous unconditional basic income for the rest of their life, and then they would all be replaced by cheap devices that would give us perfect health. Everyone would benefit, no reason to complain.

_w1tm|1 year ago

> If all it does is allow us to churn out more high-level software, which let's be honest is demand-inelastic due to the mostly large margins on software products (i.e. they would have paid a person anyway due to ROI), it doesn't seem it will add much to society other than shifting profit in tech from labor to capital/owners.

If creating software becomes cheaper, then I can turn all the ideas I’ve had into software cheaply. Currently I simply don’t have enough hours in the day; a couple of hours per weekend is not enough to roll out a tech startup.

Imagine all the open source projects that don’t have enough people to work on them. With LLM code generation we could have a huge jump in the quality of our software.

kobenni|1 year ago

It may seem this way from an outsider's perspective, but I think the intersection between people who work on the development of state-of-the-art LLMs and people who get replaced is practically zero. Nobody is making themselves redundant; some people are just making others redundant for their own gain (assuming LLMs are even good enough for that, not that I know if they are).

fluoridation|1 year ago

> But once a model is large enough to generate coherent sentences, persuasiveness kinda...stops. All large models are about equally persuasive. No runaway scaling laws are evident here.

Isn't that kind of obvious? Even human speakers and writers have problems changing people's minds, let alone reliably.

lmm|1 year ago

The ceiling may be low, but there are definitely human writers that are an order of magnitude more effective than the average can-write-coherent-sentences human.

klipt|1 year ago

The only people who changed minds reliably were Age of Empires priests. Wololo, wololo!

anon7725|1 year ago

> Tiny models are unpersuasive, as expected. But once a model is large enough to generate coherent sentences, persuasiveness kinda...stops.

People are persuaded to change their opinions based on social proof, so this isn’t surprising.