top | item 46975514

(no title)

NiloCK | 18 days ago

Grey market fast-follow via distillation seems like an inevitable feature of the near to medium future.

I've previously doubted that the N-1 or N-2 open weight models will ever be attractive to end users, especially power users. But it now seems that user preferences will be yet another saturated benchmark, that even the N-2 models will fully satisfy.

Heck, even my own preferences may be getting saturated already. Opus 4.5 was a very legible jump from 4.1. But 4.6? Apparently better, but it hasn't changed my workflows or the types of problems / questions I put to it.

It's poetic - the greatest theft in human history followed by the greatest comeuppance.

No end-user on planet earth will suffer a single qualm at the notion that their bargain-basement Chinese AI provider 'stole' from American big tech.

discuss

jaccola|18 days ago

I have no idea how an LLM company can make any argument that their use of content to train the models is allowed that doesn't equally apply to the distillers using an LLM output.

"The distilled LLM isn't stealing the content from the 'parent' LLM, it is learning from the content just as a human would, surely that can't be illegal!"...

mikehearn|18 days ago

The argument is that converting static text into an LLM is sufficiently transformative to qualify for fair use, while distilling one LLM's output to create another LLM is not. Whether you buy that or not is up to you, but I think that's the fundamental difference.

amenhotep|18 days ago

When you buy, or pirate, a book, you didn't enter into a business relationship with the author specifically forbidding you from using the text to train models. When you get tokens from one of these providers, you sort of did.

I think it's a pretty weak distinction and by separating the concerns, having a company that collects a corpus and then "illegally" sells it for training, you can pretty much exactly reproduce the acquire-books-and-train-on-them scenario, but in the simplest case, the EULA does actually make it slightly different.

Like, if a publisher pays an author to write a book, with the contract specifically saying they're not allowed to train on that text, and then they train on it anyway, that's clearly worse than someone just buying a book and training on it, right?

TZubiri|18 days ago

Because the terms by each provider are different

American Model trains on public data without a "do not use this without permission" clause.

Chinese models train on models that have a "you will not reverse engineer" clause.

miohtama|18 days ago

In some ways, Opus 4.6 is a step backwards due to massively higher token consumption.

theshrike79|17 days ago

You need to adjust the effort from the default (High) to Medium to match the token usage of 4.5

High is for people with infinite budgets and Anthropic employees. =)

nwienert|18 days ago

For me, it's just plain worse.

chillfox|18 days ago

yeah, I am still using 4.5 for coding.

I have started using Gemini Flash on high for general cli questions as I can't tell the difference for those "what's the command again" type questions and it's cheap/fast/accurate.

deaux|17 days ago

> But 4.6? Apparently better, but it hasn't changed my workflows or the types of problems / questions I put to it.

The incremental steps are now more domain-specific. For example, Codex 5.3 is supposedly improved at agentic use (tools, skills). Opus 4.6 is markedly better at frontend UI design than 4.5. I'm sure at some point we'll see across-the-board noticeable improvement again, but that would probably be a major version rather than minor.

vessenes|18 days ago

Just to say - 4.6 really shines on working longer without input. It feels to me like it gets twice as far. I would not want to go back.

cmrdporcupine|18 days ago

If that's what they're tuning for, that's just not what I want. So I'm glad I switched off of Anthropic.

What teams of programmers need, when AI tooling is thrown into the mix, is more interaction with the codebase, not less. To build reliable systems the humans involved need to know what was built and how.

I'm not looking for full automation, I'm looking for intelligence and augmentation, and I'll give my money and my recommendation as team lead / eng manager to whatever product offers that best.

throwaw12|18 days ago

not allowing distillation should be illegal :)

One can create 1000s of topic specific AI generated content websites, as a disclaimer each post should include prompt and used model.

Others can "accidentally" crawl those websites and include in their training/fine-tuning.

hasperdi|17 days ago

Why distill, if you can run the full model yourself... or at other inference providers.

Quantization the better approach in most cases, unless you want to for instance create hybrid models ie. distilling from here and there.

jona-f|17 days ago

"the greatest theft in human history" what a nonsense. I was curious, how the AI haters will cope, now that the tides here have changed. We have built systems that can look at any output and replicate it. That is progress. If you think some particular sequence of numbers belongs to you, you are wrong. Current intellectual property laws are crooked. You are stuck in a crooked system.

benterix|17 days ago

> No end-user on planet earth will suffer a single qualm at the notion that their bargain-basement Chinese AI provider 'stole' from American big tech.

Just like nobody cares[0] that American big tech stole from authors of millions of books.

[0] Interestingly, the only ones that cared were the FB employees told to pirate the Library Genesis and reporting back that "it didn't feel right".

DannyBee|17 days ago

As one of those authors (3 books in this case) I'll just point out:

Most authors don't own any interesting rights to their books because they are works for hire.

Maybe I would have gotten something, maybe not. Depends on the contract. One of my books that was used is from 1996. That contract did not say a lot about the internet, and I was also 16 at the time ;)

In practice they stole from a relatively small number of publishers. The rest is PR.

The settlement goes to authors in part because anything else would generate immensely bad PR.

As usual, nothing is really black and white