top | item 46181759

okdood64 | 2 months ago

From the blog:

https://arxiv.org/abs/2501.00663

https://arxiv.org/pdf/2504.13173

Is there any other company that's openly publishing their research on AI at this level? Google should get a lot of credit for this.

Palmik|2 months ago

DeepSeek and other Chinese companies. Not only do they publish research, they also put their money where their mouth is: they actually use the research and prove it through their open models.

Most research coming out of the big US labs is counter-indicative of practical performance: if it worked (too) well in practice, it wouldn't have been published.

Some examples from DeepSeek:

https://arxiv.org/abs/2405.04434

https://arxiv.org/abs/2502.11089

mapmeld|2 months ago

Well, it's cool that they released a paper, but at this point it's been 11 months and you can't download Titans-architecture model code or weights anywhere. That puts a lot of companies ahead of them (Meta's Llama, Qwen, DeepSeek). The closest you can get is an unofficial implementation of the paper: https://github.com/lucidrains/titans-pytorch

alyxya|2 months ago

The hardest part about making a new architecture is that even if it is better than transformers in every way, it's very difficult both to prove a significant improvement at scale and to gain traction. Until Google puts a lot of resources into training a scaled-up version of this architecture, I believe there's enough low-hanging fruit in improving existing architectures that it'll always take a back seat.

root_axis|2 months ago

I don't think the comparison is valid. Releasing code and weights for an architecture that is widely known is a lot different than releasing research about an architecture that could mitigate fundamental problems that are common to all LLM products.

innagadadavida|2 months ago

Just keep in mind that it's performance-review time at all the tech companies. The promotion of these papers seems directly correlated with that event.

mupuff1234|2 months ago

> it's been 11 months

Is that supposed to be a long time? Seems fair that companies don't rush to open up their models.

informal007|2 months ago

I don't think the model code is a big deal compared to the idea. If the public had recognized the value of the idea 11 months ago, they could have implemented the code quickly, because there are so many smart engineers in the AI field.

AugSun|2 months ago

Gemini 3 _is_ that architecture.

hiddencost|2 months ago

Every Google publication goes through multiple rounds of review. If anyone thinks a publication is a competitive risk, it gets squashed.

It's very likely no one is using this architecture at Google for any production workloads. There are a lot of student researchers doing fun proof-of-concept papers; they're allowed to publish because it's good PR and it's good for their careers.

jeffbee|2 months ago

Underrated comment, IMHO. There is such a gulf between what Google runs internally and the papers and source code it publishes that I always think about their motivations before I read or adopt anything. Think Borg vs. Kubernetes, Stubby vs. gRPC.

hustwindmaple|2 months ago

The amazing thing about this is that the first author has published multiple high-impact papers with Google Research VPs, and he is just a second-year PhD student. Very few L7/L8 RS/SWEs can even do this.

Balinares|2 months ago

I mean, they did publish the word2vec and transformers papers, which are both of major significance to the development of LLMs.

Hendrikto|2 months ago

Meta is also being pretty open with their stuff. And recently, so has most of the Chinese competition.

okdood64|2 months ago

Oh yes, I believe that's right. What's some frontier research Meta has shared in the last couple of years?

asim|2 months ago

It was not always like this. Google was very secretive in the early days. We didn't start to see things until the GFS, BigTable, and Borg (or Chubby) papers, in the 2006 timeframe.

okdood64|2 months ago

By 2006, Google was 8 years old. OpenAI is now 10.

vlovich123|2 months ago

Google publishes detailed papers of its architecture once it’s built the next version.

AI is a bit different.

rcpt|2 months ago

Page Rank

embedding-shape|2 months ago

> Is there any other company that's openly publishing their research on AI at this level? Google should get a lot of credit for this.

80% of the ecosystem is built on top of companies, groups and individuals publishing their research openly, not sure why Google would get more credit for this than others...

govping|2 months ago

Working with 1M context windows daily - the real limitation isn't storage but retrieval. You can feed massive context but knowing WHICH part to reference at the right moment is hard. Effective long-term memory needs both capacity and intelligent indexing.
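The "capacity vs. indexing" point can be made concrete with a toy sketch (my own illustration, not from the thread): split a long context into chunks, then retrieve only the chunks relevant to the current query. Real systems use learned embeddings and vector indexes; here a crude bag-of-words overlap score stands in for similarity, and all names (`chunk`, `score`, `retrieve`) are hypothetical.

```python
from collections import Counter

def chunk(text, size=20):
    """Split text into fixed-size word chunks: the 'capacity' side."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(query, passage):
    """Count query-term overlaps; a crude stand-in for embedding similarity."""
    q = Counter(query.lower().split())
    p = Counter(passage.lower().split())
    return sum(min(q[w], p[w]) for w in q)

def retrieve(query, chunks, k=1):
    """Return the k chunks most relevant to the query: the 'indexing' side."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

# One relevant sentence buried in filler, as in a large context window.
doc = ("the contract renewal date is march fifth "
       + "filler text about unrelated topics " * 10)
top = retrieve("when is the contract renewal", chunk(doc, size=8))
```

The point of the sketch is that feeding the whole `doc` to a model is the easy part; knowing that only the first chunk matters for this query is the hard part, and that is what the indexing layer has to solve.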

nickpsecurity|2 months ago

Arxiv is flooded with ML papers, and GitHub has a lot of prototypes for them. I'd say it's pretty normal, with some companies not sharing for perceived competitive advantage. Perceived, because it may or may not be real compared to the published prototypes.

We post a lot of research on mlscaling sub if you want to look back through them.

https://www.reddit.com/r/t5_3bzqh1/s/yml1o2ER33

timzaman|2 months ago

lol you don't get it. If it's published, it means it's not very useful.

okdood64|2 months ago

What about the Attention paper?

HarHarVeryFunny|2 months ago

Maybe it's just misdirection, a failed approach?

Given the competitive nature of the AI race, it's hard to believe any of these companies are really trying to help the competition.