item 45274610

mrktf | 5 months ago

As long as TSMC is the only top-performance chip producer and it is possible for one or two clients to reserve all of its manufacturing capacity, NVIDIA will hold without problem...

In my opinion, the problems for NVIDIA will start when China ramps up internal chip manufacturing performance enough to be in the same order of magnitude as TSMC.


impossiblefork|5 months ago

But all sorts of people get their things fabbed by TSMC.

Cerebras get their chips fabbed by them. I assume Eucyld will have their chips fabbed by them.

If there are orders, why would they prefer NVIDIA? Customer diversity is good, is it not?

nebula8804|5 months ago

TSMC and NVIDIA's relationship goes back more than 20 years. In the NVIDIA biography they talk about how TSMC really helped NVIDIA out early on, when other suppliers just couldn't meet the quality and volume demands that NVIDIA aspired to. That has led to a strong relationship where both sides have really helped each other out.

re-thc|5 months ago

> If there's orders, why would they prefer NVIDIA? Customer diversity is good, is it not?

Money talks. Apple asked for first dibs a while back (exclusively).

user34283|5 months ago

I'm not knowledgeable about this, but I wonder how important performance really is here.

Won't it be enough to just solder on a large amount of high-bandwidth memory and produce these cards relatively cheaply?

alephnerd|5 months ago

> but I wonder how important performance really is here.

Perf is important, but ime American MLEs are less likely to investigate GPU and OS internals to get maximum perf, and just throw money at the problem.

> solder on a large amount of high bandwidth memory and produce these cards relatively cheaply

HBM is somewhat limited in China as well. CXMT is around 3-4 years behind other HBM vendors.

That said, you don't need the latest and most performant GPUs if you can tune older GPUs and parallelize training at a large scale.

-----------

IMO, model training is an embarrassingly parallel problem, and a large enough, heavily tuned cluster built on architectures 1-2 generations old should be able to provide similar performance for training models.
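The scaling-out argument can be sketched in miniature with data-parallel gradient averaging: each worker computes a gradient on its own data shard, the gradients are averaged (an all-reduce in a real cluster), so adding workers cuts the per-step data each device touches roughly linearly. A toy sketch in plain Python (hypothetical numbers, standing in for a real framework like PyTorch DDP):

```python
# Toy data-parallel training: fit y = w*x by averaging per-shard gradients.

def local_gradient(shard, w):
    # Least-squares gradient d/dw of (w*x - y)^2, averaged over one shard.
    return sum(2 * x * (w * x - y) for x, y in shard) / len(shard)

def training_step(shards, w, lr=0.01):
    # In a real cluster these run concurrently, one per GPU/worker.
    grads = [local_gradient(s, w) for s in shards]
    avg = sum(grads) / len(grads)   # the all-reduce / average step
    return w - lr * avg

# Four "workers", each holding a shard of (x, y) pairs where y = 3x.
data = [(x, 3 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]

w = 0.0
for _ in range(200):
    w = training_step(shards, w)
print(round(w, 2))  # → 3.0
```

Real training is not perfectly parallel (the all-reduce is a synchronization point, and interconnect bandwidth matters, hence the Infiniband tuning mentioned above), but the per-shard compute dominates, which is the sense in which a bigger cluster of older chips can substitute for fewer newer ones.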

This is why I bemoan America's failures in OS internals and systems education. You have entire generations of "ML Engineers" and researchers in the US who don't know their way around CUDA, InfiniBand optimization, or the ins and outs of the Linux kernel.

They're just boffins who like math and using wrappers.

That said, I'd be cautious to trust a press release or secondhand report from CCTV, especially after the Kirin 9000 saga and SMIC.

But arguably, it doesn't matter - even if Alibaba's system isn't comparably performant to an H20, if it can be manufactured at scale without eating Nvidia's margins, it's good enough.

TylerE|5 months ago

Isn’t memory production relatively limited also?

TSiege|5 months ago

They are currently doing this. It's part of their Made in China 2025 plan.