top | item 42849098

amgreg | 1 year ago

> NVIDIA is the only viable seller of shovels for this gold rush for everyone but Google and Anthropic.

Why do you exclude Google and Anthropic?

ashoeafoot | 1 year ago

Google makes its own hardware; they are vertically integrated. Don't know about Anthropic.

ein0p | 1 year ago

Anthropic uses a ton of TPUs in addition to GPUs, so presumably they have the expertise to use both and shift workloads as needed. Note that large-scale TPU use pretty much means Jax, and not just the "platform independent" flavor of Jax but Jax with TPU-specific optimizations.
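
To make the "platform independent" part of this concrete, a minimal sketch (my own illustration, not from the thread): the same jitted Jax function is compiled through XLA for whichever backend is available, CPU, GPU, or TPU. The TPU-specific work the parent alludes to sits on top of this, in how arrays are sharded and laid out across chips, not in the function itself.

```python
import jax
import jax.numpy as jnp

@jax.jit  # traced once, then XLA-compiled for the available backend (CPU/GPU/TPU)
def matmul_relu(a, b):
    return jnp.maximum(a @ b, 0.0)

a = jnp.ones((4, 8))
b = jnp.ones((8, 2))
out = matmul_relu(a, b)
print(out.shape)  # (4, 2)
```

Nothing in the code names a device; Jax picks the backend at runtime, which is why the portable flavor is the easy part and the TPU-tuned flavor is where the scarce expertise lives.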

mike_hearn | 1 year ago

Anthropic are the only (?) heavy users of Amazon's chips. Or maybe they aren't heavy users; it's hard to say, since they use NVIDIA too. Amazon is a big investor.

ein0p | 1 year ago

Amazon's chips at this point are marketing for Amazon. I've seen the benchmarks; they're not quite ready for serious use yet. I suspect Anthropic got a good discount on GPUs in return for using Amazon's own chips in any possible capacity (or maybe just for the press release claiming such use).

The only real alternative to NVIDIA on the inference side that you can actually buy hardware for is Intel Gaudi, which costs less and performs rather well, but everyone seems to have written it off, along with Intel itself, and it's not available in any cloud last I checked.

On the training side there's really no alternative at all: PyTorch is the de facto standard, and while there is PyTorch XLA, it's even less popular than Jax, which is already something like 20x less popular than PyTorch. Bottom line: capable Jax engineers who can optimize distributed Jax programs on TPUs are unobtainable unicorns for anyone but the top labs and Google itself.

Note that the training side has significantly different requirements than the inference side; inference is much simpler to optimize and wring performance out of.
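
For a sense of what "distributed Jax programs on TPUs" means in code, a hedged sketch (my own example, using Jax's public sharding API): it builds a device mesh and shards an array's batch dimension across it. On a TPU pod slice the mesh spans all chips; on a plain CPU host it degenerates to a single device, so the snippet still runs anywhere.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec

# Build a 1-D device mesh. On a TPU slice this covers every chip;
# on a single-device host it is a mesh of one.
devices = np.array(jax.devices())
mesh = Mesh(devices, axis_names=("data",))

# Shard the leading (batch) dimension across the "data" mesh axis.
sharding = NamedSharding(mesh, PartitionSpec("data"))
x = jax.device_put(jnp.arange(8.0).reshape(8, 1), sharding)

@jax.jit  # XLA compiles a program that respects the input sharding
def total(v):
    return jnp.sum(v)

print(total(x))  # 28.0
```

Writing this toy version is easy; the scarce skill the comment describes is choosing mesh shapes and partition specs so that a real training step keeps thousands of TPU chips busy.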