
thunderbird120|10 months ago

This article doesn't mention TPUs anywhere. I don't think it's obvious to people outside of Google's ecosystem just how extraordinarily good the JAX + TPU ecosystem is. Google has several structural advantages over other major players, but the largest one is that they roll their own compute solution, which is actually very mature and competitive. TPUs are extremely good at both training and inference[1], especially at scale. Google's ability to tailor their mature hardware to exactly what they need gives them a massive leg up on the competition. AI companies fundamentally have to answer the question "what can you do that no one else can?". Google's hardware advantage provides an actual answer to that question, one which can't be erased the next time someone drops a new model onto Hugging Face.

[1]https://blog.google/products/google-cloud/ironwood-tpu-age-o...


marcusb|10 months ago

From the article:

> I’m forgetting something. Oh, of course, Google is also a hardware company. With its left arm, Google is fighting Nvidia in the AI chip market (both to eliminate its former GPU dependence and to eventually sell its chips to other companies). How well are they doing? They just announced the 7th version of their TPU, Ironwood. The specifications are impressive. It’s a chip made for the AI era of inference, just like Nvidia Blackwell

thunderbird120|10 months ago

Nice to see that they added that, but that section wasn't in the article when I wrote that comment.

krackers|10 months ago

Assuming DeepSeek continues to open-source, we can expect that in the future there won't be any "secret sauce" in model architecture, only in data and training/serving infrastructure, and Google is in a good position with regard to both.

jononor|10 months ago

Google is also in a great position wrt distribution - getting users at scale and attaching to pre-existing revenue streams. Via Android, Gmail, Docs, and Search they have a lot of reach. YouTube as well, though the fit there is maybe less obvious. Combined with the two factors you mention, and the size of their war chest, they are really excellently positioned.

fulafel|10 months ago

Making your own hardware would also seem to yield freedom in model architecture, since performance is closely tied to how well the model architecture fits the hardware.

chermi|10 months ago

Huh? I don't think it's that simple. As far as we know, everyone has some secret sauce. You're assuming DeepSeek will find all of it.

spwa4|10 months ago

... except that it still pretty much requires Nvidia hardware. Maybe not for edge inference, but inference at scale (i.e., at companies or governments) will still require it.

mike_hearn|10 months ago

TPUs aren't necessarily a pro. They go back 15 years and don't seem to have yielded any kind of durable advantage. Developing them is expensive, and their architecture was often over-fit to yesterday's algorithms, which is why they've been through so many redesigns. Their competitors have routinely moved much faster using CUDA.

Once the space settles down, the balance might tip towards specialized accelerators, but NVIDIA has plenty of room to make specialized silicon and cut prices too. Google has yet to prove that the TPU investment is worth it.

summerlight|10 months ago

Not sure how familiar you are with the internal situation... but from my experience, I think it's safe to say that TPUs multiply Google's computation capability by 10x, if not 20x. They also don't need to compete with others to secure expensive Nvidia chips. If this is not an advantage, I don't see what would count as one. The entire point of vertical integration is to secure full control of your stack so your capability won't be limited by potential competitors, and the TPU is one of the key components of that strategy.

Also worth noting that the Ads division is the largest, heaviest user of TPUs. Thanks to them, Ads can afford to run a bunch of different expensive models that you could not realistically run on GPUs. The revenue delta from this alone is more than enough to pay off the entire investment history of the TPU.

alienthrowaway|10 months ago

> Developing them is expensive

So are the electricity and cooling costs at Google's scale. Improving perf-per-watt efficiency can pay for itself. The fact that they keep iterating on it suggests it's not a negative-return exercise.

foota|10 months ago

Haven't Nvidia published roughly as many chip designs in the same period?

dgacmu|10 months ago

They go back about 11 years.

imtringued|10 months ago

Google is what everyone thinks OpenAI is.

Google has their own cloud with their data centers with their own custom designed hardware using their own machine learning software stack running their in-house designed neural networks.

The only thing Google is missing is a computer memory designed specifically for machine learning, something like processing-in-memory.

ENGNR|10 months ago

The one thing they lack that OpenAI has is… product focus. There's some kind of management issue that leaves Google all over the shop, cancelling products for no reason. Whereas Sam Altman's team is right on the money.

Google is catching up fast on product though.

jxjnskkzxxhx|10 months ago

I've used JAX quite a bit and it's so much better than TF/PyTorch.

Now, for the life of me, I still haven't been able to understand what a TPU is. Is it Google's marketing term for a GPU? Or is it something different entirely?

mota7|10 months ago

There's basically a difference in philosophy. GPU chips have a bunch of cores, each of which is semi-capable, whereas TPU chips have (effectively) one enormous core.

So GPUs have ~120 small systolic arrays, one per SM (aka a tensor core), plus passable off-chip bandwidth (aka 16 lanes of PCIe).

Whereas TPUs have one honking big systolic array, plus large amounts of off-chip bandwidth.

This roughly translates to GPUs being better if you're doing a bunch of different small-ish things in parallel, but TPUs are better if you're doing lots of large matrix multiplies.
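
To make that concrete, here's a rough JAX sketch of the "one large matmul" case (the function name and shapes are made up for illustration; the same jitted code runs on a GPU or a TPU, where the compiler maps the matmul onto the systolic array):

    import jax
    import jax.numpy as jnp

    @jax.jit
    def big_matmul(x, w):
        # One big matrix multiply: the workload a TPU's single large
        # systolic array is built for, vs. many small independent kernels.
        return jnp.dot(x, w)

    k1, k2 = jax.random.split(jax.random.PRNGKey(0))
    x = jax.random.normal(k1, (8192, 8192))
    w = jax.random.normal(k2, (8192, 8192))
    y = big_matmul(x, w)  # dispatched to whichever backend JAX finds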

317070|10 months ago

Way back when, most of a GPU was for graphics. Google decided to design a completely new chip, which focused on the operations for neural networks (mainly vectorized matmul). This is the TPU.

It's not a GPU, as there is no graphics hardware there anymore. Just memory and very efficient cores, capable of doing massively parallel matmuls on the memory. The instruction set is tiny, basically only capable of doing transformer operations fast.

Today, I'm not sure how much graphics an A100 GPU can still do. But I guess the answer is "too much"?
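
For what it's worth, you can see the split directly in JAX (a quick sketch; the output depends on the machine you run it on):

    import jax

    # "tpu" on a Cloud TPU VM, "gpu" on an Nvidia box, otherwise "cpu".
    print(jax.default_backend())
    # Lists TpuDevice / CudaDevice / CpuDevice entries accordingly; the
    # same JAX program runs unmodified on any of them.
    print(jax.devices())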

JLO64|10 months ago

TPUs (short for Tensor Processing Units) are Google's custom AI accelerator hardware, completely separate from GPUs. I remember they introduced them around 2015, but I imagine they're really starting to pay off with Gemini.

https://en.wikipedia.org/wiki/Tensor_Processing_Unit

albert_e|10 months ago

Amazon also invests in its own hardware and silicon -- the Inferentia and Trainium chips, for example.

But I am not sure how AWS and Google Cloud match up in terms of making this vertical integration work for their competitive advantage.

Any insight there? I would be curious to read up on it.

I guess Microsoft, for that matter, has also been investing; we heard about the latest quantum breakthrough that was reported as creating a fundamentally new physical state of matter. Not sure if they also have some traction with GPUs or other chips with more immediate applications.

chazeon|10 months ago

I think Amazon and Meta have been trying their hands at inference hardware but have thrown their hands up on training; TPUs, on the other hand, can actually be used for training, based on what I saw in Google's Colab.

6510|10 months ago

The problem is always the company, never the product. They have had countless great products, but you can't depend on a product if the company is reliably unreliable enough. Even if they don't simply delete it for being expensive and "unprofitable", then eventually, like Search and YouTube, it will be so watered down you can't taste the wine.

AlbertoRomGar|10 months ago

I am the author of the article. That section was there from the beginning, just behind the paywall, which I removed due to the amount of interest the topic was receiving.

noosphr|10 months ago

And yet Google's main structural disadvantage is being Google.

ModernBERT with the extended context has solved natural language web search. I mean it as no exaggeration that _everything_ Google does for search is now obsolete. The only reason Google search isn't dead yet is that it takes a while to index all web pages into a vector database.

And yet it wasn't Google that released the architecture update; it was Hugging Face, as a summer collaboration between a dozen people. Google's version came out in 2018 and languished for the better part of a decade because it would destroy their business model.

Google is too risk-averse to do anything, but completely doomed if they don't cannibalize their cash-cow product. Web search is no longer a crown jewel, but plumbing that answering services, like Perplexity, need. I don't see Google being able to pull off an iPhone moment, the way Apple killed the iPod to win the next 20 years.
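
The retrieval step itself is almost trivially simple once you have the embeddings. A rough sketch (function name, array names, and shapes are made up; assume the document and query vectors come out of an encoder like ModernBERT):

    import jax
    import jax.numpy as jnp

    def top_k_docs(query_emb, doc_embs, k=5):
        # Cosine similarity between one query vector (d,) and a matrix
        # of document vectors (n, d), then return the k best indices.
        q = query_emb / jnp.linalg.norm(query_emb)
        d = doc_embs / jnp.linalg.norm(doc_embs, axis=1, keepdims=True)
        scores = d @ q
        return jnp.argsort(-scores)[:k]

    # Toy usage with random vectors standing in for real embeddings.
    docs = jax.random.normal(jax.random.PRNGKey(0), (1000, 768))
    query = jax.random.normal(jax.random.PRNGKey(1), (768,))
    print(top_k_docs(query, docs))

The hard part is doing that over billions of pages, which is exactly the indexing lag I mean.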

visarga|10 months ago

> ModernBERT with the extended context has solved natural language web search. I mean it as no exaggeration that _everything_ Google does for search is now obsolete.

The web UI for people using search may be obsolete, but search itself is hot: all AIs need it, both web and local, because models don't have recent information in them and are unable to reliably quote from memory.

petesergeant|10 months ago

> Google is too risk-averse to do anything, but completely doomed if they don't cannibalize their cash-cow product.

Google's cash-cow product is relevant ads. You can display relevant ads in LLM output or natural language web search. As long as people are interacting with a Google property, I really don't think it matters what that product is, as long as there are ad views. Also:

> Web search is no longer a crown jewel, but plumbing that answering services, like Perplexity, need

This sounds like a gigantic competitive advantage if you're selling AI-based products. You don't have to give everyone access to the good search via API, just your in-house AI generator.

danpalmer|10 months ago

This would be like claiming in 2010 that because PageRank is out there, search is a solved problem and there’s no secret sauce, and the following decade proved that false.

jampekka|10 months ago

> ModernBERT with the extended context has solved natural language web search.

I doubt this. Embedding models are no panacea even for much simpler retrieval tasks like RAG.

podnami|10 months ago

Do we have insights on whether they knew their business model was at risk? My understanding is that OpenAI’s claim to fame lies in seeing the potential of scaling up a transformer-based model, and that Google was caught off guard.

marsten|10 months ago

I think what may save Google from an Innovator's Dilemma extinction is that none of the AI would-be Google killers (OpenAI etc.) have figured out how to achieve any degree of lock-in. We're in a phase right now where everybody gets excited by the latest model and the switching cost is next to zero. This is very different from the dynamics of, say, Intel missing the boat on mobile CPUs.

I've been wondering for some time what sustainable advantage will end up looking like in AI. The only obvious thing is that whoever invents an AI that can remember who you are and every conversation it's had with you will have a sticky product.

dash2|10 months ago

They can just plug the google.com web page into their AI. They already do that.

acstorage|10 months ago

Unclear if they can actually beat GPUs in training throughput with 4D parallelism.
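
For reference, here's a rough JAX sketch of two of those four axes (data + tensor parallelism). Everything here is illustrative: it assumes an 8-device slice, and the mesh shape, axis names, and array sizes are made up.

    import numpy as np
    import jax
    import jax.numpy as jnp
    from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

    # Assumes 8 devices; arrange them as a 4 (data) x 2 (model) mesh.
    mesh = Mesh(np.array(jax.devices()).reshape(4, 2), ("data", "model"))

    # Shard the batch along "data" and the weight matrix along "model".
    x = jax.device_put(jnp.ones((128, 1024)),
                       NamedSharding(mesh, P("data", None)))
    w = jax.device_put(jnp.ones((1024, 4096)),
                       NamedSharding(mesh, P(None, "model")))

    @jax.jit
    def layer(x, w):
        # XLA inserts the cross-device collectives automatically.
        return x @ w

    y = layer(x, w)  # result ends up sharded across both mesh axes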

retinaros|10 months ago

They're not alone in doing that though... AWS also does it, and I believe Microsoft is into it too.