This article doesn't mention TPUs anywhere. I don't think it's obvious to people outside of Google's ecosystem just how extraordinarily good the JAX + TPU ecosystem is. Google has several structural advantages over other major players, but the largest is that it rolls its own compute solution, which is actually very mature and competitive. TPUs are extremely good at both training and inference[1], especially at scale. Google's ability to tailor its mature hardware to exactly what it needs gives it a massive leg up on the competition. AI companies fundamentally have to answer the question "what can you do that no one else can?" Google's hardware advantage provides an actual answer to that question, one which can't be erased the next time someone drops a new model onto Hugging Face.

[1] https://blog.google/products/google-cloud/ironwood-tpu-age-o...
marcusb|10 months ago
> I’m forgetting something. Oh, of course, Google is also a hardware company. With its left arm, Google is fighting Nvidia in the AI chip market (both to eliminate its former GPU dependence and to eventually sell its chips to other companies). How well are they doing? They just announced the 7th version of their TPU, Ironwood. The specifications are impressive. It’s a chip made for the AI era of inference, just like Nvidia Blackwell
thunderbird120|10 months ago
krackers|10 months ago
jononor|10 months ago
fulafel|10 months ago
chermi|10 months ago
spwa4|10 months ago
mike_hearn|10 months ago
Once the space settles down, the balance might tip towards specialized accelerators, but NVIDIA has plenty of room to make specialized silicon and cut prices too. Google has yet to prove that the TPU investment is worth it.
summerlight|10 months ago
Also worth noting that Google's Ads division is the largest, heaviest user of TPUs. Thanks to them, it can flex running a bunch of different expensive models that you cannot realistically afford to serve on GPUs. The revenue delta from this is more than enough to pay off the entire investment history of the TPU.
alienthrowaway|10 months ago
So are the electricity and cooling costs at Google's scale. Improving perf-per-watt efficiency can pay for itself. The fact that they keep iterating on it suggests it's not a negative-return exercise.
foota|10 months ago
dgacmu|10 months ago
imtringued|10 months ago
Google has their own cloud with their data centers with their own custom designed hardware using their own machine learning software stack running their in-house designed neural networks.
The only thing Google is missing is designing a computer memory that is specifically tailored for machine learning. Something like processing in memory.
ENGNR|10 months ago
Google is catching up fast on product though.
jxjnskkzxxhx|10 months ago
Now for the life of me, I still haven't been able to understand what a TPU is. Is it Google's marketing term for a GPU? Or is it something different entirely?
mota7|10 months ago
So GPUs have ~120 small systolic arrays, one per SM (i.e., a tensor core), plus passable off-chip bandwidth (16 lanes of PCIe).
Whereas TPUs have one honking big systolic array, plus large amounts of off-chip bandwidth.
This roughly translates to GPUs being better if you're doing a bunch of different small-ish things in parallel, while TPUs are better if you're doing lots of large matrix multiplies.
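A rough NumPy sketch of the blocked (tiled) matrix-multiply access pattern that a systolic array pipelines in hardware. The tile size and shapes here are illustrative assumptions, not real TPU geometry; the point is that one large matmul decomposes into a stream of tile-by-tile multiply-accumulates, exactly the workload one big systolic array is built for:

```python
import numpy as np

def blocked_matmul(a, b, tile=128):
    """Compute a @ b one tile at a time.

    This is the software view of the access pattern a systolic array
    streams through in hardware (tile size is illustrative only).
    """
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((m, n), dtype=a.dtype)
    for i in range(0, m, tile):          # rows of the output
        for j in range(0, n, tile):      # columns of the output
            for p in range(0, k, tile):  # accumulate along the inner dim
                out[i:i+tile, j:j+tile] += (
                    a[i:i+tile, p:p+tile] @ b[p:p+tile, j:j+tile]
                )
    return out

a = np.random.rand(256, 256).astype(np.float32)
b = np.random.rand(256, 256).astype(np.float32)
print(np.allclose(blocked_matmul(a, b), a @ b, atol=1e-3))  # → True
```

On a GPU, many small tiles like these are farmed out across ~120 tensor cores; a TPU instead keeps one very large array fed from fast off-chip memory, which is why the big-matmul case favors it.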
317070|10 months ago
It's not a GPU, as there is no graphics hardware there anymore. Just memory and very efficient cores, capable of doing massively parallel matmuls on the memory. The instruction set is tiny, basically only capable of doing transformer operations fast.
Today, I'm not sure how much graphics an A100 GPU still can do. But I guess the answer is "too much"?
JLO64|10 months ago
https://en.wikipedia.org/wiki/Tensor_Processing_Unit
albert_e|10 months ago
But I am not sure how AWS and Google Cloud match up in terms of making this vertical integration work to their competitive advantage.
Any insight there - would be curious to read up on.
I guess Microsoft, for that matter, has also been investing -- we heard about the latest quantum breakthrough that was reported as creating a fundamentally new physical state of matter. Not sure if they also have some traction with GPUs and other chips with more immediate applications.
chazeon|10 months ago
6510|10 months ago
AlbertoRomGar|10 months ago
noosphr|10 months ago
ModernBERT with the extended context has solved natural-language web search. I mean it as no exaggeration that _everything_ Google does for search is now obsolete. The only reason Google search isn't dead yet is that it takes a while to index all web pages into a vector database.
And yet it wasn't Google that released the architecture update; it was Hugging Face, as a summer collaboration between a dozen people. Google's version came out in 2018 and languished for years because it would destroy their business model.
Google is too risk-averse to do anything, but completely doomed if it doesn't cannibalize its cash-cow product. Web search is no longer a crown jewel, but plumbing that answering services, like Perplexity, need. I don't see Google being able to pull off an iPhone moment the way Apple killed the iPod to win the next 20 years.
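A minimal sketch of the embedding-based retrieval this comment describes: encode the query and every document into vectors, then rank documents by cosine similarity. The `embed` function below is a toy stand-in (character trigrams hashed into a fixed-size vector), not ModernBERT; only the ranking structure is the point:

```python
import numpy as np

def embed(text, dim=64):
    # Toy stand-in for a real encoder such as ModernBERT:
    # hash character trigrams into a fixed-size unit vector.
    v = np.zeros(dim)
    for i in range(len(text) - 2):
        v[hash(text[i:i + 3]) % dim] += 1.0
    return v / (np.linalg.norm(v) or 1.0)

def search(query, docs):
    # Rank documents by cosine similarity to the query embedding.
    # (Vectors are unit-normalized, so a dot product is cosine similarity.)
    q = embed(query)
    scored = [(float(q @ embed(d)), d) for d in docs]
    return sorted(scored, reverse=True)

docs = [
    "jax and tpu performance",
    "gardening tips for spring",
    "tpu inference at scale",
]
for score, doc in search("tpu performance", docs):
    print(f"{score:.2f}  {doc}")
```

In a real system the document vectors are precomputed and stored in a vector database with approximate-nearest-neighbor indexing, which is the "takes a while to index all web pages" step the comment refers to.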
visarga|10 months ago
The web UI for people using search may be obsolete, but search itself is hot: all AIs need it, both web and local. That's because models don't contain recent information and are unable to reliably quote from memory.
petesergeant|10 months ago
Google's cash-cow product is relevant ads. You can display relevant ads in LLM output or natural language web-search. As long as people are interacting with a Google property, I really don't think it matters what that product is, as long as there are ad views. Also:
> Web search is no longer a crown jewel, but plumbing that answering services, like perplexity, need
This sounds like a gigantic competitive advantage if you're selling AI-based products. You don't have to give everyone access to the good search via API, just your in-house AI generator.
danpalmer|10 months ago
jampekka|10 months ago
I doubt this. Embedding models are no panacea, even for much simpler retrieval tasks like RAG.
podnami|10 months ago
marsten|10 months ago
I've been wondering for some time what sustainable advantage will end up looking like in AI. The only obvious thing is that whoever invents an AI that can remember who you are and every conversation it's had with you -- that will be a sticky product.
dash2|10 months ago
acstorage|10 months ago
unknown|10 months ago
[deleted]
retinaros|10 months ago