The maximum die size is interesting but not really the point. The context is more the complexity and capability of the chip, for which transistor count is about as good a measure as you're going to fit in the headline. The immediate subheading jumps to telling you FLOPs which is another attempt at summarizing the capabilities of the chip quickly. Once you have the info that it's large and fast the body serves to provide the detailed context. From that view the title serves to identify the primary context well - a very complex chip, come read more about it.
One basic thing I didn't see in the body was power consumption though, anyone know more details on that?
> […] AI has a massive value to every aspect of the company's work.
That's also just wrong. During the recent "Tesla AI Day", when asked during Q/A, Elon Musk specifically mentioned that they intentionally use machine learning only for very few cases:
Q: "Is Tesla using machine learning within its manufacturing, design
or any other engineering processes?"
Elon: "I discourage use of machine learning, because it's really
difficult. Unless you have to use machine learning, don't do it. It's
usually a red flag when somebody is saying 'We wanna use machine
learning to solve this task'. I'm like: That sounds like bullshit.
99.9% of the time you don't need it."
IMO that first paragraph is great, especially for readers who may not have your level of industry knowledge and technical acumen. It efficiently contextualizes the article and addresses a common complaint that I often see even on HN — the failure to clearly answer "What is this and why does this matter?"
Density is partly a function of the type of circuit. Memory is denser than random logic, for instance. Interconnect eats a lot of area and reduces density.
This chip is largely memory and multipliers, both of which are pretty dense.
Fab processes improve over time to have higher density and lower defect rate (which allows bigger chips while getting acceptable yield). So it's not surprising to see a chip on the same node but shipping a year or 2 later (than Ampere) having more transistors.
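To put rough numbers on the yield point: with a simple Poisson defect model, yield falls off exponentially with die area, so even a modest improvement in defect density makes a big difference for a 645mm² die. The defect densities below are made-up illustrative values, not actual TSMC figures:

    import math

    def poisson_yield(die_area_mm2, defects_per_cm2):
        """Fraction of dies with zero fatal defects, assuming randomly scattered defects."""
        return math.exp(-die_area_mm2 * defects_per_cm2 / 100.0)

    # Illustrative defect densities: a node at launch vs. the same node a couple of years later.
    for d0 in (0.20, 0.10):
        for area_mm2 in (100, 645, 826):
            print(f"D0={d0:.2f}/cm^2, {area_mm2}mm^2 die: yield ~{poisson_yield(area_mm2, d0):.0%}")

In this toy example the 645mm² die goes from roughly a quarter of dies good to over half just from halving the defect density, which is the mechanism behind "bigger chips while getting acceptable yield".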
I'm always curious about the decision-making process when someone decides to make their own ASIC when there are somewhat reasonable commercial alternatives. What was the advantage here for Tesla?
Rolling your own ASIC makes sense if you need to churn out enormous quantities for your own use. The actual cost of fabrication is largely weighted toward non-recurring engineering costs. Once the printing press fires up, chips are very inexpensive.
Does Tesla need tens of thousands of these things?
I think in this case, the main advantage is controlling their own destiny when it comes to building the types of models they need.
I think there's a 25% chance it doesn't get them significantly more performance than Nvidia.
There's a 50% chance they can outperform off-the-shelf chips by a significant enough amount to make it worth it. (This is pretty likely because dedicated hardware tends to outperform general-purpose hardware.)
However, there's maybe a 25% risk that buying Nvidia doesn't get them there soon enough.
So building their own chips de-risks the worst case, and it's probably not that much more expensive (at Tesla scale). So seems like a pretty good bet to me.
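Making that bet explicit as a toy expected-value calculation (the probabilities are the rough ones above; the payoff numbers are arbitrary placeholders, the point is only that the tail scenario dominates):

    # Toy build-vs-buy sketch; payoffs are arbitrary "value" units, not dollars.
    scenarios = [
        ("no significant gain over Nvidia",      0.25, 0.0),
        ("significant win over Nvidia",          0.50, 1.0),
        ("Nvidia alone wouldn't get them there", 0.25, 3.0),  # avoiding this is worth the most
    ]

    expected_value = sum(p * v for _, p, v in scenarios)
    for name, p, v in scenarios:
        print(f"{p:.0%} chance: {name} (value {v})")
    print(f"Expected value of building their own silicon: {expected_value:.2f}")

Even if a quarter of the time the effort buys nothing over Nvidia, the chance of avoiding the "couldn't have done it with off-the-shelf parts" outcome carries most of the value.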
With the recent buzz about AI semiconductor design and the drive for domestic semiconductor manufacturing, they are likely positioning themselves to be capable of investing in the next generation of fabrication. That means leverage with the government and potential funding.
Tesla actually has a lot of expertise in chip design in Pete Bannon and formerly Jim Keller. I think most people know who Jim Keller is, but if not you can read his Wikipedia page[1]. Pete Bannon is also an industry giant and worked with Jim Keller at PA Semi and subsequently Apple on their A-series chips. These two have decades of experience designing chips that went into tens of millions of devices. Tesla’s FSD computer is in hundreds of thousands of cars. They know what they’re doing.
[1] https://en.wikipedia.org/wiki/Jim_Keller_(engineer)
Question: how many chips does Tesla need to buy in order to get a reasonable unit price per chip? Obviously <10k is too small, but is 100k reasonable? 1M?
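Back-of-the-envelope: amortized unit cost is roughly NRE divided by volume plus the marginal cost of a good die. All of the dollar figures below are hypothetical placeholders (real NRE and 7nm wafer prices aren't public), but the shape of the curve is the answer to the question:

    # Hypothetical numbers for illustration only.
    NRE = 100_000_000          # design, masks, tooling, team ($)
    WAFER_COST = 10_000        # per 7nm wafer ($)
    GOOD_DIES_PER_WAFER = 40   # big die, modest yield

    marginal_cost = WAFER_COST / GOOD_DIES_PER_WAFER  # cost of each additional good die

    for volume in (10_000, 100_000, 1_000_000):
        unit_cost = NRE / volume + marginal_cost
        print(f"{volume:>9,} chips: ~${unit_cost:,.0f} each")

With these made-up numbers, NRE still dominates at 10k units, stops mattering much somewhere in the hundreds of thousands, and is noise at a million, which is roughly why <10k rarely pencils out.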
The whole point of the die on silicon seems to be that this maximizes the interface bandwidth and minimizes latency between the dies. If this is true, the next step would be to bring the multi-die modules as close as possible in three dimensions to ultimately build a Borg-cube-like structure in zero-g with a power source at its core.
I wonder how their neural network structures informed the hardware design, such as the dimensions of tensor products. Or is Dojo trying for as general purpose ML as possible? I imagine there is a tension between software and hardware teams where Karpathy's team is always changing things while the hardware team wants specs/reqs.
The "tiles of tiles" chip architecture seems like an Elon-obvious, let's just scale what we have approach. Do their neural networks map to that multiscale tiling well?
When comparing it to other large designs, I think it’s not exceptional, but also not in the back of the pack. This die is 645mm², or a square inch. We could create a wafer that size in the 1960s (https://en.wikipedia.org/wiki/Wafer_(electronics)#Standard_w.... Note these are for circular wafers, so a 1 inch wafer is about ¾ square inch), so in that sense, it isn’t a surprise that we can make such a chip.
We couldn’t put 50B transistors on a square inch in the 1960s, though. We can now. https://en.wikipedia.org/wiki/Transistor_count lists several larger designs.
So, the engineering is impressive, but not spectacular.
Also, this being a grid of interconnected CPUs means the design is simpler than a single design filling the entire die would be. It’s ‘just’ repeating the same design over and over (possibly with some small variations near the edge).
Of course, looking at it without knowledge of the state of the art, it is astounding that we can even think of constructing machines with 50 billion working parts.
Die size is 645mm² on 7nm. This is important because we know the reticle limit, which is around ~850mm² (26mm × 33mm).
Nvidia's A100 has 54 billion transistors with a die size of 826mm² on 7nm.
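Quick density sanity check from those numbers (the ~50B transistor count for the Dojo chip is the article's headline figure; the A100 figures are the commonly quoted ones):

    # Transistor density from the die sizes and counts quoted above.
    chips = {
        "Tesla Dojo chip (7nm)": (50e9, 645),   # (transistors, die area in mm^2)
        "Nvidia A100 (7nm)":     (54e9, 826),
    }
    for name, (transistors, area_mm2) in chips.items():
        print(f"{name}: {transistors / area_mm2 / 1e6:.0f}M transistors/mm^2")

So despite the smaller die it lands at a somewhat higher density (~78M vs ~65M transistors/mm²), which fits the observation elsewhere in the thread that it's largely memory and multipliers.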
I recently saw a TED Talk, "If Content is King, then Context is God". I think it captures everything that is wrong in today's society.
What would Tom's Hardware lose if they left out these cheap filler words?
Should I also start writing like this?
Is this type of "reader-hostile writing" a new thing, or have newspapers always written like this?
These are not rhetorical questions. I am honestly confused.
Honest question: how much is chip design a factor separate from the fab process?
The WSE-2 is obviously much larger, but I would also think it can result in a large performance boost given everything is on a single chip.
It says TSMC 7nm - is that DUV or EUVL?