I found this to be a pretty confusing article. I get that they are analyzing the new node, which is great, but the editorializing seems a bit premature to me.
I also don't think the author understood the TSMC presentation. TSMC clearly said that it used a "model" of a typical SoC of 60% logic, 30% SRAM, and 10% analog. Then they said that for each category you could expect 1.8x, 1.35x, and 1.2x of shrink, respectively. If you do the math (each category's area scales by the reciprocal of its shrink), the overall shrink for a 'typical' SoC that conforms to their model would be approximately 1.57x.
That Apple achieved 1.49x would suggest they got 95% of the process shrink effectiveness.
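The 1.57x figure is an area-weighted combination of reciprocals rather than a simple weighted average; a quick sketch of the arithmetic, using only the mix and shrink numbers from the comment above:

```python
# Sanity-check of TSMC's composite shrink, using the stated mix
# (60% logic, 30% SRAM, 10% analog) and per-category area shrinks
# (1.8x, 1.35x, 1.2x). Each slice's new area is its old share
# divided by its shrink factor.
mix    = {"logic": 0.60, "sram": 0.30, "analog": 0.10}
shrink = {"logic": 1.80, "sram": 1.35, "analog": 1.20}

new_area  = sum(mix[k] / shrink[k] for k in mix)  # fraction of old area
composite = 1 / new_area                          # overall shrink factor

print(f"composite shrink: {composite:.2f}x")              # 1.57x
print(f"Apple's 1.49x is {1.49 / composite:.0%} of that")  # 95%
```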
Then there is the cost per die, and thus cost per transistor, discussion. It is important to remember that this is likely the most expensive these wafers will ever be. The reasoning for that statement is that during a process node's life-cycle, the cost per wafer is set initially to capture "early movers" (who value density over cost). Much like any product where competition will emerge later, there is a window early on to recapture value, which can pay back your R&D and capital equipment investments. As a result, the vendor sets the price as high as possible to make that repayment happen as quickly as possible. Once paid back, the price provides profit as long as it can be supported in the presence of other competitors (in this case, I would guess that role is played by Samsung). The GSA used to publish price surveys of wafers on various nodes over time, but they don't seem to do that any more. So the cost per transistor will go down from this point, but how much depends on how much margin is in the current wafer price.
So I agree that the cost per transistor is not going down as quickly as it has in the past, and it's possible that this node may never get as low as the previous node. I'm curious how it compares when you look at the 7nm introduction price per transistor vs today's price per transistor, and what it would mean if you get the same ramp with the 5nm node.
Apple is also probably using more SRAM than the "typical SOC", for example the A13 has much more cache than say the Snapdragon 855. I don't know the relative die areas but that would let you make a decent estimate.
TSMC has a monopoly on the highest performance processes - so the price won't drop till there is competition. Seeing how the competition is falling behind, that will be a while...
And the economic model now requires the leading-edge fab to capture that value over a longer period of time. What used to be two years will lengthen to closer to three.
>I'm curious how it compares when you look at 7nm introduction price per transistor vs todays price per transistor. And if you get the same ramp with the 5nm node what that would mean.
First-gen N7 was ~$10K+ per wafer, while N5 is around ~$13K with higher yield compared to N7 at the same stage. So N5 should still end up cheaper per transistor.
Was I the only one disappointed in the decreasing die size? In our solid state past, it was about cramming more in there. All this cost-cutting leaving gaps... it wouldn’t pass the Jobs’ fish tank test.
I really appreciate this analysis and the straightforward top-line number of 134e6/mm^2. The usual "node" figure is utterly meaningless; an electrical engineer couldn't care less about "feature" size (whatever that means). What is the count of discrete components in a given area? There are 40 billion 5nm (the supposed "node" of these chips) squares in a square millimeter of area. That's more than two orders of magnitude denser than 134 million.
The meaningful achievement is how many discrete electrical components are composed into a given area. Not some arbitrary dimension of some cherry picked subset of these components.
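The back-of-the-envelope in the comment above, spelled out (134e6/mm^2 is the article's measured density for the A14):

```python
# How many 5nm x 5nm squares fit in one square millimeter,
# versus the measured transistor density. 1 mm = 1e6 nm.
nm_per_mm   = 1e6
naive_sites = nm_per_mm**2 / 5**2  # 5nm squares per mm^2
measured    = 134e6                # transistors per mm^2 from the article

print(naive_sites)             # 4e10, i.e. 40 billion
print(naive_sites / measured)  # ~300x, a bit over two orders of magnitude
```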
> The meaningful achievement is how many discrete electrical components are composed into a given area. Not some arbitrary dimension of some cherry picked subset of these components.
I disagree. The meaningful achievement is how power-efficient, fast, and cheap you can make a given chip. (Secondarily, how small and how durable wrt cosmic rays; but for most purposes these are not super important.)
If that follows as a result of many discrete electrical components being packed into a small area, great; but the latter isn't intrinsically interesting.
These things aren't _really_ two-dimensional. They're not really three-dimensional either, but they are objects built out of layers of two-dimensional things. When you measure number of transistors per unit area you will inevitably see something more dense than the "number of 5nm square in a millimeter of area".
It's the silicon equivalent of measuring one's BMI.
A cycle or two ago I asked this same question. Some people seemed to think that such a number would be gamed, but if that's a valid barrier to entry then we will never accomplish anything again, really.
As long as the measure is in mm^2, not mm^3, I think (hope?) that would avoid any perverse incentives against breakthroughs that allow you to add more layers to a chip and still maintain yields.
> The meaningful achievement is how many discrete electrical components are composed into a given area.
Is average over an area the meaningful achievement or is the meaningful achievement the smallest individual gate length? Neither are super useful without additional context.
"As of 2019, the highest transistor count in any IC chip is Samsung's 1 TB eUFS (3D-stacked) V-NAND flash memory chip, with 2 trillion floating-gate MOSFETs (4 bits per transistor)." [1]
The funny thing is, this isn't really Apple's achievement. It's TSMC's achievement. It's also Intel's failure, in that they failed to get a similar process working on schedule.
I think this is cool if they’re the future of the Mac, but aren’t these chips just wasted in the iPad? I say this as an iPad Pro (1st-gen) owner ... it’s already way more processing power than I can really use, because after trying for months to get a sensible workflow going (mostly based around Pythonista, Editorial and RealVNC) I’ve relegated it to a OneNote and Netflix machine. What’s the point other than the coolness factor? Is it some future AR-type application?
I was just looking into optimising a large web application.
I told the customer that using a caching layer such as a CDN would help paper over the worst of the inefficiencies in their application and the network stack.
That was true! The download times halved.
However, benchmarks with F12 developer tools showed that while downloads reduced from 200ms to 100ms, the overall load times were still seconds, of which about 50% was actual CPU time.
(This is typical of large, complex sites built by non-experts or organisations where performance is not a primary concern.)
Web sites like these perform very noticeably better if you have more CPU cores or faster CPU cores.
Essentially, CDNs are easy to deploy and 5G provides nearly gigabit download speeds, so the bottleneck has shifted back to the CPU for a lot of the web.
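A toy model of why that happens. The numbers here are made up but shaped like the comment above: downloads halved (200ms to 100ms) while CPU-bound work dominates a multi-second load, so the overall win from the CDN alone is tiny:

```python
# Illustrative page-load breakdown: halving download time barely moves
# the total once CPU time dominates. All numbers are hypothetical.
def total_load_ms(download, cpu, other):
    return download + cpu + other

before = total_load_ms(download=200, cpu=1500, other=1300)  # 3000 ms
after  = total_load_ms(download=100, cpu=1500, other=1300)  # 2900 ms

print(f"CDN-only speedup: {before / after:.2f}x")  # 1.03x overall
```

Which is why faster cores move the needle more than faster networks for sites like these.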
I sold my 2080ti and am playing Eve Echoes (new, mobile version of EVE Online) and Among Us, and Civ VI on the 2nd gen iPad Pro I bought used with less than half of the proceeds. The battery lasts a good long time and it feels pretty good to be able to play these full sized games on a screen like this. When I'm not using it the toddler watches some kids shows, and with the keyboard case I can type almost as quickly as I can on my 15" MacBook Pro. If it had Xcode I'd be set, honestly.
My understanding from talking to friends who work at Apple and some of the product briefs is a lot of the hardware is used for video and photo processing. Video games take advantage of the extra CPU/GPU power as well.
I don’t think that’s all that Apple specific as there are increased ISPs and security chips on other phones as well.
I work on a x86 interpreter that is only usable because of Apple's massive investment into these chips over the last couple years. I doubt that is what Apple was designing them for, but they are very useful to have as an application developer.
Audio and Video work will take all the transistors you can throw at it. Things like GarageBand and 3rd party software synthesizers can easily max out even latest gen iPad Pros. Not to mention games…
Apple's SoCs work really great and are impressive, but I don't know why I need such specs in a phone. Even the A12 is still great for most workloads. Apple improves camera functionality with this performance, but just the camera? I wish they'd find useful applications other than cameras.
I bought my iPad Pro because it offered the smoothest RAW photo proofing process of any product I've used. A lot of this is obviously up to software as well as hardware, but still - the beefy processor is useful here.
The article says that Apple isn't making full use of the 5nm process node, and blames lack of SRAM scaling for it (presumably due to the large amount of L3 cache). Is this a problem that all processors are about to hit, or is this going to be overcome once process engineers are more familiar with 5nm?
This bit didn't make a lot of sense to me. SRAM is routinely the MOST optimized and MOST worried-over aspect of silicon layout. SRAM cell architectures get hand-optimized carefully years in advance of any process improvements. They aren't just logic that gets spit out of generic tooling.
So while it would make sense that a new process would have new design rules that didn't map well to older EDA tooling, it's harder to see that argument holding for SRAM. If SRAM isn't scaling, why is general logic?
They optimize SRAM far in advance of introducing the process, after all this is one of the most critical components. It's very unlikely you'll see any further SRAM improvement on this process, but a process optimization can improve it (and everything else).
As someone with an 8 year old MacBook Pro that isn’t all that much slower than my 2019 16”.... I am so glad Apple is still pushing the envelope.
Back in the old days we used to get a doubling of performance every year or two. Then Intel got into trouble and has been producing the same chips for what 4-5 years? Essentially just minor changes.
It’s not really Apple’s fault that the rest of the industry is lagging so badly on performance. My Nokia was fast enough, of course no one “needs” a faster phone. But it only appears so much faster because the rest of the industry has stagnated.
Had Intel kept their tick-tock cadence, then these new "faster" phones would hardly be considered fast.
My 2013 high end 15" MBP gets pretty hot when I'm watching 1080p or higher videos. Chrome+YouTube is even worse, it even drops frames. Compiling Swift is even worse than that.
I have a work-provided 16" MBP that does all these things effortlessly.
I was curious and looked up how this compared to the computer my family had 20 years ago: a Dell XPS tower with a Pentium III. The A14 has more than 1,000x as many transistors as what Intel was doing 20 years ago on a 128mm^2 die. Pretty incredible.
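A rough check of the "1,000x" figure, using the commonly cited transistor counts (~9.5M for the Katmai Pentium III, ~11.8B for the A14):

```python
# Ratio of A14 to Pentium III (Katmai) transistor counts,
# both as commonly reported.
pentium_iii = 9.5e6
a14         = 11.8e9

print(a14 / pentium_iii)  # ~1242x, comfortably over 1,000x
```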
I was curious just how many chips per wafer they net. From this 2018 article[1], it appears to be ~530 at 5nm, which would be $32ea if that yield is accurate. The same article estimated 7nm chips came out to $18ea.
Apparently, R&D spend went up 50% from 7nm to 5nm. I'm curious to see how many flavors of the A14 Apple cooks up given the comparatively high die cost.
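The ~530 figure can be back-derived with the classic die-per-wafer estimate. A sketch using the A14's reported ~88 mm^2 die on a 300 mm wafer; the 0.72 yield and $17K wafer price are my assumptions, chosen to land near the cited ~530 dies at ~$32 each, not published numbers:

```python
import math

def gross_dies_per_wafer(wafer_mm=300, die_mm2=88):
    """Classic gross die-per-wafer estimate (scribe lines ignored)."""
    r = wafer_mm / 2
    return math.floor(math.pi * r**2 / die_mm2
                      - math.pi * wafer_mm / math.sqrt(2 * die_mm2))

gross = gross_dies_per_wafer()  # A14 die is ~88 mm^2
yield_rate = 0.72               # assumed, not a TSMC number
good = int(gross * yield_rate)

print(gross, good)                          # 732 527
print(f"${17000 / good:.0f} per good die")  # ~$32 at a hypothetical $17K wafer
```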
What's the relative heat dissipation between logic & SRAM on a chip? The article talks about layering as a potential way forward for SRAM, but that would come with more complex TDP management, unless SRAM isn't burning watts at the same rate as the rest of the chip.
For comparison I was trying to nail down what Intel's 7nm process is, but I've found answers between 120nm and 215nm. If anyone knows, that'd be an interesting comparison.
How about bit count? (at 4 bits per transistor)
[1] https://en.wikipedia.org/wiki/Transistor_count
Even that number is not so straightforward.
Tr/mm^2 = 0.6 * (NAND2 Tr/mm^2) + 0.4 * (Scan Flip-flop Tr/mm^2)
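For concreteness, a minimal sketch of evaluating that weighted-cell metric; the two cell densities below are hypothetical placeholders, not real N5 figures:

```python
# Weighted transistor-density metric: 60% NAND2 cells, 40% scan
# flip-flop cells. Both density inputs are made-up placeholders.
nand2_density = 180e6  # NAND2 transistors per mm^2 (hypothetical)
sff_density   = 100e6  # scan flip-flop transistors per mm^2 (hypothetical)

density = 0.6 * nand2_density + 0.4 * sff_density
print(f"{density / 1e6:.0f} MTr/mm^2")  # 148 MTr/mm^2 for these inputs
```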
- Less heat produced, which allows for smaller heatsinks and, as a result, smaller devices or bigger components
- Performance bottlenecks become less likely
- Features like a 120Hz display refresh rate become possible without the user noticing degradation
- Faster OS boot and app starts
- Ability to do more work locally vs. using a server (see the many ML features)
- Tech debt in system code is less of an issue - code can be shipped early and optimized later
Also, one can play games on an iPad, and games will soak up arbitrary amounts of CPU and GPU power.
There's a lot of corner cases there where certain consumers value it a lot.
Also, performance per watt is a huge deal. The expanded power envelope that improved PPW brings allows 120Hz displays, which are battery hogs.
So Genshin Impact can look and run beautifully XD
https://en.wikipedia.org/wiki/Transistor_count
Has a big table listing a number of historical CPUs.
[1] https://wccftech.com/apple-5nm-3nm-cost-transistors/
See e.g. https://www.cultofmac.com/658650/iphone-11-pro-max-productio...
> Apple A13 processor that supposedly is priced at $64.
Can't their other clients reach the same density?