
Apple's Cyclone Microarchitecture Detailed

267 points| pinaceae | 12 years ago |anandtech.com | reply

112 comments

[+] fidotron|12 years ago|reply
Apple almost certainly have a MacBook Air based on one of these chips at prototype stage.

There are many reasons in favour, including keeping negotiations with Intel over price interesting, but probably the main thing preventing them running with it would be an inability to manufacture the chips fast enough, which is a major concern in mobile land. This is why lots of Android devices exist in variations based on different chips as hedges by the device manufacturer on long term availability.

[+] mx12|12 years ago|reply
I don't think that Apple will completely get away from x86 for a long time. Attempting to emulate x86 on ARM would be terribly slow.

I do, however, think that they will eventually include both ARM and x86 processors in the MacBook Air. That way, backwards compatibility is preserved and low-power apps can run on the ARM. The current MacBook Pros already switch GPUs dynamically, so there's no reason they couldn't use the ARM as a coprocessor or even run the full OS on it.

Here are a few technical points:

* LLVM - You can compile once for both architectures and then the JIT will take over compiling for the specific architecture (take a look at LLVM in the OpenGL pipeline in OS X).

* Full screen app - When an app is full screen (if it's a low-power app), the x86 could sleep and the ARM could switch to running the app.

* App Nap - If all x86 apps are asleep, switch over to running on the ARM processor exclusively.

* Saving state - It's possible to save the state of apps in OS X; a similar mechanism could be used to seamlessly migrate between processors.
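
The switching idea in the points above can be sketched as a small policy function. This is a minimal illustration only: `App`, its fields, and `pick_processor` are hypothetical names, and nothing here reflects an actual OS X mechanism.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    needs_x86: bool = False   # hypothetical: app cannot run on the ARM
    napping: bool = False     # hypothetical stand-in for App Nap
    fullscreen: bool = False


def pick_processor(apps):
    """Return "arm" or "x86" for a hypothetical dual-processor laptop."""
    awake = [a for a in apps if not a.napping]
    # Full screen rule: if only low-power apps own the screen, the x86 can sleep.
    fullscreen = [a for a in awake if a.fullscreen]
    if fullscreen and not any(a.needs_x86 for a in fullscreen):
        return "arm"
    # App Nap rule: if every x86 app is asleep, run on the ARM alone.
    if not any(a.needs_x86 for a in awake):
        return "arm"
    return "x86"
```

The hard part, migrating live process state between two instruction sets, is deliberately left out; the sketch only covers the decision of which processor should be awake.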

This is pure speculation, but it is feasible. There would be many technical challenges that Apple would have to solve, but they are capable. The advantage Apple has is that they have absolute control over both platforms.

[+] sliverstorm|12 years ago|reply
I would think porting OS X to ARM would be the real hurdle? Not impossible by any means, but the lion's share of the challenge.
[+] lallysingh|12 years ago|reply
If they wanted to, sure. They have all the infrastructure ("fat binaries") from the PPC switch.
[+] BugBrother|12 years ago|reply
The volumes for desktop/laptop are almost trivial compared to the volumes for handheld. (At least for Apple.) So making the processors in volume for a Macbook Air would probably not be a problem.

Also, Apple probably have little problem getting preferential treatment from manufacturers.

[+] sanxiyn|12 years ago|reply
"Cyclone is a wide machine. It can decode, issue, execute and retire up to 6 instructions/micro-ops per clock."

This is truly amazing. In comparison, Cortex-A15 can issue 3 instructions per cycle, for example.

[+] higherpurpose|12 years ago|reply
Nvidia's Denver is said to have 7. It's also rumored to have 2x the performance of Cortex-A15. If that's true, it could reach Sandy Bridge performance levels in its first or second generation (on 16nm FinFET) and Haswell/Broadwell levels by the third generation (most people don't realize that the performance difference between Haswell/Broadwell and Sandy Bridge is only about 15-20 percent, since Intel stopped focusing on performance). I also think that in the second-generation Denver SoC (two process nodes on from Tegra K1), Nvidia will have at least a 1 Tflops, possibly a 1.2 Tflops, GPU (Xbox One level).
[+] userbinator|12 years ago|reply
Keyword being "up to". There's a reason Intel hasn't really upped the issue width much, despite x86 having higher code density than ARM -- memory bandwidth. It's hard to keep wide execution cores saturated.
[+] marcosdumay|12 years ago|reply
Watching those Mill presentations just spoiled me...
[+] jpwright|12 years ago|reply
Cortex-A15 can also run at nearly double the clock rate though: 2.5 GHz vs 1.4 GHz.
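
To put the width-versus-clock trade-off into rough numbers, here is a back-of-the-envelope sketch using only the figures quoted in this thread (6-wide at 1.4 GHz, 3-wide at 2.5 GHz). It computes a theoretical peak issue rate, which real workloads will not sustain; the function name is just for illustration.

```python
def peak_issue_rate(issue_width, clock_ghz):
    """Theoretical upper bound on instructions issued per second."""
    return issue_width * clock_ghz * 1e9


# Figures as quoted in the thread, not measured values.
cyclone = peak_issue_rate(issue_width=6, clock_ghz=1.4)
cortex_a15 = peak_issue_rate(issue_width=3, clock_ghz=2.5)

# Ratio of the two peaks; above 1.0 means the wider, slower core has the higher bound.
ratio = cyclone / cortex_a15
print(f"Cyclone peak / Cortex-A15 peak: {ratio:.2f}")
```

On these numbers the wider core keeps a slightly higher peak despite the lower clock, but peak issue rate says nothing about how often either core can actually fill its issue slots.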
[+] watersb|12 years ago|reply
Anand also taught us that an iPad draws about as much power as an 11-inch MacBook Air.

Configure an ARM platform with 8 GB of DDR3 DRAM and a PCIe-based SSD, and you may well blow the power budget versus the current Intel platform design.

Intel has turned its attention to total platform power: moving the VRMs on-die, and also identifying third-party motherboard components which draw silly amounts of power.

I think that Intel is on track here. Apple will of course push their design talent in order to deliver the best mobile devices they possibly can. But this does not necessarily mean a move away from Intel for a laptop or desktop.

[+] tdicola|12 years ago|reply
Very cool analysis of the new chip. If I were Intel I would be more than a little concerned about what Apple is up to here. Would be very interesting to see what Cyclone can do in a proper laptop/desktop with more RAM.
[+] gatehouse|12 years ago|reply
I'm impressed with Apple's work and it reaches all the way down to the user experience (I can play GTA San Andreas on my 5s), but the pace of Intel's research over the last 10 years has been absolutely demonic. I think for Apple to compete on that front would require an extreme investment.

For example see the second chart here: http://preshing.com/20120208/a-look-back-at-single-threaded-...

[+] jobu|12 years ago|reply
It really seems like Apple is ahead of the curve on mobile R&D. I wonder if they'll ever consider reselling some of these chips. It's unlikely Steve Jobs would've ever done it, but I'm not so sure about Tim Cook.
[+] kevinchen|12 years ago|reply
Apple bought PA Semi so they could be in charge of chip design and not have to share with anyone else. It's a competitive advantage, and Tim Cook isn't stupid. He won't license away a crown jewel for a few short-term dollars.
[+] rsynnott|12 years ago|reply
People mightn't want to buy them if they did. These chips are _big_, which means that they're expensive to make. Apple can get away with it because their only costs are manufacture and licensing, and because they have high margins which can absorb a bit of a hit. If they were selling them, though, they'd presumably want to make a profit, and that profit would make the end product almost certainly the most expensive mobile chip on the market.

From a marketing point of view, too, it'd be a hard sell in Android-land. Apple has been very careful to steer clear of spec-oriented marketing, but can you imagine the less-sophisticated enthusiast market's response to, say, the Galaxy S6 using a dual-core 1.3GHz chip instead of a quad-core 2GHz?

[+] rmrfrmrf|12 years ago|reply
IMO it's unlikely. It seems like Apple has to go through a lot to meet their own manufacturing needs, so ramping up production just for third parties would put even more strain on the already-large demand.

Plus, from a market standpoint, Cyclone was developed to meet Apple's own needs. To sell these chips would introduce a "demand" variable into the equation that I think would stifle development.

At any rate, this is a really interesting topic because I think that Apple's oft-criticized isolation actually worked much to its own benefit here.

[+] chucknelson|12 years ago|reply
Pretty impressive that Apple is capable of such CPU disruption with just a few small acquisitions. Is PA Semi the main reason for this, or could Apple have been building up a CPU design team for years before that acquisition?
[+] bryanlarsen|12 years ago|reply
Is a $270 million acquisition a small acquisition these days? Not to mention the fact that they acquired a team responsible for two large CPU disruptions: DEC Alpha and StrongARM. A third disruption is perhaps not so surprising.
[+] ddebernardy|12 years ago|reply
If memory serves, Cook mentioned that Apple's semiconductor division was larger than Intel during one of the last investor conference calls.
[+] stonemetal|12 years ago|reply
"There's little benefit in going substantially wider than Cyclone, but there's still a ton of room to improve performance."

Didn't AMD and Intel more or less say the same thing about 3-wide? No real benefit from going wider. Is that because of the differences in microarchitecture, or is it more about getting a little more performance without having to ramp the clock speed? What makes 6-wide good for ARM but not x86 or x64?

[+] fournm|12 years ago|reply
This is probably completely and utterly wrong as it's just a guess, but potentially the Thumb [1] instructions (small, limited subset of shorter instructions in ARM) might allow for a wider setup. Thumb instructions make a bunch of simplifying assumptions that might remove some of the issues with going wider. Not that I have any idea if Cyclone even supports them in the first place.

[1] http://en.wikipedia.org/wiki/ARM_architecture#Thumb

[+] leejoramo|12 years ago|reply
On balance, I think that Apple would use most of the energy savings from reducing to 20nm for other things: longer life and lower weight for the battery, improvements in the camera, wireless, increased RAM.
[+] antimagic|12 years ago|reply
In the iPhone, definitely. The iPad already has more than adequate battery life (most people get several days' use out of one before needing to recharge), so the extra clock speed could be used to handle multiple apps on-screen simultaneously, for example.
[+] JTenerife|12 years ago|reply
Good on Apple! That's the best that can happen to us customers. Even for an Android / Windows guy like me :-).
[+] szatkus|12 years ago|reply
I wonder if Nvidia's Denver could match this chip.
[+] NextUserName|12 years ago|reply
Congrats, Apple, on all your achievements. It is amazing what can be accomplished when you employ Chinese engineers and manufacturers whose tech espionage (they steal trade secrets) is one of their best-known traits/assets.
[+] axman6|12 years ago|reply
Proof? Even informed speculation?

Apple have made several acquisitions in the last few years to give them the technology they need to make these sorts of developments. Even if some of the tech had been stolen, it would require a lot of work to put it into practice.

I know I'm feeding the troll, but I couldn't help it; this is just too ridiculous.

[+] raverbashing|12 years ago|reply
I am thinking that this is shaping up to be what the G5 aimed to be.

Unfortunately the ARM architecture, even with those optimizations, is probably slower "clock by clock" compared with x86/PPC.

But yes, I think this is something Apple is probably testing (desktop Mac OS X on ARM).