I'm excited for what this will do to the cost of dedicated servers in ~1 year.
Also, as a person who used to work at Intel, I don't know whose idea this was, but that person should probably take a long, hard look at themselves -- hardware people are exactly the people this kind of shit wouldn't fly with, because they'll almost always ask for details and can spot a hack from a mile away.
On the one hand I can sympathize with Intel -- seeing how tough it is to stay on top of the market year over year, trying to predict and start developing the next trend in hardware. But on the other hand... why in the world would you do this? Intel basically dominates the high-end market right now; just take your time and make a properly better product.
> I'm excited for what this will do to the cost of dedicated servers in ~1 year.
This is the opposite though?
Dedicated servers are turning into HEDTs. AMD's 32-core EPYC has been available since last year, and Intel's 28-core Skylake (albeit at $10,000) has also been available for a year.
So dedicated servers got this tech first, and then HEDT got it a bit later. I guess Threadripper 2 is Zen+, so technically HEDT gets the 12nm process first, but the 32-core infrastructure was in EPYC first.
I get that Intel feels threatened by AMD. They are trying to impress the consumers... but bullshitting a demo is a very bad move! When a consumer decides to build a new PC, the characteristics of the product matter, but so does the reputation of the company that manufactures it. Right now Intel is putting too much effort into sketchy marketing practices: it undermines the actual work being done on their processors by some very talented people.
Presenting it as an extreme overclocking demo would have been a much wiser option.
Unfortunately it might work. With today's news cycles, an average consumer may have noticed the headline about Intel's 5GHz 28-core monster, and that's it. Follow up articles aren't as interesting.
That would be akin to running an i7 8086K at 7.2GHz... just add LN2.
What they presented was an extreme overclocking setup, complete with insulation (to deal with sub-ambient condensation), a one-horsepower chiller running on a banned refrigerant, and a 4-second benchmark (and, leaving aside that Cinebench is free, it is not a true testament to overclocking prowess). Such a demo is as pointless as it is impractical -- about as practical as daily LN2 use.
I can imagine a few cases in HFT where first-to-ring-the-bell performance on a single core determines whether you get a specific quote, but that's about it.
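For scale, the one-horsepower chiller alone gives the game away. One HP is roughly 746 W of electrical input, and a vapor-compression chiller moves several times its input power as heat, which is in the ballpark of what a 28-core chip pushed to 5GHz would dissipate. A rough sketch (the COP figure is an assumption, not a spec):

```python
HP_TO_WATTS = 745.7  # 1 mechanical horsepower in watts

def heat_moved_watts(compressor_hp, cop):
    """Heat a chiller can remove: electrical input times coefficient of performance."""
    return compressor_hp * HP_TO_WATTS * cop

# Assumed COP of ~2.5 for a small sub-ambient chiller.
capacity = heat_moved_watts(1, 2.5)
print(f"~{capacity:.0f} W of cooling capacity")  # far beyond any retail air or water cooler
```

In other words, the cooling budget on stage was an order of magnitude beyond what any shipping retail cooler handles.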
It was always going to be that way. Threadripper 2 is just Threadripper with the two blank spacers under the IHS replaced with two more Zen dies, plus a fab process change from 14nm to 12nm (but no architectural changes). It's very much what would be a "tick" in Intel CPU terminology.
I recall reading that Threadripper 2 is supposed to use more power, so early motherboards with that socket may not be able to handle it. Just something to keep in mind.
AMD's forward compatibility is one of the most amazing things to me, so much so that I still can't quite believe AM4 is going to remain viable for future CPUs for a while yet, despite its age.
It's also a big reason I'm not going with Intel, since I know I can upgrade to something significantly better without having to get a new motherboard.
I just recently replaced my old i7 920 in the homeserver with an AMD Ryzen 5 2600. Really like it so far. Price / performance is great. This is my first AMD since probably ever....
Two things I don't like: first, their CPUs are pin-based, which seemed kind of old-fashioned after Intel CPUs -- but that's really a minor thing. The other issue is that memory compatibility is a bit finicky. Maybe it has to do with the CPU being so new; not sure.
Ryzen chips scale in performance more than Intel when you overclock the RAM. Some part of the chip cache is more tightly coupled to the RAM latency, and my rampant speculation is that Intel doesn’t really care about memory bandwidth as much on the desktop anyways.
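One concrete mechanism behind this: on first-generation Zen, the Infinity Fabric clock (FCLK) runs 1:1 with the effective memory clock, so faster RAM speeds up inter-CCX traffic, not just DRAM access. A minimal sketch of the relationship (the 1:1 coupling is the documented Zen/Zen+ behavior; treat the exact numbers as illustrative):

```python
# On Zen / Zen+, FCLK (Infinity Fabric clock) == MEMCLK (effective memory clock).
# The DDR transfer rate is double the memory clock, hence the divide by 2.

def fabric_clock_mhz(ddr_rate):
    """DDR4-xxxx transfer rate -> memory clock -> fabric clock (1:1 on Zen 1)."""
    memclk = ddr_rate / 2
    return memclk

for ddr in (2133, 2666, 3200, 3466):
    print(f"DDR4-{ddr}: FCLK = {fabric_clock_mhz(ddr):.0f} MHz")
```

So going from DDR4-2133 to DDR4-3200 raises the fabric clock from 1066 to 1600 MHz, which directly cuts core-to-core and cache-to-cache latency between CCXes.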
As an outsider to 'enterprise-grade' computing, I'm curious about situations where a high number of cores in a single processor would be superior to multiple processors with the same total energy draw sitting on a single motherboard?
I can understand HPC applications where the high-speed interconnect on the chip would make a big difference.
But in business applications where the cores are dedicated to running independent VMs, or are handling independent client requests, what is really gained? There would still be some benefits from a shared cache, but how large quantitatively would that be?
I am on POWER8 at work; the wiki article [1] gives a great description of the advantages of many cores per chip, though ours only has 6/12 cores. Part of our hardware configuration when migrating from POWER7 to POWER8 was to have 40GB of memory available per core; I think POWER7 was 30GB. We use this in the iSeries environment, but we have pSeries machines with the same hardware running AIX/Oracle and POWER7 VMs running many *nix implementations.
In my use case, the core/thread count really helps DB2's SQL implementation, as an iSeries is effectively a giant DB2 database with extras added on. Hence the query engine (SQE/CQE; see the old doc [2]) on our machine can make great use of many cores/threads. When serving data to intensive batch applications, as well as thousands of warehouse users and double that through web services, access to data is the name of the game.
NUMA. Latency between sockets is far higher than in a single socket. If your workload is truly wholly independent threads as you've described, then it's quite possible there is no benefit. (Although, sibling comments bring up good points about licensing fees.)
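A back-of-the-envelope way to see the NUMA cost: model effective memory latency as a blend of local and remote accesses. The latency figures below are illustrative assumptions, not measurements of any particular platform:

```python
def effective_latency_ns(local_ns, remote_ns, remote_fraction):
    """Average memory latency when a fraction of accesses cross the socket interconnect."""
    return local_ns * (1 - remote_fraction) + remote_ns * remote_fraction

# Assumed figures: ~90 ns to local DRAM, ~140 ns across sockets.
local, remote = 90.0, 140.0
for frac in (0.0, 0.25, 0.5):
    print(f"{frac:.0%} remote -> {effective_latency_ns(local, remote, frac):.1f} ns")
```

Even a modest fraction of remote accesses drags average latency up noticeably, which is exactly what a single big socket avoids.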
First is that a single-socket motherboard is still a simpler design to produce with all the advantages that entails.
Second is that you’re allowed to stick two of these on a two-socket board for CPU-bound loads. Better density for when you have the thermal capacity to spare.
> As an outsider to 'enterprise-grade' computing, I'm curious about situations where a high number of cores in a single processor would be superior to multiple processors with the same total energy draw sitting on a single motherboard?
Databases are the big one I'm aware of.
Intel's L3 cache is truly unified. On Intel's 28-core Skylake, that means a database truly sees 38.5MB of L3: when any core requests data, it goes into one giant distributed L3 cache that all cores can access efficiently.
AMD's L3 cache, however, is a network of 8MB chunks. Sure, there's 64MB of L3 in its 32-core system, but any one core can only use its local 8MB effectively.
In fact, pulling data out of a "remote" L3 cache has higher latency than pulling it from RAM on the Threadripper/EPYC platform. (A remote L3 pull has to coordinate over Infinity Fabric and remain coherent: under the MESI protocol, that means invalidating other copies and waiting to become the exclusive owner before a core can start writing to an L3 cache line. AMD actually uses something more complex and efficient, but my point is that cache coherence has a cost, and it becomes clear in this case.) That doesn't bode well for HPC applications, but also not for databases, which are effectively limited to 8MB per thread with poor sharing, at least compared to Xeon.
Of course, databases might just be the most common "HPC" application in the enterprise that needs communication and coordination between threads.
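The coherence cost described above can be sketched with a toy model: before a core may write a line that other cores hold, every other copy must be invalidated, and each invalidation is an interconnect message. This is a deliberately simplified MESI illustration, not AMD's actual (MOESI-derived) protocol:

```python
# Toy MESI sketch: a write upgrade must invalidate all other copies first.
# Each invalidation stands in for a round trip over the interconnect.

class Line:
    def __init__(self, n_cores):
        self.state = ["I"] * n_cores  # one of M/E/S/I per core

    def read(self, core):
        if self.state[core] == "I":
            self.state[core] = "S"    # fetch a shared copy
        for c, s in enumerate(self.state):
            if c != core and s in ("M", "E"):
                self.state[c] = "S"   # downgrade any exclusive/modified holder

    def write(self, core):
        msgs = 0
        for c, s in enumerate(self.state):
            if c != core and s != "I":
                self.state[c] = "I"   # invalidate every remote copy...
                msgs += 1             # ...at the cost of one message each
        self.state[core] = "M"
        return msgs

line = Line(4)
for c in range(4):
    line.read(c)                      # all four cores share the line
print("invalidations on write:", line.write(0))  # prints 3
```

The more widely a hot line is shared, the more expensive the first write to it becomes -- and on a chiplet design those messages cross Infinity Fabric rather than staying on one die.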
Which one of these companies does a better job with free/libre software? I've always had a soft spot for AMD because it's the underdog, but I want to make sure they're free, too.
They're about the same. Both contribute to Linux. Intel's GPU drivers are more complete and open, but their GPUs are not in the same league as AMD's. AMD has open- and closed-source versions of their driver and is moving more towards the open version (which is already very good). Both companies have closed-source initial boot code and AMT-like tech, with potential backdoors, built into their CPUs.
AMD has been doing a lot of hard work to get their stuff into mainline (AMDGPU as a recent example), and they open-source a lot of their GPU stuff on GitHub.
On the CPU side, AMD has patched Linux way before Ryzen was available in shops and has been contributing various patches afterwards.
I'd say they are working to get a decent track record for their Ryzen and Vega lineups.
AMD did a great job with Threadripper, making high end CPUs much more affordable. It's interesting that Intel doesn't lower their prices. What's the logic behind it?
They spent such a long time making the "best" that now they get to ride that goodwill for a while with consumers, regardless of where they presently stand relative to the competition. Toyota and Honda enjoy the same luxury: they outsell the competition today more because of what they did in the 90s than what they did in 2016-17.
I'm of the mind that AMD's smaller cores working together is the secret sauce to their price advantage.
Intel has done an amazing job stuffing 28 cores into one piece of silicon and extracting as much performance as possible, all for the low price of $10k.
AMD took their 8 core part that they are selling essentially up and down their product line... and slapped 4 of them together.
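The price advantage follows from defect economics: die yield falls roughly exponentially with area, so four small dies cost far less silicon than one big one. A sketch using the classic Poisson yield model (the defect density and die areas below are illustrative assumptions, not foundry data):

```python
import math

def die_yield(area_mm2, defects_per_mm2):
    """Poisson yield model: fraction of manufactured dies with zero defects."""
    return math.exp(-area_mm2 * defects_per_mm2)

D0 = 0.002              # assumed defect density (defects per mm^2)
SMALL, BIG = 213, 700   # assumed areas: one Zen "Zeppelin" die vs a monolithic 28-core die

# Dies are tested individually, so silicon cost per *good* die is area / yield.
cost_small = 4 * SMALL / die_yield(SMALL, D0)  # four small dies per package
cost_big = BIG / die_yield(BIG, D0)            # one big die per package

print(f"wafer area per 4-die package:  {cost_small:.0f} mm^2")
print(f"wafer area per monolithic die: {cost_big:.0f} mm^2")
```

Under these assumptions the monolithic die burns roughly twice the wafer area per sellable part, and the gap widens as defect density rises -- plus the same small die amortizes its design cost across Ryzen, Threadripper, and EPYC.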
Intel was selling 18-core chips for >$2,400 then when Threadripper came out Intel released the 18-core i9 for "only" $2,000, so that is something of a price drop.
Also, Intel's $350 8700K was cheaper at launch than AMD's $450 1800X even though the 8700K is faster in gaming.
For a long time I saved a copy of a publication by Motorola about how Intel played fast and loose with benchmarks in comparisons of the 80386 with the 68020. (I lost it in a move, alas.) Can't say I was surprised to read about the 28-core fiasco.
This is a game all CPU manufacturers play. I still remember the days of Apple claiming their G4 processors were faster than Intel ones. Then they swapped platforms, and all of those claims evaporated without a trace.
This is a short-term loss for Intel, but it could end up being a long-term win as an attack on AMD. Making this announcement forced AMD to advance their plans for the 32-core part, possibly faster than they really wanted to right now. That depletes their product pipeline faster, making it more difficult to keep pace with future advances.
Edit: initial reports said that AMD was only planning to announce the 24-core CPU, and may have advanced the announcement of the 32-core chip due to Intel's stunt. TFA doesn't mention that, so possibly the initial reports were not accurate.
They maxed out the number of cores they can ship in a single CPU for now but that doesn't seem like a problem.
AMD is already set to launch its 7nm EPYC processor, based on Zen 2, in 2019 (skipping the Zen+ used by the new Threadripper and Ryzen 2xxx); it's expected to have 48 cores (some rumors even suggest 64, but that seems more likely for 7nm EUV than for the first 7nm processes). So they will have no problem offering more cores with Threadripper 3 next year (if they keep up the yearly releases).
On top of that, to my layman eyes, AMD's approach of using Infinity Fabric to connect dies seems better suited to reacting to change than Intel's monolithic design.
I think of AMD's current approach - a microarchitecture with slower cores, but more cores, than Intel - as very similar to what Sun/Oracle tried to do from 2005 to 2010 with the Niagara family (UltraSPARC T1-T3).
Each core in those chips was seriously underclocked compared to a Xeon of similar vintage and price point (1-1.67 GHz, versus 1.6 to 3+ GHz), and lacked features like out-of-order execution and big caches that are near-minimum requirements for a modern server CPU. Sun hoped to make up for the slow cores in server applications by having more cores and multiple threads per core (though with a simpler technology than SMT/hyper-threading).
However, Oracle eventually decided to focus on single-threaded performance with its more recent chips - it turns out that no OoO and < 2 GHz nominal speeds look pretty bad for many server applications. My suspicion is that even though the CPU-bound parts of games are becoming more multi-threaded, AMD will be forced to fix its slower architecture or lose out to Intel again in the server AND high-end desktop markets in a few years.
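The Niagara tradeoff can be put in Amdahl's-law terms: many slow cores only win when the serial fraction of the workload is tiny. A sketch (the core counts and the 0.5x per-core speed penalty are illustrative assumptions):

```python
def speedup(serial_fraction, n_cores, per_core_speed=1.0):
    """Amdahl's law, scaled by relative per-core speed (1.0 = baseline core)."""
    parallel = 1 - serial_fraction
    return per_core_speed / (serial_fraction + parallel / n_cores)

# Assumed: 32 slow cores at 0.5x single-thread speed vs 8 fast cores at 1.0x.
for s in (0.01, 0.10, 0.30):
    slow = speedup(s, 32, 0.5)
    fast = speedup(s, 8, 1.0)
    print(f"serial={s:.0%}: 32 slow cores {slow:.1f}x vs 8 fast cores {fast:.1f}x")
```

At a 1% serial fraction the slow many-core part wins handily; at 30% the fewer fast cores come out ahead -- which is roughly the lesson Oracle learned, and the risk AMD faces if its per-core performance lags too far behind.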
Edit: Plus, the TR4 socket is guaranteed to be supported for 4 years, per AMD's roadmap at https://community.amd.com/thread/226363
Regardless, damn good on AMD.
To me that's a win. If a CPU pin is bent, it's typically fairly easy to straighten it; fixing a bent pin in a socket is a massive pain.
But it's much easier to protect socket pins with the cover. So there are pros and cons either way.
[1] https://en.wikipedia.org/wiki/POWER8
[2] https://www.ibm.com/support/knowledgecenter/en/ssw_i5_54/rza... <- quite a few years old, but it describes the available query engines; CQE is 'legacy' and SQE is modern.
AMD’s interconnect seems fast enough, and they don’t have the yield/cost problems from massive single die chips.