Price per GB of DRAM hasn't actually fallen much over the past 10 years.[1] LPDDR is still over $3/GB. UDIMM is still ~$3/GB, about the same as in 2010/2011. In other words, despite what you may have heard about the DRAM price collapse in 2019, the price floor of DRAM has stayed roughly the same over the past decade.
Every other kind of silicon has gotten cheaper: NAND, logic ICs, just not DRAM. And yet our need for DRAM keeps growing, from in-memory datastores on servers to mobile phones with cameras shooting rapid bursts of 4K images.
NAND, and foundries like TSMC, have clear roadmaps for where cost is heading and what cost reductions we can expect over the next 5 years. There is nothing of the sort for DRAM. At least, I don't see anything to suggest we could reach $2/GB DRAM, let alone lower. I don't see how EUV is going to help either; there won't even be enough EUV TwinScan machines to go around for the foundries in the next 3 years, let alone for NAND and DRAM.
The only good news is that low/normal-capacity ECC DRAM has finally fallen to ~$5/GB (it used to be $10-20/GB).
I would not be surprised at all if it came out that Samsung, Micron, and the other major players are price-fixing, just as they have been caught doing multiple times in the past. They seem to treat the fines as a cost of doing business and then continue to operate like a cartel. This looks like the same situation, and I don't doubt it is directly responsible for inflated DRAM pricing.
One thing I'd like to understand better about DDR5 is how well the built-in ECC will work to improve reliability. DDR5 comes with "chip-level ECC" [1], whose main purpose is to make it easier to sell highly complex memory chips that have minor defects.
But as a consequence, as I understand it, it will allow the correction of single-bit flips. With regular DDR4 or previous generations, you don't get any error correction: any bit error in your DDR4 modules has the potential to corrupt data. If you want protection from that, you need ECC memory.
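To make "correction of single-bit flips" concrete, here is a toy Hamming(7,4) code in Python. This is only an illustration of the principle; DDR5's actual on-die ECC uses a wider, vendor-specific codeword, not this exact scheme.

```python
# Toy Hamming(7,4) code: 4 data bits protected by 3 parity bits,
# allowing any single flipped bit in the 7-bit codeword to be
# located and corrected. Illustrative only; real on-die ECC uses
# a much wider codeword.

def encode(d):  # d: list of 4 data bits
    # Codeword positions 1..7; parity bits sit at positions 1, 2, 4
    # and each covers the positions whose index has that bit set.
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p4 = d[1] ^ d[2] ^ d[3]
    return [p1, p2, d[0], p4, d[1], d[2], d[3]]

def correct(c):  # c: 7-bit codeword, possibly with one flipped bit
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 0 means no error detected
    if syndrome:
        c[syndrome - 1] ^= 1         # flip the offending bit back
    return [c[2], c[4], c[5], c[6]]  # extract the 4 data bits

word = [1, 0, 1, 1]
code = encode(word)
code[5] ^= 1                  # simulate a single bit flip in memory
assert correct(code) == word  # the flip is transparently repaired
```

The key property is that the three parity checks together pinpoint which bit flipped, so a single-bit error is repaired without the consumer of the data ever seeing it; two simultaneous flips in one codeword, however, defeat this code.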
Unfortunately, any hardware with "ECC" on the label gets an "enterprise" sticker, and that means a certain price level and a certain power consumption. (Yes, I know you can get Ryzen boxes that work with ECC, but that's still PC-sized hardware costing hundreds of dollars.)
If DDR5 can bring error correction to the masses, like in single-board computers, 10W NAS boxes, and smartphones, that would be pretty cool. But I'm not sure whether my reading of this is correct.
I expect Intel to still cripple some aspect of DDR5 ECC on consumer chips; maybe it will correct errors but the memory controller won't report them. Or maybe it will be possible to disable ECC even though it's already implemented.
I also expect servers to use two levels of ECC to provide chipkill and also to keep server RAM more expensive than consumer.
One of the great ironies of modern computers is that we dispensed with ECC just as we started ballooning RAM sizes and shrinking transistors, making single-bit errors more likely. I'd be very grateful for system-level ECC.
I wouldn't be surprised if, at some level, physics has forced the manufacturers' hand: error rates that were previously acceptably low become unacceptable when you multiply them by 64GB.
I actually tried really hard to get RAM to corrupt a bit for a school project and didn't manage a single bit flip.
How often have you actually heard of data corruption due to non-ECC memory? Either yourself, any degree of "friend of a friend", or perhaps a study that looked into the matter with more success than I had. I don't mean a newspaper story, because exceptional cases get reported precisely because they're rare exceptions, not because they're common enough that we'd be likely to come across them in our lifetimes.
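One study that did look into this is Google's 2009 DRAM field study, which reported on the order of 25,000-75,000 FIT per Mbit (1 FIT = one failure per 10^9 hours); note those field rates were dominated by a minority of bad modules, which may be why a healthy desktop can run for years without an observable flip. A back-of-envelope using that range:

```python
# Back-of-envelope: expected correctable errors for 16 GB of DRAM,
# using the 25,000-75,000 FIT/Mbit range from Google's 2009 field
# study (1 FIT = 1 failure per 1e9 device-hours). These field rates
# were dominated by a minority of bad modules, so treat them as an
# upper band, not what a typical healthy machine experiences.
GB = 16
mbits = GB * 8 * 1024                 # 16 GB = 131072 Mbit
for fit_per_mbit in (25_000, 75_000):
    errors_per_hour = fit_per_mbit * mbits / 1e9
    print(f"{fit_per_mbit} FIT/Mbit -> "
          f"{errors_per_hour:.1f} errors/hour over {GB} GB")
```

That works out to several errors per hour at the study's rates, which is wildly more than anecdotal desktop experience suggests; the gap itself supports the point that the field average is skewed by a few bad modules rather than a uniform background rate.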
From what I've read, this isn't enterprise-style ECC; it's more akin to the ECC bits used in flash memory. It will allow manufacturers to play fast and loose with memory quality.
"the shift from dedicated DDR4 memory controllers to Serdes-based, high speed differential signaling mixed with buffer chips on memory modules that can be taught to speak DDR4, DDR5, GDDR6, 3D XPoint, or whatever, is an important shift in system design and one that we think, ultimately, the entire industry will get behind eventually."
I think this makes sense for IBM, but probably not for AMD and Intel. AMD and Intel put out new chips every year (even if they're just Skylake refreshes), so it's not too big a deal to change the memory interface, and they are capable of putting in two flavors of DDR support when some flexibility is needed. IBM doesn't release incremental chip designs each year, so it makes more sense for them to accept the tradeoff of a flexible interface and get onto newer RAM faster.
It adds a latency hop, which can generally be dealt with by prefetching and larger caches on the CPU side.
There's speculation that AMD is going to do the same: in Zen 2 and later designs, the CPU chiplets are paired with different IO dies depending on the product line (Ryzen, Threadripper, Epyc), and swapping out the IO die for one that supports new or different memory types would be less work than taping out a whole new monolithic CPU.
I like to grab RAM towards the end of a generation; so far that has been the best bang for my buck (for my personal computer). From the production estimates, it looks like I'll jump to DDR5 at the end of 2023, or more likely 2024.
I have a 4690K with 32GB of EVGA DDR3 RAM that I suspect was made by Hynix or the other manufacturer of fast RAM. It runs at speeds on par with the slowest DDR4 but with DDR3 latency, which is great.
I wonder if DDR5 will be fast enough to compensate for its higher latency (or maybe they improved latency this time?)
To be honest, I haven't yet felt any need to move off my current machine. I would only upgrade its GPU, but I can't, because I can't afford a new GPU AND a new monitor (I use a CRT monitor with a VGA cable; it's a very good monitor, so there's no reason to replace it, but newer GPUs don't support it).
I was planning on the same thing, until I estimated how much faster a modern PC would be at single-threaded compilation tasks compared to my old one, and how that impacted my productivity. I should have upgraded earlier.
I try to refresh when a new RAM generation launches with a new socket. That gives me a good shot at long-term upgradability. Waiting on Zen 4 now for my next complete build.
To my hardware colleagues on HN: what prevents something similar to Dennard scaling for DRAM?
My very naive textbook knowledge is that every DRAM bit uses a single transistor and a capacitor, whereas an SRAM cell uses six transistors.
How is it, then, that with all the scaling so far, traditional SRAM hasn't caught up with DRAM capacities? A single DRAM chip is huge compared to the total die size of any microprocessor.
As the sibling comment asks about cheaper DRAM, I'm trying to understand why SRAM hasn't caught up from a price/GB perspective.
I don't know why you would expect a 6T SRAM cell to ever be smaller than a 1T DRAM cell given that both of them are scaling. Also, DRAM die sizes appear to be 40-80 sq. mm which is smaller than processors. https://www.semiconductor-digest.com/2019/09/13/dram-nand-an...
First, DRAM and SRAM are more than just the storage transistors; there are also the lines going into each transistor carrying the signals, and all the control circuitry around them. When you write, you aren't just involving the 6 transistors that store the bit, but a whole host of control transistors.
Next up, changes in current on a wire induce current on surrounding lines. This induced current results in what's known as "crosstalk". There are a bunch of methods to combat it; the primary one is to make sure there is enough space between lines. This means that while your transistors may get smaller and smaller, you still have a limit on how close together you can place them, otherwise you risk unwanted bit flips. DRAM has a major advantage here simply because it requires fewer lines to control state, which allows a denser packing of memory.
With those two points in mind, there's simply no way for SRAM to ever have the same price/GB or density as DRAM (without the market screwing with prices).
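To put rough numbers on that density gap, cell sizes are usually quoted in units of F^2 (the square of the minimum feature size). The ~6 F^2 figure for a 1T1C DRAM cell is standard; the SRAM figure below is a ballpark assumption that varies by node and design:

```python
# Rough density comparison in units of F^2 (F = minimum feature size).
# A modern 1T1C DRAM cell is ~6 F^2; a 6T SRAM cell is on the order
# of 120-150 F^2 once the word/bit lines and isolation around the six
# transistors are counted. Treat the SRAM number as a ballpark.
dram_cell_f2 = 6
sram_cell_f2 = 130
ratio = sram_cell_f2 / dram_cell_f2
print(f"SRAM cell is ~{ratio:.0f}x the area of a DRAM cell")
# So before yield and process-specialization effects even enter,
# SRAM starts roughly 20x behind DRAM in bits per unit of silicon.
```

A ~20x area penalty per bit, compounded by DRAM fabs optimizing the whole process for that one cell, is why SRAM never closes the price/GB gap even though both keep shrinking.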
> How is it then that with all the scaling so far that traditional SRAMs haven't caught up with DRAM capacities?
Leakage.
As FETs get smaller they leak more. CMOS Logic has dealt with this by having "dark silicon" -- yes, you get twice as many transistors as the last generation, but you can't use as many of them at the same time. You have to keep some of them turned off. But turning off SRAM means lost data, so "dark SRAM" is useless -- unlike, say a "dark vector unit" or "dark floating point unit".
DRAMs can optimize the entire process for just one job -- the 1T1C access gate -- to keep leakage at bay. Or if all else fails, just refresh more often, which hurts standby power but isn't a meaningful contributor to active-mode power.
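"Refresh more often" has a concrete budget behind it. A sketch using the standard DDR4 numbers (64 ms retention window, 8192 refresh commands per window):

```python
# DDR4 refresh budget: every row must be refreshed within the 64 ms
# retention window, spread across 8192 REFRESH commands, so the
# controller issues one refresh command every tREFI:
tREF_ms = 64
commands = 8192
tREFI_us = tREF_ms * 1000 / commands
print(f"tREFI = {tREFI_us:.2f} us")   # one refresh command per ~7.8 us
# Halving the retention window (refreshing twice as often, as hotter
# or leakier parts must) halves tREFI and roughly doubles refresh
# power, which is why it hurts standby power in particular.
```

Since refresh runs whether or not the system is doing work, its cost dominates in standby; during active use it is a small fraction of total DRAM power, which matches the comment above.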
SRAM is catching up, but it is still much less dense, and is normally configured for smaller line sizes and lower latencies than DRAM. DRAM requires sense amplifiers and capacitors, which have both scaled more slowly than transistors.
From a systems perspective, a lot of work has gone into hiding DRAM's faults and highlighting its strong points, so a system where DRAM is replaced with SRAM would be more expensive yet not realize most of the possible benefits without major redesigns of the memory system.
Intel has some Xeons with over 70MB of L3, and also released some eDRAM chips to play around with this idea; but notice that they used eDRAM, not SRAM, to get 128MB of L4 on a consumer chip. SRAM is still very expensive!
Will we get performance increases, and how big will they be in the average case (not just for specific code with low cache-hit ratios on large datasets), attributable solely to the bandwidth increase rather than to architectural IPC improvements?
"For bandwidth, other memory manufacturers have quoted that for the theoretical 38.4 GB/s that each module of DDR5-4800 can bring, they are already seeing effective numbers in the 32 GB/s range. This is above the effective 20-25 GB/s per channel that we are seeing on DDR4-3200 today."
That looks like a 20%+ improvement IF you are bottlenecked on DDR4 bandwidth.
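The 38.4 GB/s quoted above is just the transfer rate times the bus width; the same arithmetic for DDR4-3200 shows where the theoretical headroom comes from:

```python
# Theoretical per-module bandwidth = transfers/s * bus width. A DIMM
# has a 64-bit (8-byte) data bus; DDR5 splits it into two 32-bit
# subchannels, but the total width per module is unchanged.
def peak_gb_per_s(mt_per_s):
    return mt_per_s * 1e6 * 8 / 1e9

print(f"DDR4-3200: {peak_gb_per_s(3200):.1f} GB/s theoretical")  # 25.6
print(f"DDR5-4800: {peak_gb_per_s(4800):.1f} GB/s theoretical")  # 38.4
```

Note that the article's "effective" figures (20-25 GB/s for DDR4 vs ~32 GB/s for DDR5) imply DDR5 is also sustaining a higher fraction of its theoretical peak, so the real-world gap is larger than the 4800/3200 ratio alone.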
It depends a lot on your CPU's architecture as well as the workload: how far ahead it prefetches versus how often it stalls on large memory reads. So it's hard to know. For example, Zen+ (Ryzen 2000) saw 10% gains in some gaming workloads going from DDR4-2400 to DDR4-3600, but the effect is much less drastic on Intel CPUs or even Zen 2 (Ryzen 3000), because the smarter memory controller makes slower RAM less of a detriment. And if you go above 3600 MT/s (or 3800 MT/s overclocked) on Zen 2, you get negative returns for a while, because the CPU's memory controller can no longer run at the same clock as the memory, which adds overhead. But maybe DDR5-4800, if it can run stably, gets far enough past that penalty that the improvement turns positive again. Or maybe Zen 4, or whatever Intel's DDR5 generation ends up being, handles memory entirely differently and the gains are massive, or negligible.
The short of it is it's very hard to make predictions here.
It uses CPU performance counters to show things like ITLB_Misses or MEM_Bandwidth. It won't show when you're waiting for GPU/SSD/etc because those aren't visible from CPU performance counters. I'm not aware of a single tool that will do everything, unfortunately.
Also, this isn't a "benchmarking suite"; it's a tool you can use to instrument whatever load you're running, which I'd say is better. It's often used to improve software but could also identify if faster RAM will help.
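If you just want a crude sanity check rather than performance counters, a naive copy loop gives an order-of-magnitude read+write bandwidth figure. This measures one core streaming through the cache hierarchy, not the DIMM's limit, and interpreter overhead makes it a floor rather than a ceiling; it is a sketch, not a real benchmark like STREAM:

```python
import time

# Crude single-core memory bandwidth probe: copy a buffer much larger
# than any last-level cache and time it. Counts ~2 bytes moved (one
# read + one write) per byte of buffer. The slice-assignment copy runs
# in C inside CPython, so Python overhead stays mostly out of the loop.
N = 64 * 1024 * 1024           # 64 MiB, beyond typical LLC sizes
src = bytearray(N)
dst = bytearray(N)

t0 = time.perf_counter()
dst[:] = src                   # bulk copy
elapsed = time.perf_counter() - t0

gb_moved = 2 * N / 1e9
print(f"~{gb_moved / elapsed:.1f} GB/s (read+write, single core)")
```

If a number like this lands well below your platform's theoretical channel bandwidth even on a quiet machine, your workloads are unlikely to be the ones that feel a DDR4-to-DDR5 jump.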
Benchmarking of what? You need a benchmark specific to the task. If it's gaming, there are various benchmarks: run them, watch utilization, and see which component sits at 100%; that's your bottleneck.
If it's computation, it's more complicated to discover the bottleneck (your problem may be cache misses, memory bandwidth, architecture that doesn't go well with the algorithm).
AMD will present its next-generation Ryzen CPUs, based on Zen 3, the day after tomorrow (October 8, 2020) [0][1]; maybe we'll get more info about DDR5 compatibility then.
Although this question is more academic in nature: how "difficult" is memory training/initialization compared to DDR4? I recall that a microcontroller has to actively calibrate the DRAM at startup for DDR4.
I haven't been able to find any specs on latency, and whether it has improved or not. I assume it hasn't, because it doesn't tend to, but does anyone know for sure?
If you thought HNers were insufferable whiners regarding RAM or NAND soldered to motherboards, just wait until they start marketing monolithic CPU+memory chips.
Hooray for FINALLY putting a local DC/DC converter ON THE DIMM, so the motherboard can feed it high-voltage/low-current power instead of low-voltage/high-current. The latter has become increasingly impractical (and noisy!).
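The motivation is simple Ohm's-law arithmetic: for a fixed power draw, raising the distribution voltage cuts the current, and conduction loss falls with the square of the current. DDR5 DIMMs take a 12 V input and regulate down to 1.1 V on the module; the 11 W draw and 5 milliohm path resistance below are illustrative assumptions, not spec values:

```python
# Conduction loss in the power delivery path scales as I^2 * R.
# Compare feeding a DIMM drawing 11 W at 1.1 V over the motherboard
# traces versus feeding it 12 V and converting to 1.1 V on-DIMM.
# R_path = 5 milliohm is an assumed trace + connector resistance.
P = 11.0          # watts delivered to the DRAM (illustrative)
R_path = 0.005    # ohms (illustrative)

for v in (1.1, 12.0):
    i = P / v
    loss = i * i * R_path
    print(f"{v:>4} V: {i:5.2f} A, {loss * 1000:6.1f} mW lost in path")
# At 1.1 V: 10 A and ~500 mW lost in the path; at 12 V: under 1 A
# and ~4 mW. Same power delivered, ~100x less conduction loss.
```

The ~100x loss reduction (and proportionally smaller voltage droop and noise on the traces) is exactly why moving the last conversion stage onto the DIMM became attractive as per-module currents climbed.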
Often the initial consumers would be enterprises instead of casual users. There are numerous enterprise use cases where higher bandwidth and lower latency would be worth the cost. Some that come to mind are in financial services and ML inferencing. I can imagine high-mem compute instances of cloud service providers being an obvious place for these.
Also, going from 2 seconds to 1 second is pretty huge if you're doing some operation hundreds or thousands of times a day.
A casual user like you ends up using quite complex compute in the cloud when you watch a video on YouTube, scroll a newsfeed, or make an airline booking. Some of the advanced features behind those tasks, when programmed well by good engineers, only become feasible with hardware like this.
All the Machine Learning related buzzwords exist mainly because they have only now become computationally feasible. You never know what will come next.
[1] https://secureservercdn.net/166.62.107.55/ff6.d53.myftpuploa...
It has been 10 years since the last company not known to be party to the DRAM cartels left the market.
https://en.wikipedia.org/wiki/DRAM_price_fixing
[1]: https://www.anandtech.com/comments/15912/ddr5-specification-...
https://www.nextplatform.com/2020/09/03/the-memory-area-netw...
E.g., something as simple as CPU utilization, GPU utilization, or RAM bandwidth utilization?
Point is, you need to specify the task.
[0]: https://twitter.com/AMDRyzen/status/1312080706739339266?
[1]: https://www.amd.com/en/events/gaming-2020?sf238352749=1&sf23...
https://www.anandtech.com/show/15877/intel-hybrid-cpu-lakefi...
(putting aside the fact that faster RAM does not generally result in faster load times)