I can't wait to see the programming paradigm of memory vs permanent storage begin to blur in the next 5 or so years. It's going to change some major assumptions about how you program things, quite significantly, and it's really exciting.
HP is already working on this with what they're calling "The Machine". They're writing a new OS around the new storage technologies with the understanding that it might not make sense to separate RAM from disk pretty soon here. They've also suggested that they'll be porting versions of Linux and Android to this platform.
Of course, all of this is pretty marketing heavy and not a lot of details have been released, so take this with a huge grain of salt. HP's CTO wrote a fluff piece on flattening the storage hierarchy here: http://www8.hp.com/hpnext/posts/end-necessary-evil-collapsin...
Memristors (long promised and still not quite here) also bring together the ideas of long-term and short-term memory into one and, in future incarnations, even promise to bring computation and storage together.
Traditional memory continues to get cheaper too. Flash is dimes a gigabyte, disk is cents a gig, and videotape is tenths of a cent per gigabyte. So supercomputers will always be maxing out each level of memory, and you will still need to worry about memory hierarchies.
For ordinary computing, maybe a terabyte of flash will finally be enough.
How is this different from IBM's single-level store model? Memory and disk are addressed as one, with hardware providing the protection needed in case of component or system failure.
Unless you're writing at the machine level, I really do not see the need for most programmers to ever concern themselves with storage; that should be the work of the underlying machine code and/or OS.
IBM went all the way with this in the AS/400 back in the '80s. Single-level store - RAM and disk were unified storage, with no ability for a programmer to tell the difference (RAM was purely an operations problem).
This is sort of the opposite of the RAM-based, battery-backed drives [1]. I can see this being immediately useful for things like large database servers: instead of re-priming caches on reboots you just have them already warmed up. You can also suddenly have a whole lot more "RAM" at the cost of its speed.

[1] http://en.wikipedia.org/wiki/I-RAM
I do have a hard time picturing what this will look like if it were as fast as traditional RAM. If I can store everything in RAM, from the OS binaries to the running processes, it certainly has a kind of elegance to it. Lots of microcontrollers already act this way: all your hardware has an address in a single address space, be it a hardware port, ROM, RAM, NVRAM, etc. However, I wonder how a modern UNIX OS would work with a system like this, without block devices at all. I wonder if it'd actually just use RAM disks, and years from now we'll still be doing this the way we do double emulation of the TTY. That'd be kind of sad, since it would mean we never get away from the concept of old block devices.
On the other hand it's damn convenient to be able to just append to a file and not have to worry about reallocating it. This means that perhaps all we'd need is a filesystem that's designed to work over RAM rather than over block devices.
The file system (and associated baggage) exist because of a fundamental disparity between the speed (and nearness to compute) of smaller memories and the convenience (but relative slowness) of larger ones. While SSDs do certainly close this gap, I find it difficult to imagine that this gap will ever entirely close; if anything, physics guarantees that compute units with small, nearby memories will always be far, far faster than more distant memories. You might argue that SSDs are "fast enough" to dispense with the inconvenience that imposes, but I can certainly find things I'd rather do with the performance.
While block devices can go away, the notion of a file will probably remain.
We run clusters of ephemeral Linux hosts, and while we use disk, we also have plenty of hosts that run diskless. The disk mount point just becomes a tmpfs mount, and the required minimal installation is made on a tmpfs partition - a trivial amount of space for hosting busybox and some conf files in /etc. Linux does appear to understand that tmpfs-backed filestores are already in memory and do not need pages synchronized or dirty pages flushed out. That whole caching/memory layer for the fake block device is just not used at all.
I do wish the iRam had stuck around longer. The 4GB limit makes it practically useless today, but a modern version of the same would be incredibly useful for some workloads.
"However, I wonder how the modern UNIX OS will work with a system like this without block devices at all."
I don't think that's how it works - I think you get a special driver that presents the fast-storage-in-DIMM-slot as a block device. So you still have a block device, it's just faster. Right?
> If I can store everything in RAM, from the OS binaries, to the running processes, it certainly has a kind of elegance to it.
Didn't Palm OS originally work this way? The early devices were too cheap for flash memory for anything more than the base OS, so they just stored everything in battery-backed RAM. If your batteries ran out, your device was wiped and had to be re-synced. Apps were run in-place in RAM. Then they had to hack in a bunch of workarounds when they added flash memory storage.
You can tell SanDisk's performance numbers do not add up, and that they are likely misrepresenting the true performance of their device. (Those red asterisks next to the numbers correspond to a footnote that is conveniently missing from the page...) A "read latency of 150usec" translates to a maximum possible read IOPS rate of 1/150e-6 = 6.67K (with one outstanding I/O). But they quote a "random read IOPS of 140K". That would only be possible if their DDR3-based DIMM could process 21 concurrent read I/O operations. But to the best of my knowledge, DDR3 is limited to 8 banks/DIMM, so there could not possibly be more than 8 concurrent read I/Os at any one time - far from 21.
So SanDisk is likely quoting a worst-case read latency and/or a best-case read IOPS. Customers are left to figure out for themselves which of these numbers is most likely to represent the average performance...

PS: here is a paper with some more details, but one that still fails to explain this discrepancy: http://www.snia.org/sites/default/files/SanDisk%20ULLtraDIMM...
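For anyone who wants to check the arithmetic, the latency/IOPS consistency test is a few lines of Python (figures are SanDisk's quoted specs as discussed above):

```python
# SanDisk's quoted specs, as discussed above.
read_latency_s = 150e-6     # 150 usec read latency
quoted_read_iops = 140_000  # quoted random read IOPS

# With a single outstanding I/O, IOPS is bounded by 1/latency.
max_iops_qd1 = 1 / read_latency_s                         # ~6,667 IOPS

# Concurrency needed to reach the quoted IOPS at the quoted latency
# (Little's law: in-flight I/Os = IOPS * latency).
required_concurrency = quoted_read_iops * read_latency_s  # 21 in-flight I/Os

print(round(max_iops_qd1), round(required_concurrency))   # 6667 21
```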
> But to the best of my knowledge, DDR3 is limited to 8 banks/DIMM, so there could not possibly be more than 8 concurrent read I/O at any one time.
Maybe you can submit more than 8 queued requests to the controller and have them stream back at the full DDR3 data rate. Maybe it's not actually addressable as RAM, it just uses the DDR3 interface as a communication bus.
I appreciate this might not be the right way to look at it, but from a trivia POV, in terms of raw performance, what era of regular memory would be comparable with it? (i.e. "typical desktop memory in 2004", say.)
And a video interview and demo with a SanDisk manager at a computer fair: http://www.youtube.com/watch?v=jarsTLGXx9c (currently only available for OEM; requires special BIOS setup)
I don't quite understand. Does this act like a stick of memory or an SSD? Is it just using the memory controller as a super fast parallel bus? If it does act like normal memory, what happens at reboot?
The answer is there in the FAQ: written data is not lost on reboot, so basically you can mount it as a `ramdisk` partition and use that as the data directory for your database. I have used this approach in the past to speed up a large test suite for a web application, except it required restoring from some sort of backup on reboot - which won't be necessary in this case.
It doesn't look exceptionally parallel -- the throughput figures are similar to those in PCIe SSDs. 880 MB/s doesn't come close to saturating even a PCIe bus (4 GB/s, for v2 8x), let alone DDR3 (13 GB/s per channel).
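The bus comparison is easy to reproduce; a quick sketch using nominal rates (500 MB/s per PCIe 2.0 lane, 8 bytes per transfer for DDR3-1600, protocol overhead ignored):

```python
drive_mb_s = 880            # quoted sequential read throughput

pcie2_x8_mb_s = 8 * 500     # PCIe 2.0 x8: ~500 MB/s per lane -> 4000 MB/s
ddr3_1600_mb_s = 1600 * 8   # 1600 MT/s * 8 bytes/transfer -> 12800 MB/s per channel

print(f"{drive_mb_s / pcie2_x8_mb_s:.0%} of a PCIe 2.0 x8 link")      # 22%
print(f"{drive_mb_s / ddr3_1600_mb_s:.0%} of one DDR3-1600 channel")  # 7%
```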
For all of us who are thinking "it's just like RAM, just persistent!", here's some perspectives about what persistent RAM means for OSes and applications: http://lwn.net/Articles/610174/
This is pretty cool, especially since the data is persistent. You could put an entire data warehouse onto one or a couple of these things, update it once a week, and get amazing response times.
But I thought the main drawback of SSDs was that the individual memory cells eventually degrade and lose the ability to accept new writes. I don't see anything about endurance other than an MTBF of 2.5 million hours and the drive-writes-per-day (DWPD), which I had to look up, and I have no idea if these are good or bad. I feel like these things will be great if you don't write new data a lot, or if you have deep pockets to replace them when you need to.
2.5 Mhours MTBF equates to 285 years. That's a long time.
On any kind of large-scale installation you need redundancy, but if those numbers pan out then you could expect any single SSD to outlast your organisation.

Take backups anyway.
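The conversion above, for the record (keeping in mind that MTBF is a fleet failure-rate statistic, not a promise that any one drive lasts that long):

```python
mtbf_hours = 2.5e6        # quoted MTBF
hours_per_year = 24 * 365

print(f"{mtbf_hours / hours_per_year:.0f} years")  # 285 years
```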
How come read latency is 150 µs, while write latency is 5µs, 30 times shorter? Do they mean the latency to start a write operation? IIRC, flash memory is written block by block, with a pretty significant time to write one block.
The capacities are very odd - you'd expect something in a DIMM format to have a power-of-2 size.
I think the market for this could be much bigger if it behaved like a regular RAM DIMM, only slower and nonvolatile; it somewhat reminds me of old machines that used magnetic core RAM. This could be useful for laptops, like a zero-power suspend-to-(NV)RAM. The only thing that is worrying is the endurance of the flash - especially if it's being treated almost like RAM in this application.
I look forward to the point where 3/4 of business application code goes in the trash because we have persistent memory with the latency of actual RAM (memristors!). Goodbye to all that "copy from RAM to disk/network" code.
It will definitely change the way we code and look at code. That's why I think that nowadays a good interface to your entities, like the Repository pattern, is a must-have.
Is the advantage the low latency? Because the rest of the specifications seem pretty standard for an SSD. Does this require a BIOS patch of some sort?
The RAM slot helps for latency and throughput. SATA 3.0 would restrict this drive's read throughput from 880MB/s to less than 600MB/s, and would increase the write latency (5usec) by at least 1 or 2usec because of the typical CPU->southbridge->PCI Express->SATA link it would go through, whereas here it is CPU->RAM slot. 1-2usec might sound negligible but it's not: it's 20-40% worse latency, so roughly 17-29% fewer write IOPS at queue depth 1.
SATA 3.2 (2000MB/s) would help for throughput but not for latency. It is also still quite uncommon even amongst latest generation servers, whereas this SSD is compatible with many current and older generation servers.
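The latency-to-IOPS sensitivity above is easy to quantify (queue depth 1, write path):

```python
base_us = 5.0  # quoted write latency over the DIMM interface

for extra_us in (1.0, 2.0):  # added by a southbridge/PCIe/SATA hop
    loss = 1 - base_us / (base_us + extra_us)
    print(f"+{extra_us:.0f}usec -> {loss:.0%} fewer write IOPS")
```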
If it behaves like RAM, from the POV of the OS, I'd imagine the advantage to be the large size. There's also the fact that it won't lose its data on power loss.
Of course, if it behaves like a regular SSD, the read speeds still exceed SATA3.0 speed.
I would still say that the real performance bottleneck is ultimately the bandwidth between the CPU and this memory. This suggests that the next stage will be to incorporate heterogeneous processors alongside that memory - upgrading your computer could then be as simple as plugging another combined non-volatile memory/CPU block into a fast interconnect. Rather reminds me of the old S-100 bus where everything just plugged into the same channel (which probably dates me quite well).
I honestly can't imagine having something in my DIMM slots with latency and throughput this bad. 150 microsecond reads? Doing a full POST would take forever. What OS and software use-case does this serve?
You'd put a filesystem on it and mount it as a "ramdisk", with the result that you can cut out expensive RAID controllers and/or PCIe SSDs, and can potentially reduce form factor.
E.g. I can get 1U servers with 32 DDR3 slots and 64 cores, but only space for 3x 3.5" drives, or if you double up (but few chassis will then let you use hot swap caddies), 6x 2.5" drives.
You could easily fit 256GB RAM and still have 16 slots free for RAID arrays over those DIMMs.
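Back-of-envelope for that layout, assuming 16 GB RDIMMs for the RAM and the larger 400 GB ULLtraDIMM capacity (the module is also offered at 200 GB - both module sizes are assumptions about the config, not taken from the spec page):

```python
total_slots = 32

ram_slots, ram_dimm_gb = 16, 16
flash_slots, flash_dimm_gb = total_slots - ram_slots, 400

print(ram_slots * ram_dimm_gb)      # 256 GB of RAM
print(flash_slots * flash_dimm_gb)  # 6400 GB of flash in the remaining slots
```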
Aren't the latest versions of PCI-e SSDs already approaching early version DDR3 transfer rates?
So about 4 unrecoverable errors per sector. Seems legit.
Do people not read their own copy?!
Like an SSD.
> Is it just using the memory controller as a super fast parallel bus?
Afaik, it's this.
I don't think this SSD can be accessed through regular SCSI, IDE, AHCI controllers. http://www.sandisk.com.br/enterprise/ulltradimm-ssd/
Also available in a non-Brazilian page at http://www.sandisk.com/enterprise/ulltradimm-ssd/, of course. :)