
How Do You Program a Computer with 10 Terabytes of RAM?

27 points | hepha1979 | 10 years ago | highscalability.com

13 comments

dekhn | 10 years ago
A 2TB RAM machine with lots of high-end Xeon chips is only about $120K, probably less if you are a good negotiator.

I've programmed large RAM machines and it's not that hard. In general, it simply let me run programs I couldn't run previously because they allocated too much memory and crashed, or swapped/paged too much.

Having a flat memory hierarchy (all RAM has the same access cost) makes it dirt simple. NUMA made it significantly harder because you typically had to structure your program's data and threads to schedule on the appropriate cores or processors.

However, I've found over time that unless you absolutely need to hold all your data in RAM, spending the additional money on the largest DIMMs and a motherboard with tons of DIMM slots isn't really cost effective. So long as you can partition your problem, that is the preferred solution in nearly all cases. However, increased programmer productivity can often be more cost effective ("just buy more RAM"), and I know people who have been upgrading their computers for years and running the exact same algorithms on their in-memory data sets at very high speeds.
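
The partition-your-problem approach above can be sketched as chunked processing: instead of materializing the whole dataset in RAM, walk it in fixed-size pieces. This is a minimal illustration (the class, method names, and the trivial "sum the indices" workload are mine, not dekhn's):

```java
import java.util.stream.LongStream;

public class ChunkedSum {
    // Hypothetical example: aggregate over "records" 0..n-1 without holding
    // them all in memory, by partitioning the index space into chunks.
    static long sumInChunks(long n, long chunkSize) {
        long total = 0;
        for (long start = 0; start < n; start += chunkSize) {
            long end = Math.min(start + chunkSize, n);
            // Each chunk is produced, processed, and discarded before the next,
            // so peak memory is bounded by chunkSize, not n.
            total += LongStream.range(start, end).sum();
        }
        return total;
    }

    public static void main(String[] args) {
        // Same answer as summing everything at once, with O(chunkSize) memory.
        System.out.println(sumInChunks(1_000_000, 4096));
    }
}
```

The same shape works for any associative aggregation; it's only when the computation genuinely needs random access to the whole dataset that "just buy more RAM" wins.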

Depending on your code, some complex problems can occur. For example, the TLB, which speeds up virtual-to-physical address translations, has its own TLB, and if you're screaming all over memory you can blow out the TLB for the TLB. My experience has been that TLB issues go up with larger memory (since you tend to load more data on each node, and access it with sparser patterns).
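
The "screaming all over memory" effect is easy to provoke: walking a big array in order versus in a shuffled order touches the same data but stresses address translation very differently. A rough illustrative sketch (not a rigorous benchmark; the class and method names are mine, and on HotSpot you'd additionally try large pages via `-XX:+UseLargePages`, assuming the OS is configured for them):

```java
import java.util.Random;

public class TlbSketch {
    static long sumSequential(int[] data) {
        long s = 0;
        for (int i = 0; i < data.length; i++) s += data[i];
        return s;
    }

    // Visit every element exactly once, but in a random order — a sparse,
    // translation-hostile access pattern.
    static long sumInOrder(int[] data, int[] order) {
        long s = 0;
        for (int i = 0; i < order.length; i++) s += data[order[i]];
        return s;
    }

    // Fisher-Yates shuffle of the index space, seeded for repeatability.
    static int[] shuffledIndices(int n, long seed) {
        int[] order = new int[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Random rnd = new Random(seed);
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int t = order[i]; order[i] = order[j]; order[j] = t;
        }
        return order;
    }

    public static void main(String[] args) {
        int n = 1 << 22; // 4M ints = 16 MB; scale up to see the gap grow
        int[] data = new int[n];
        for (int i = 0; i < n; i++) data[i] = i;
        int[] order = shuffledIndices(n, 42);

        long t0 = System.nanoTime();
        long seq = sumSequential(data);
        long t1 = System.nanoTime();
        long rnd = sumInOrder(data, order);
        long t2 = System.nanoTime();

        // Both walks touch every element once, so the sums agree; only the
        // TLB/cache behavior of the two access patterns differs.
        System.out.println("sequential ns=" + (t1 - t0) + ", shuffled ns=" + (t2 - t1));
        System.out.println(seq == rnd);
    }
}
```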

I haven't looked at DIMM price curves or roadmaps lately, but I assume that 4TB will be $120K in another 3-5 years, then 8TB will be $120K in 10 years.

With all that said, SSDs greatly reduced the issue of needing lots of memory. I've done jobs that would have required too much physical RAM for my budget by configuring Linux swap on a fast SSD. It swapped a lot, but the jobs ran :-)

cmdrfred | 10 years ago
For a programmer who works with far more pedestrian hardware: what do you do with 2TB of RAM that you couldn't implement with, let's say, a rainbow table on a machine with lesser specs? If you are allowed to say, that is.
chillacy | 10 years ago
One interesting aspect is that we've built our computer architecture on the assumption that storage is tiered from expensive to cheap: L1-L3 CPU cache, RAM, SSD, HDD, maybe tape.

If we just went L1-L3 + RAM, it would greatly simplify the job of programmers and programs: (theoretically) no need for virtual memory, buffering/flushing, etc.

marssaxman | 10 years ago
You start by congratulating yourself for refusing to buy into the Java hype, then continue on with business as usual in C++ or some other non-garbage-collected language.
vardump | 10 years ago
After that you discover it's a NUMA machine, you have a lot of inter-CPU communication, and you curse that you have no GC. Why? Because you bottleneck inter-CPU communication so easily with object synchronization. With GC, quite a bit of it can be avoided.

If Java just didn't need to allocate each damn small value object (think struct) separately, I think it'd be a lot better with large heaps.

An array of 1000 non-elementary values could be just one allocation, not 1001. That'd also be 1000 times faster to garbage collect.
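
The workaround people use today is to flatten the "struct" fields into primitive arrays by hand. A sketch of the contrast (the `Point` type and packing layout here are illustrative, not anything from the thread):

```java
public class PackedPoints {
    // The 1001-allocation way: one array object plus 1000 Point objects,
    // each a separate heap allocation (and pointer) the GC must trace.
    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    static Point[] makeBoxed(int n) {
        Point[] pts = new Point[n];
        for (int i = 0; i < n; i++) pts[i] = new Point(i, -i);
        return pts;
    }

    // The one-allocation way: a single double[] with x,y pairs stored inline —
    // one contiguous block with no embedded references for the GC to chase.
    static double[] makePacked(int n) {
        double[] pts = new double[2 * n];
        for (int i = 0; i < n; i++) { pts[2 * i] = i; pts[2 * i + 1] = -i; }
        return pts;
    }

    static double sumX(double[] packed) {
        double s = 0;
        for (int i = 0; i < packed.length; i += 2) s += packed[i];
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sumX(makePacked(1000)));
    }
}
```

The packed form trades away the object abstraction for one allocation, better cache locality, and near-zero GC tracing cost.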

pron | 10 years ago
Quite the opposite. The only way to use so much RAM is with lots and lots of threads (running on lots and lots of cores), which means either sharding or sharing. Sharding scales really badly because when you do have cross-shard transactions or queries, you need expensive locks. A GC makes general-purpose, high-performance lock-free data structures much, much, much easier.
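
The point about GC easing lock-free code can be illustrated with the classic Treiber stack. Under GC, the naive CAS loop below is safe as written: a popped node can't be recycled while another thread still holds a reference to it, so the ABA hazard that forces hazard pointers or epoch schemes in manually-managed languages largely disappears. (This is my sketch, not pron's example.)

```java
import java.util.concurrent.atomic.AtomicReference;

public class TreiberStack<T> {
    private static final class Node<T> {
        final T value;
        final Node<T> next;
        Node(T value, Node<T> next) { this.value = value; this.next = next; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    public void push(T value) {
        Node<T> n;
        do {
            n = new Node<>(value, head.get());
            // Retry if another thread changed head between the read and the CAS.
        } while (!head.compareAndSet(n.next, n));
    }

    public T pop() {
        Node<T> h;
        do {
            h = head.get();
            if (h == null) return null; // empty stack
        } while (!head.compareAndSet(h, h.next));
        return h.value; // GC keeps h valid even if other threads raced us
    }

    public static void main(String[] args) {
        TreiberStack<Integer> s = new TreiberStack<>();
        s.push(1); s.push(2); s.push(3);
        System.out.println(s.pop()); // 3
        System.out.println(s.pop()); // 2
    }
}
```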
rufusjones | 10 years ago
Use it to serve PHP pages. That oughta do 'er.
mring33621 | 10 years ago
Using Java?
hga | 10 years ago
Better to use Azul's Zing, unless you want worst-case heap compactions to take 3 hours (maybe it's better now; per their Pauseless GC paper, I think they said 1 second per GiB, but it would have to be a LOT better).

Their approach is to engineer for the worst case; then everything else is easy ^_^. More seriously, their collectors run all the time and GC the whole heap, as well as, of course, the nurseries more often.

I'd add that these systems often have NUMA effects in access times even if it's one big address space, i.e. each chip has faster access to the memory directly attached to it than to memory on other chips. The mentioned SGI system uses stock Xeon chips, so I'm pretty sure it's got these sorts of issues.