top | item 10870399


ajbonkoski | 10 years ago

"we have a particular use-case where we have a large contiguous array with, say, 100,000 lines and 1000 columns"

This is where they lost me. This is NOT a lot of data. Should we be surprised that memory-mapping works well here?
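A rough sanity check on the sizes involved, assuming float64 elements (8 bytes each; the thread never states the dtype, so that is an assumption):

```python
# Back-of-the-envelope footprint of a dense rows x cols array, assuming
# 8-byte (float64) elements -- the dtype is an assumption, not stated above.
def array_size_gb(rows, cols, itemsize=8):
    """In-memory size of a dense rows x cols array, in GB."""
    return rows * cols * itemsize / 1e9

small = array_size_gb(100_000, 1_000)        # the example from the article
large = array_size_gb(100_000_000, 10_000)   # the case mentioned downthread

print(f"100,000 x 1,000      -> {small:g} GB")   # 0.8 GB: fits in RAM easily
print(f"100,000,000 x 10,000 -> {large:g} GB")   # 8,000 GB = 8 TB: it does not
```

At 0.8 GB the example array fits comfortably in the RAM of any modern machine, which is the point being made here.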

Below about 100-200 GB you can do everything in memory. You simply don't need fancy file-systems. These systems are for actual big data sets where you have several terabytes to several petabytes.

Don't try to use a chainsaw to cut a piece of paper and then complain that scissors work better. Of course they do...

rossant | 10 years ago

Unfortunately our users can't afford fancy computers with hundreds of GB of RAM. They often need to process entire datasets on laptops with 16 GB of RAM but 1 TB+ of disk space. Of course, with 200 GB of RAM we would have no problem at all...

Also, as I said, the 100,000 x 1000 example is quite an optimistic one; we already have cases with 100,000,000 x 10,000 arrays, and this is only going to increase in the months to come with the new generation of devices.
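This is the situation where memory-mapping earns its keep: the array lives on disk and only one block of rows is resident at a time. A minimal sketch of that pattern with NumPy's `np.memmap` (file name, shape, and dtype here are illustrative assumptions, scaled down so it runs anywhere):

```python
# Sketch of chunked processing over a file-backed array: peak RAM usage is
# one chunk of rows, not the whole array. Shape/dtype/path are illustrative.
import numpy as np
import os
import tempfile

rows, cols = 10_000, 100  # tiny stand-in for the 100,000 x 1,000 case
path = os.path.join(tempfile.mkdtemp(), "data.dat")

# Create a file-backed array (in real use, the file already exists on disk).
mm = np.memmap(path, dtype=np.float64, mode="w+", shape=(rows, cols))
mm[:] = 1.0
mm.flush()
del mm

# Re-open read-only and reduce it chunk by chunk; the OS pages rows in and
# out as needed, so this works even when the array exceeds physical RAM.
mm = np.memmap(path, dtype=np.float64, mode="r", shape=(rows, cols))
chunk = 1_000
total = 0.0
for start in range(0, rows, chunk):
    total += mm[start:start + chunk].sum()

print(total)  # 10,000 x 100 ones -> 1000000.0
```

The same loop works unchanged whether the file is 1 GB or 8 TB; only the time to stream it from disk changes.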