(no title)
jpf0 | 3 years ago
Rules of thumb on memory usage: Python/Pandas (not memory-mapped): "In Pandas, the rule of thumb is needing 5x-10x the memory for the size of your data.” R (not memory-mapped): "A rough rule of thumb is that your RAM should be three times the size of your data set.” Jd: "In general, performance will be good if available ram is more than 2 times the space required by the cols typically used in a query.”
Re CSV reading, Jd has a fast CSV reader whereas J itself does not. I have written an Arrow integration to enable J to get to that fast CSV reader and read Parquet.
No comments yet.