datanecdote|5 years ago
I am a Python data science user. When data gets big enough that loading time becomes a bottleneck, I use Parquet files instead of CSV, and PyArrow to load them into pandas. It’s a one-line change. The creator of pandas is now leading the Arrow project. It’s very seamless. Don’t know if I’m typical, but that’s me.
ViralBShah|5 years ago
Jacob Quinn (karbacca) also has a Julia package for integrating Julia into the Arrow ecosystem: https://github.com/JuliaData/Arrow.jl
datanecdote|5 years ago