The cuDF interop in the roadmap [1] will be huge for my workloads. XGBoost has the fastest inference time on GPUs, so a fast path straight from these Vortex files to GPU memory seems promising.
Can you explain how it’s faster? GPU memory is just a blob with an address. Is it because the loading algorithms for Vortex align better with XGBoost, or is it just the plain upload to the GPU?
If you have a GPU-friendly format, you can send the compressed data over PCIe and then decompress it on the GPU. Your overall throughput increases because PCIe bandwidth is the limiting factor of the overall system.
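To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes a simple model where transfer and decompression are pipelined, so the delivered rate of uncompressed data is capped either by the link (scaled by the compression ratio) or by how fast the GPU can decompress. The function name and all figures are illustrative assumptions, not measurements of Vortex, cuDF, or any particular hardware.

```python
def effective_throughput(link_gb_s: float,
                         compression_ratio: float,
                         gpu_decompress_gb_s: float) -> float:
    """Estimate uncompressed GB/s arriving in GPU memory.

    link_gb_s:           host-to-device link bandwidth (assumed value)
    compression_ratio:   uncompressed size / compressed size (1.0 = no compression)
    gpu_decompress_gb_s: rate at which the GPU emits decompressed bytes (assumed value)
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    # Each compressed byte crossing the link expands to `compression_ratio` bytes.
    link_limited = link_gb_s * compression_ratio
    # The slower stage of the pipeline sets the overall rate.
    return min(link_limited, gpu_decompress_gb_s)


if __name__ == "__main__":
    # Hypothetical numbers: 25 GB/s link, 200 GB/s on-GPU decompression.
    for ratio in (1.0, 3.0, 10.0):
        rate = effective_throughput(25.0, ratio, 200.0)
        print(f"ratio {ratio:>4}: {rate:.0f} GB/s effective")
```

Under these assumed numbers, uncompressed data is stuck at the link rate, a 3x ratio triples the effective rate, and at 10x the GPU decompressor becomes the bottleneck instead of the link.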
reactordev|3 months ago
robert3005|3 months ago
kipukun|3 months ago