(no title)
wesm | 5 years ago
Deserialization by definition requires bytes or bits to be relocated from their position in the wire protocol to other data structures which are used for processing. Arrow does not require any bytes or bits to be relocated. So if a "C array of doubles" is not native to the CPU, then I don't know what is.
throwaway894345|5 years ago
waynesonfire|5 years ago
My understanding is that an apache arrow library provides an API to manipulate the format in a platform agnostic way. But to claim that it eliminates deserialization is false.
dlnovell|5 years ago
ska|5 years ago
But ... (a) this is way less common than it was decades ago (rare use cases we are talking about here ) and (b) it seems to be addressed in a sensible way (i.e. Arrow defaults to little-endian, but you could swap it on a big-endian network). I think it includes utility functions for conversion also.
So the usual case incurs no overhead, and the corner cases are covered. I'm not sure exactly what you are complaining about, unless it's the lack of liberally sprinkling ("no deserialization in most use cases") or whatever around the comments?
johncolanduoni|5 years ago