top | item 45738562

(no title)

lsb | 4 months ago

Curious about comparisons with Apache Arrow, which uses flatbuffers to avoid memory copying during deserialization, which is well supported by the Pandas ecosystem, and which allows users to serialize arrays as lists of numbers that have hardware support from a GPU (int8-64, float)

discuss

order

chaokunyang|4 months ago

Apache Arrow is more of a memory format than a general‑purpose data serialization system. It’s great for in‑memory analytics and GPU‑friendly columnar storage.

Apache Fory, on the other hand, has its own wire‑stream format designed for sending data across processes or networks. Most of the code is focused on efficiently converting in‑memory objects into that stream format (and back) — with features like cross‑language support, circular reference handling, and schema evolution.

Fory also has a row format, which is a memory format, and can complement or compete with Arrow’s columnar format depending on the use case.