(no title)
cpard | 15 days ago
Lance[1] (the format, not just LanceDB) is a great example, where you have columnar storage optimized for both analytical operations and vector workloads together with built-in versioning for dataset iteration.
Plus (very important) random access, which is important for stuff like sampling and efficient filtering during curation but also for working with multimodal data, e.g. videos.
Lance is not alone, vortex[2] is another one, nimble[3] from Meta yet another one and I might be missing a few more.
[1] https://github.com/lance-format/lance [2] https://vortex.dev [3] https://github.com/facebookincubator/nimble
No comments yet.