This looks really cool. One thing I've wondered about with, e.g., the OpenAI API is if json is really a good format for passing embeddings back and forth. I'd think that passing floats as text over the wire wastes a ton of space that could add up, and might even sacrifice some precision in. Would it be better to encode at least the vectors as binary blobs, or else use something like protobuf to more efficiently handle sending tons of floats around?
zh217|2 years ago
eigenvalue|2 years ago
Edit: One other thing is that you can store the JSON in SQLite using the JSON data type and then use the nice querying constructs directly at the database level, which is nice for the token-level embeddings and document embeddings. This is built in to my project.