orderone_ai | 7 months ago
It's a vec2vec architecture - it takes in 3 bge-large embeddings of the task, the input data, and the vocabulary. It outputs 1 bge-large embedding of the answer.
That's the DSRU part.
What makes it a classifier is that later, outside of the model, we do a nearest neighbor search for our vocabulary items using our answer vector. So it will output something from the labels no matter what - the nearest neighbor search will always have something closest, even if the model went a little crazy internally.
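To make that concrete, here's a minimal sketch of just the post-model step described above: pick the vocabulary label whose embedding is nearest to the answer vector. Everything here is a stand-in (random vectors in place of real bge-large embeddings, a fake answer vector instead of actual DSRU output); the point is only that argmax over similarities always lands on some label.

```python
import numpy as np

DIM = 1024  # bge-large embedding size

rng = np.random.default_rng(0)
labels = ["entailment", "neutral", "contradiction"]
# Stand-ins for the precomputed bge-large embeddings of each label.
label_embs = rng.normal(size=(len(labels), DIM))
# Pretend model output: near the "neutral" embedding, plus some noise,
# simulating a model that "went a little crazy internally".
answer_vec = label_embs[1] + 0.1 * rng.normal(size=DIM)

def nearest_label(answer, embs, names):
    # Cosine similarity against every vocabulary embedding,
    # then take the closest one - the output is always a valid label.
    a = answer / np.linalg.norm(answer)
    e = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    return names[int(np.argmax(e @ a))]

print(nearest_label(answer_vec, label_embs, labels))
```

However noisy the answer vector is, the argmax still returns one of the vocabulary items, which is the forced-label property being described.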
The prompts here tend to be very straightforward. Things like: "Is this book review positive or negative?" "Is this person sharing something happy or venting?" "Determine the logical relationship between the premise and hypothesis. Answer with: entailment, neutral, or contradiction."
It has limited use cases, but where it's good, it should be very, very good - the insane speed, deterministic output, and forced label output make it great for a lot of common, cheap tasks.