orderone_ai | 7 months ago
It's a vec2vec architecture - it takes in 3 bge-large embeddings of the task, the input data, and the vocabulary. It outputs 1 bge-large embedding of the answer.
That's the DSRU part.
What makes it a classifier is that later, outside of the model, we do a nearest neighbor search for our vocabulary items using our answer vector. So it will output something from the labels no matter what - the nearest neighbor search will always have something closest, even if the model went a little crazy internally.
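To make that concrete, here's a minimal sketch of just the post-model step described above: pick the vocabulary label whose embedding is nearest to the answer vector. Everything here is a stand-in (random vectors in place of real bge-large embeddings, a fake answer vector instead of actual DSRU output); the point is only that argmax over similarities always lands on some label.

```python
import numpy as np

DIM = 1024  # bge-large embedding size

rng = np.random.default_rng(0)
labels = ["entailment", "neutral", "contradiction"]
# Stand-ins for the precomputed bge-large embeddings of each label.
label_embs = rng.normal(size=(len(labels), DIM))
# Pretend model output: near the "neutral" embedding, plus some noise,
# simulating a model that "went a little crazy internally".
answer_vec = label_embs[1] + 0.1 * rng.normal(size=DIM)

def nearest_label(answer, embs, names):
    # Cosine similarity against every vocabulary embedding,
    # then take the closest one - the output is always a valid label.
    a = answer / np.linalg.norm(answer)
    e = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    return names[int(np.argmax(e @ a))]

print(nearest_label(answer_vec, label_embs, labels))
```

However noisy the answer vector is, the argmax still returns one of the vocabulary items, which is the forced-label property being described.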
The prompts here tend to be very straightforward. Things like: "Is this book review positive or negative?" "Is this person sharing something happy or venting?" "Determine the logical relationship between the premise and hypothesis. Answer with: entailment, neutral, or contradiction."
It has limited use cases, but where it's good, it should be very, very good - the insane speed, deterministic output, and forced label output make it great for a lot of common, cheap tasks.