
jas8425 | 9 months ago

If embeddings are roughly the equivalent of a hash at least insofar as they transform a large input into some kind of "content-addressed distillation" (ignoring the major difference that a hash is opaque whereas an embedding has intrinsic meaning), has there been any research done on "cracking" them? That is, starting from an embedding and working backwards to generate a piece of text that is semantically close by?

I could imagine an LLM inference pipeline where the next token ranking includes its similarity to the target embedding, or perhaps instead the change in direction towards/away from the desired embedding that adding it would introduce.
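
That re-ranking idea can be sketched with made-up numbers standing in for real model log-probs and embeddings (the `rerank` helper and the blend `weight` are assumptions for illustration, not an existing API):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rerank(candidates, target_emb, weight=0.5):
    """Blend the model's log-prob with similarity to a target embedding."""
    scored = [
        (tok, (1 - weight) * logp + weight * cosine(emb, target_emb))
        for tok, logp, emb in candidates
    ]
    return max(scored, key=lambda t: t[1])[0]

# toy candidates: (token, model log-prob, embedding of the continuation)
target = [1.0, 0.0]
candidates = [
    ("hostile", -0.1, [-1.0, 0.1]),  # more likely, but points away from target
    ("friendly", -0.3, [0.9, 0.2]),  # less likely, but aligned with target
]
print(rerank(candidates, target))  # "friendly" wins once similarity counts
```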

Put another way, the author gives the example:

> embedding("king") - embedding("man") + embedding("woman") ≈ embedding("queen")

What if you could do that but for whole bodies of text?

I'm imagining being able to do "semantic algebra" with whole paragraphs/articles/books. Instead of just prompting an LLM to "adjust the tone to be more friendly", you could have the core concept of "friendly" (or some more nuanced variant thereof) and "add" it to your existing text, etc.
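
The "semantic algebra" could look something like this toy sketch, where the 2-D vectors are hypothetical stand-ins for real embeddings and the candidate texts are assumed to be pre-embedded:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# hypothetical precomputed embeddings (2-D toys; real ones have ~1000 dims)
doc = [0.2, 0.8]       # some neutral paragraph
friendly = [0.7, 0.0]  # a direction standing in for the concept "friendly"

# "add" the concept to the document, as in king - man + woman
target = [d + f for d, f in zip(doc, friendly)]

# then pick (or generate) the text whose embedding is closest to the target
candidates = {
    "curt rewrite": [0.1, 0.9],
    "friendly rewrite": [0.8, 0.7],
}
best = max(candidates, key=lambda k: cosine(candidates[k], target))
print(best)  # "friendly rewrite"
```

The hard part, as the thread below discusses, is the last step: going from the target vector back to actual text.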

luke-stanley|9 months ago

"starting from an embedding and working backwards to generate a piece of text that is semantically close by?" Apparently this is called embedding inversion; see "Universal Zero-shot Embedding Inversion": https://arxiv.org/abs/2504.00147 Going incrementally closer and closer to the target, with some means of varying the candidate, seems to be the most general approach, though there are lots of ways to be more efficient. Image diffusion guided by CLIP embeddings and such is kinda related too.
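
The shape of that incremental search can be shown as a toy hill climb. Everything here (the letter-frequency `toy_embed`, the random mutation loop) is a hypothetical stand-in for a real embedding model and a smarter proposal mechanism, not the paper's method:

```python
import math
import random

def toy_embed(text):
    # hypothetical stand-in for a real embedder: 26-dim letter frequencies
    v = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            v[ord(ch) - 97] += 1
    return v

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def invert(target_emb, length=4, alphabet="abcdefgh", steps=2000, seed=0):
    """Greedy hill climb: mutate one character, keep the change only if
    the candidate's embedding moves closer to the target."""
    rng = random.Random(seed)
    cand = ["a"] * length
    best = dist(toy_embed("".join(cand)), target_emb)
    for _ in range(steps):
        i = rng.randrange(length)
        old = cand[i]
        cand[i] = rng.choice(alphabet)
        d = dist(toy_embed("".join(cand)), target_emb)
        if d < best:
            best = d
        else:
            cand[i] = old  # revert moves that don't improve
    return "".join(cand), best

target = toy_embed("cafe")  # pretend we only have the vector, not the text
text, err = invert(target)  # err shrinks as the candidate's letters match
```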

luke-stanley|9 months ago

I meant to say: apparently this is called "embedding inversion", and "Universal Zero-shot Embedding Inversion" is a related paper that covers a lot of the basics. Recently I learned that an arXiv RAG agent from arXiv Labs is a really cool way for people to find out about research: https://www.alphaxiv.org/assistant Though I had run into "inversion" before, the AlphaXiv Assistant introduced me to "embedding inversion".

smokel|9 months ago

Not an expert in the field, but apparently there has been some research into this. It's called inference-time intervention [1], [2].

[1] "Steering Language Models With Activation Engineering", 2023, https://arxiv.org/abs/2308.10248

[2] "Multi-Attribute Steering of Language Models via Targeted Intervention", 2025, https://arxiv.org/pdf/2502.12446

jerjerjer|9 months ago

> If embeddings are roughly the equivalent of a hash

Embeddings are roughly the equivalent of fuzzy hashes.

quantadev|9 months ago

A hash maps a data array to a single, compact, opaque output, designed for uniqueness and improbability of collision. That is the opposite of what embeddings are for, and of what they do.

Embeddings also map a data array to a different (and yes, smaller) array, but the goal is not to compress everything into one opaque value; it is to spread the input out across an output vector where each element carries meaning. Embeddings are the exact opposite of hashes.

Hashes destroy meaning. Embeddings create meaning. Hashes destroy structure in space. Embeddings create structures in space.
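
A toy illustration of that contrast, using `hashlib` for the hash side and a hypothetical letter-frequency vector as a stand-in embedding:

```python
import hashlib
import math

def toy_embed(text):
    # toy letter-frequency "embedding": similar texts -> nearby vectors
    v = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            v[ord(ch) - 97] += 1
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))  # vectors are pre-normalized

a = "the king greeted the queen"
b = "the queen greeted the king"
c = "quarterly packet loss statistics"

# hashes: any change at all scrambles the output completely
print(hashlib.sha256(a.encode()).hexdigest()[:12])
print(hashlib.sha256(b.encode()).hexdigest()[:12])

# embeddings: related inputs land close together, unrelated ones far apart
print(round(cosine(toy_embed(a), toy_embed(b)), 3))  # 1.0 (same letters)
print(round(cosine(toy_embed(a), toy_embed(c)), 3))  # much lower
```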

kaycebasques|9 months ago

Follow-up question based on your semantic algebra idea. If you can start with an embedding and generate semantically similar text, does that mean that "length of text" is also one of the properties that embeddings capture?

jas8425|9 months ago

I'm 95% sure that it does not, at least as far as the essence of any arbitrary concept does or doesn't relate to the "length of text". Theoretically you should be able to add or subtract embeddings from a book just as easily as from a tweet, though of course the former would require more computation than the latter.