exhaze | 1 year ago

I think the wildest thing is actually Meta’s latest paper, where they show a method for LLMs to reason not in English but in latent space:

https://arxiv.org/pdf/2412.06769

I’ve done research myself adjacent to this (mapping parts of a latent space onto a manifold), but this is a bit eerie, even to me.
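
To make it concrete, here is roughly how I read the mechanism (a toy sketch with GPT-2 and the standard transformers API, not the paper’s actual code): instead of decoding a token at each reasoning step, the last hidden state is fed straight back in as the next input embedding, so the intermediate “thoughts” never pass through the vocabulary.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "A store has 3 boxes with 4 apples each. How many apples in total?"
    embeds = model.get_input_embeddings()(tok(prompt, return_tensors="pt").input_ids)

    with torch.no_grad():
        for _ in range(4):  # a few "continuous thoughts": no tokens are emitted here
            out = model(inputs_embeds=embeds, output_hidden_states=True)
            last_hidden = out.hidden_states[-1][:, -1:, :]    # (1, 1, d_model)
            embeds = torch.cat([embeds, last_hidden], dim=1)  # feed it back as the next "word"

        # only now drop back into language and decode an answer token
        logits = model(inputs_embeds=embeds).logits[:, -1, :]
        print(tok.decode(int(logits.argmax(-1))))

An off-the-shelf GPT-2 obviously won’t produce anything sensible this way; the point of the paper is training the model so those latent steps actually carry the reasoning.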

ynniv|1 year ago

Is it "eerie"? LeCun has been talking about it for some time, and may also be OpenAI's rumored q-star, mentioned shortly after Noam Brown (diplomacybot) joining OpenAI. You can't hill climb tokens, but you can climb manifolds.

exhaze|1 year ago

I wasn’t aware of others attempting manifolds for this before - it’s just something I stumbled upon independently. To me the “eerie” part is the thought of an LLM no longer using human language to reason - it’s like something out of a sci-fi movie where humans encounter an alien species that thinks in a way humans cannot even comprehend due to biological limitations.

I am hopeful that progress in mechanistic interpretability will serve as a healthy counterbalance to this approach when it comes to explainability, though I worry that at a certain point something resembling a scaling law may put an upper bound on even that.

Y_Y|1 year ago

> You can't hill climb tokens, but you can climb manifolds.

Could you explain this a bit please?

danielmarkbruce|1 year ago

It's just concept space. The entire LLM works in this space once the embedding layer is done. It's not really that novel at all.
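
You can see it directly with the standard transformers API (nothing exotic here): tokens only exist at the very edges, and every layer in between operates on continuous vectors.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("The cat sat on the mat", return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)

    print(tuple(ids.shape))                    # discrete token ids: (1, seq_len)
    for i, h in enumerate(out.hidden_states):
        print(f"layer {i}: {tuple(h.shape)}")  # continuous (1, seq_len, 768) at every layer
    print(tuple(out.logits.shape))             # back to the vocabulary only at the output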

ttul|1 year ago

This was my thought. Literally everything inside a neural network is a “latent space”, starting right from the embeddings you use to map categorical features in the first layer.

Latent space is where the magic literally happens.

asadalt|1 year ago

Kinda how we do it. Language is just an I/O interface (but also neural, obviously) on top of our reasoning engine.

oceanparkway|1 year ago

It’s not just a protocol buffer for concepts, though (weak Sapir-Whorf, Lakoff’s ubiquitous metaphors). Language itself is also a concept layer, and plasticity and concept development run in both directions. But (I’m not very versed in the terminology here re ‘latent space’) I would imagine the forward pass through the layers converges towards near-token matches before output, so you get reasoning very similar to token/language reasoning even in latent/conceptual reasoning? Like the neurons that respond almost exclusively to a single token, for example.
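
That intuition can be poked at with a logit-lens style probe (my own sketch with GPT-2, not anything from the paper): project each layer’s hidden state through the output head and see which token it sits closest to; in later layers it tends to look like a near-token match.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("The capital of France is", return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)

    W = model.get_output_embeddings().weight     # (vocab, d_model) unembedding matrix
    for i, h in enumerate(out.hidden_states):
        nearest = int((h[0, -1] @ W.T).argmax()) # best-matching token at the last position
        print(f"layer {i:2d} -> {tok.decode([nearest])!r}")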

rhubarbtree|1 year ago

A standard move in AI research seems to be to “move X into the latent space”, where X is some useful function (e.g. diffusion) previously done in the “data” or “artefact” space. So taking that step seems very pedestrian rather than wild.
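
The recipe always has the same shape (a toy sketch with made-up stand-in components, not any particular paper’s code): encode into a latent, do the useful operation there instead of on the raw data, decode back out.

    import torch
    import torch.nn as nn

    # stand-in encoder/decoder pair; in practice a trained VAE or similar
    enc = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 32))
    dec = nn.Sequential(nn.Linear(32, 28 * 28), nn.Unflatten(1, (28, 28)))

    x0, x1 = torch.rand(1, 28, 28), torch.rand(1, 28, 28)

    # interpolation, done on latents rather than pixels
    x_mid = dec(0.5 * enc(x0) + 0.5 * enc(x1))

    # diffusion-style noising, also done on latents (the latent-diffusion move)
    x_noisy = dec(enc(x0) + 0.1 * torch.randn(1, 32))
    print(tuple(x_mid.shape), tuple(x_noisy.shape))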