LLMLingua uses a well-trained small language model after alignment (such as GPT2-small or LLaMA-7B) to detect unimportant tokens in the prompt, enabling inference with the compressed prompt on black-box LLMs and achieving up to 20x compression with minimal performance loss.
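The core idea can be sketched as token pruning by self-information: a small LM scores each token, and predictable (low-information) tokens are dropped. This is a minimal illustrative sketch, not LLMLingua's actual algorithm; the `toy_logprob` scorer is a hypothetical stand-in for a real small model such as GPT2-small, and the keep ratio is arbitrary.

```python
import math

def compress_prompt(tokens, logprob, keep_ratio=0.5):
    """Drop the most predictable tokens: low self-information under a
    small LM suggests a token carries little content, so removing it
    should have minimal impact on the target LLM's output."""
    # Self-information of each token given its prefix: -log p(token | prefix).
    scores = [-logprob(tokens[:i], tokens[i]) for i in range(len(tokens))]
    k = max(1, int(len(tokens) * keep_ratio))
    # Indices of the k most informative tokens, restored to original order.
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k]
    return [tokens[i] for i in sorted(top)]

# Toy scorer (hypothetical): pretend function words are highly predictable.
STOP = {"the", "a", "of", "to", "please", "kindly"}
def toy_logprob(prefix, tok):
    return math.log(0.9) if tok.lower() in STOP else math.log(0.05)

print(compress_prompt("please summarize the main findings of the report".split(),
                      toy_logprob, keep_ratio=0.5))
# → ['summarize', 'main', 'findings', 'report']
```

In the real system the scorer is an aligned small LM's token log-probabilities, and compression operates at a coarser (segment-then-token) granularity; the ranking-and-filtering step is the same shape as above.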
arthurcolle|2 years ago
Linked for the culture: https://www.youtube.com/watch?v=bctjSvn-OC8&t=4s
TarqDirtyToMe|2 years ago
I don’t think this model’s use of alignment implies any sort of censorship; it’s just being tuned to accomplish the task of outputting only the important tokens for the target LLM.
nathan_compton|2 years ago
I agree that "tone" alignment is silly and pointless for models in the public domain, but if I were a big company that wanted to keep customers, I'd align my models this way. It isn't censorship, it's marketing.