Does anyone have real-life experience (preferably verified in a production environment) of fine-tuning reliably and consistently adding new knowledge to an existing LLM? I've seen claims that fine-tuning only adapts the "form" but can't add new knowledge, while others claim otherwise. I couldn't convince myself either way with my limited ad hoc/anecdotal experiments.
ozr|1 year ago
I have no idea where the myth of ‘can’t add new knowledge via fine-tuning’ came from. It’s a sticky meme that makes no sense.
Pretraining obviously adds knowledge to a model. The difference between pretraining and fine-tuning is the number of tokens and learning rate. That’s it.
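To make the "same mechanism, different scale" claim above concrete: here is a toy illustration in which "pretraining" and "fine-tuning" are literally the same SGD loop, differing only in how much data they see and at what learning rate. The one-parameter model and all the numbers are made up for illustration; real LLM training differs in scale, not in kind.

```python
def train(w, data, lr, epochs):
    """Minimize mean squared error of a 1-parameter model y = w * x via SGD."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x   # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

# "Pretraining": lots of data drawn from y = 3x, higher effective exposure.
pretrain_data = [(x, 3.0 * x) for x in range(1, 50)]
w = train(0.0, pretrain_data, lr=1e-4, epochs=20)   # w converges near 3.0

# "Fine-tuning": the exact same loop, a small new dataset (y = 5x),
# fewer steps. The new data genuinely moves the weights -- the new
# "knowledge" is absorbed by the very same gradient-descent mechanism.
finetune_data = [(1.0, 5.0), (2.0, 10.0)]
w_ft = train(w, finetune_data, lr=1e-3, epochs=50)  # w_ft drifts toward 5.0
```

The point is only that there is no separate "knowledge-adding" machinery in pretraining that fine-tuning lacks; both are gradient updates on the same weights.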
mvkel|1 year ago
Aren't RAG and fine-tuning fundamentally flawed, because they only play at the surface of the model? Like sprinkles on top of a cake, expecting them to completely change the flavor. I know LoRA is supposed to weight the new data appropriately, but the results suggest that's not the solution.
Also anecdotal, but way less work!
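For readers unfamiliar with the LoRA technique mentioned above, here is a minimal sketch of the core idea using plain Python lists as matrices (no framework, dimensions made up for illustration): the base weight matrix W is frozen, and only a low-rank update B @ A is trained, so the effective weight becomes W + B @ A.

```python
def matmul(A, B):
    """Naive matrix multiply on nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def add(A, B):
    """Elementwise matrix addition on nested lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

d, r = 4, 1                       # model dim 4, LoRA rank 1 (toy sizes)
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen base
B = [[0.5], [0.0], [0.0], [0.0]]  # d x r, trainable
A = [[0.0, 0.0, 0.0, 2.0]]        # r x d, trainable

# Effective weight after "fine-tuning" the adapter: W + B @ A.
W_eff = add(W, matmul(B, A))
# Only d*r*2 = 8 numbers were trained instead of d*d = 16, yet W_eff
# differs from W: the adapter does change the function the layer computes.
```

Whether such low-rank updates suffice to store genuinely new facts, as opposed to steering style, is exactly the debate in this thread; the sketch only shows what the mechanism is.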
lmeyerov|1 year ago
RAG is effectively prompt context optimization, so categorically rejecting it doesn't make sense to me. Maybe if models internalized that context, or scaled far enough... but they don't.
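The "prompt context optimization" framing can be made concrete with a toy pipeline: retrieve the documents most relevant to the question and splice them into the prompt. This sketch scores relevance by word overlap; a real system would use embeddings, but the shape of the pipeline is the same. All names and documents here are invented for illustration.

```python
def retrieve(query, docs, k=2):
    """Return the k docs sharing the most words with the query."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    """Splice the retrieved documents into the model's context window."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "LoRA trains low-rank adapter matrices on top of frozen weights.",
    "RAG retrieves documents and adds them to the model's context.",
    "Pretraining uses trillions of tokens at a high learning rate.",
]
prompt = build_prompt("How does RAG use retrieved documents?", docs)
```

Nothing in the model's weights changes; the "new knowledge" lives entirely in the prompt, which is why it works immediately but only within the context window.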
lmeyerov|1 year ago
Nexusflow probably does too, since it also does function calling and would need that baked in, or explicit fine-tuning for RAG use, which I don't recall seeing.
I haven't looked recently, but there's also a cool category of models that provide GIS inferencing via an LLM.