dnnssl2 | 2 years ago
1. Generalizing new facts. You can create a question-answer pair such as “What is the population of the world in 2023?” / “8 billion”, but the model may not pick up alternate phrasings like “Does the world have 8 billion people on it?”
2. Catastrophic and behavioral forgetting. Continued fine-tuning after RLHF and instruction fine-tuning may cause the model to lose the alignment and instruction-following capabilities trained in by OpenAI. At worst, it will start spewing random tokens, as in the example in the post.
I have not yet seen it done successfully, and I suspect that updating a small fraction (~0.1%) of the original weights with PEFT methods won’t help.
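The generalization issue in point 1 can be sketched as a data-augmentation problem (all data below is illustrative): fine-tuning on one phrasing of a fact does not guarantee the model handles paraphrases, so one common mitigation is to expand the training set with rephrased questions before fine-tuning.

```python
# Minimal sketch of paraphrase augmentation for a fact-injection dataset.
# The fact and paraphrases here are illustrative, not a real pipeline.

fact = "8 billion"

# The single pair you might naively fine-tune on:
train_pairs = [("What is the population of the world in 2023?", fact)]

# Paraphrase augmentation: alternate phrasings that share the same answer.
paraphrases = [
    "How many people are there in the world in 2023?",
    "What was the global population as of 2023?",
    "How big is the world's population in 2023?",
]
train_pairs += [(q, fact) for q in paraphrases]

# Yes/no rephrasings need their own answers, not the original fact string.
train_pairs.append(("Does the world have 8 billion people on it?", "Yes"))

for question, answer in train_pairs:
    print(f"Q: {question}\nA: {answer}")
```

Even with augmentation like this, coverage of phrasings is open-ended, which is part of why fact injection via fine-tuning is hard.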
BoorishBears | 2 years ago
Current fine-tuning techniques can only contribute to knowledge indirectly (for example, by producing better queries for an external data source); you cannot directly embed new facts in the model in any generally efficient or effective manner.
There are toy examples of fine-tuning on facts, but they are of no use outside of academic settings at this point, and I sense that is contributing to the widespread confusion about fine-tuning's value proposition.
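The indirect route described above can be sketched as retrieval rather than weight updates: keep facts in an external store and pull them in at query time. The store and keyword lookup below are toy stand-ins; a real system would use embeddings and a vector database.

```python
# Toy retrieval-augmented setup: facts live outside the model's weights.
facts = {
    "world population 2023": "about 8 billion",
    "speed of light": "299,792,458 m/s",
}

def retrieve(query: str):
    """Toy retrieval: match any keyword of a stored fact against the query."""
    q = query.lower()
    for key, value in facts.items():
        if any(word in q for word in key.split()):
            return value
    return None

def answer(query: str) -> str:
    context = retrieve(query)
    if context is None:
        return "I don't know."
    # In practice the retrieved context is prepended to the model's prompt.
    return f"Based on retrieved context: {context}"

print(answer("What is the world population in 2023?"))
```

Updating a fact then means editing the store, not retraining the model, which is the practical advantage being pointed at.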
dnnssl2 | 2 years ago
I don’t believe the answer is strictly no. There are still many open questions around the fine-tuning method and the scale of data, as well as what task accuracy an end user should expect.
redox99 | 2 years ago
Nitpick, but although when training a LoRA you only train 1% or less of the model's parameters (depending on rank), the adapters affect the entire model: after merging the LoRA, all of the model's weights are updated.
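The nitpick can be shown numerically (dimensions below chosen arbitrarily): a LoRA trains only the small low-rank factors A and B, but the merged update W' = W + (alpha / r) * B @ A is a dense matrix, so the merge touches every entry of W.

```python
import numpy as np

# Illustrative sketch of a LoRA merge on a single weight matrix.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 512, 512, 4, 8

W = rng.normal(size=(d_out, d_in))       # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01    # trained low-rank factor (down-projection)
B = rng.normal(size=(d_out, r)) * 0.01   # trained low-rank factor (up-projection)

trained = A.size + B.size                # parameters actually trained
total = W.size
print(f"trained fraction: {trained / total:.2%}")   # small, shrinks as d grows

W_merged = W + (alpha / r) * B @ A       # merging applies the dense update
changed = int(np.count_nonzero(W_merged != W))
print(f"weights changed by the merge: {changed} of {total}")
```

At these sizes only ~1.6% of parameters are trained, yet the dense product B @ A updates all 512 x 512 entries, which is the distinction the comment is making.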