top | item 38200834

abcdabcd987 | 2 years ago

Thank you! We are also very excited about combining fast fine-tuning with efficient serving. In fact, what you just said is closely related to one of our very first motivations. In my previous blog post [1], I call this scheme "Just-in-time Fine-tuning". Our previous measurement is that, for a medium-sized webpage (~10K tokens), it takes around 30 seconds to 2 minutes to fine-tune a LoRA model. Another benefit of this JIT fine-tuning scheme is that we can turn any model into a long-context model.
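The reason a LoRA can be fine-tuned this quickly is that only a low-rank pair of matrices is trained, while the base weights stay frozen. A minimal numpy sketch of that idea (the dimensions and rank here are hypothetical, just for illustration):

```python
import numpy as np

# LoRA sketch: instead of updating the full weight W (d_out x d_in),
# train only a low-rank pair B (d_out x r) and A (r x d_in), so the
# effective weight is W + B @ A. Shapes below are illustrative.
d_out, d_in, r = 1024, 1024, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in)) * 0.02  # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01      # trainable adapter
B = np.zeros((d_out, r))                       # trainable, zero-init

def forward(x):
    # Base path plus low-rank update; with B == 0 at init,
    # the output is identical to the base model's.
    return x @ W.T + x @ (B @ A).T

full_params = d_out * d_in
lora_params = r * (d_out + d_in)
print(f"trainable fraction: {lora_params / full_params:.4%}")  # 1.5625%
```

Because the trainable fraction is tiny, a few gradient steps over one webpage's tokens are cheap, and the resulting adapter can be swapped in at serving time without touching the base weights.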

We'll keep doing more research on fine-tuning, and hopefully we'll have results to share soon.

[1] https://le.qun.ch/en/blog/2023/09/11/multi-lora-potentials/
