Grothendank | 2 years ago
That's not what's happening in the parent comment. They're talking about projects like
https://github.com/ZrrSkywalker/LLaMA-Adapter
https://github.com/microsoft/LoRA
https://github.com/tloen/alpaca-lora
and specifically the paper: https://arxiv.org/pdf/2106.09685.pdf
LoRA is just a way to fine-tune a network with far less effort. Before, we had to update all of the weights; with LoRA we freeze the original model and train small low-rank matrices on top of it, which in the paper cuts the number of trainable parameters by roughly 10,000x.
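The core trick fits in a few lines. A rough sketch of the idea (the shapes here are illustrative, not the real Llama dimensions): instead of updating a full d×d weight matrix W, you freeze W and train two skinny matrices B (d×r) and A (r×d), using W + BA as the effective weight.

```python
import numpy as np

d, r = 4096, 8                           # hidden size and LoRA rank (illustrative)
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))          # frozen pretrained weight, never updated
A = rng.standard_normal((r, d)) * 0.01   # trainable, initialized small
B = np.zeros((d, r))                     # trainable, zero-init so W+BA == W at the start

def forward(x):
    # Original frozen path plus the low-rank update; only A and B get gradients.
    return x @ W.T + x @ (B @ A).T

full_params = W.size
lora_params = A.size + B.size
# With r << d the trainable fraction is 2r/d, a tiny sliver of the full matrix.
print(f"trainable fraction: {lora_params / full_params:.6f}")
```

With r=8 and d=4096 that's about 0.4% of the parameters for this one matrix; the paper's 10,000x figure comes from applying this to only a few attention matrices in a much bigger model.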
The parent comment says GPT4All doesn't give us a way to train the full-size Llama model using the new LoRA technique; we'll have to build that ourselves. But it does give us a large, clean dataset to work with, which will help in the quest to create an open-source ChatGPT killer.