top | item 36463405


simonster | 2 years ago

There are two steps to building a conversational LLM. The first is pretraining on an enormous amount of text. The second is fine-tuning, which usually involves a combination of a small amount of high-quality human data and reinforcement learning from human feedback (in practice, from another neural net trained to model human feedback).
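The two training signals described above can be sketched with toy losses. This is a minimal illustration, not anything from the report: the "model" here is just a list of logits and a pair of scalar rewards, and all numbers are made up.

```python
import math

def pretraining_loss(logits, target_id):
    """Next-token cross-entropy: the pretraining objective.
    Equals -log softmax(logits)[target_id], computed stably via logsumexp."""
    z = max(logits)
    log_sum_exp = z + math.log(sum(math.exp(l - z) for l in logits))
    return log_sum_exp - logits[target_id]

def preference_loss(reward_chosen, reward_rejected):
    """Bradley-Terry pairwise loss, commonly used to train the reward
    model that stands in for human feedback during RLHF.
    Equals -log sigmoid(reward_chosen - reward_rejected)."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Loss is small when the model puts probability on the actual next token...
print(pretraining_loss([2.0, 0.1, -1.0], 0))
# ...and when the reward model scores the human-preferred answer higher.
print(preference_loss(1.5, -0.5))
```

Pretraining minimizes the first loss over trillions of tokens; RLHF fine-tuning optimizes the policy against a reward model trained with something like the second.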

This paper is about the quality of the pretraining. Pretraining quality is not necessarily correlated with your subjective judgment of how good the model is. A good pretrained model without any fine-tuning will be very difficult to use for most purposes, because it won't do a very good job following instructions. However, assuming that the fine-tuning is done well, the quality of the pretraining determines the limits of the model's capabilities. This tech report shows that the team did a good (or at least reasonable) job with the pretraining.

The primary audience for this post and tech report is (or at least should be) ML researchers whom Inflection would like to recruit and technically knowledgeable investors, not end-users. To remain competitive, Inflection is going to have to train a 10x more expensive model someday; OpenAI and Google already have. They need talent and investor $ to do that.
