patelajay285 | 1 year ago

We've been working on a Python framework where one of the use cases is easy distillation from larger models to smaller open-source models and smaller closed-source models (so you no longer have to use / pay for the closed-source API service): https://datadreamer.dev/docs/latest/

Here's a (now slightly outdated) example of OpenAI GPT-4 => OpenAI GPT-3.5: https://datadreamer.dev/docs/latest/pages/get_started/quick_...

But you can also distill GPT-4 to any model on HuggingFace, or something like Llama-70B to Llama-1B.

For some tasks, this kind of distillation works extremely well given even a few hundred examples of the larger model performing the task.
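The core loop behind this kind of distillation, collect outputs from the larger "teacher" model, then fine-tune the smaller "student" on them, can be sketched in plain Python. This is an illustration of the idea, not DataDreamer's actual API; the `teacher` function here is a stub standing in for a GPT-4 call:

```python
# Sketch of model distillation: gather (instruction, teacher_output) pairs,
# then use them as supervised fine-tuning data for a smaller student model.

def build_distillation_dataset(instructions, teacher):
    """Run the larger 'teacher' model on each instruction and keep the
    resulting pairs as fine-tuning examples for the student."""
    return [
        {"prompt": inst, "completion": teacher(inst)}
        for inst in instructions
    ]

# In practice `teacher` would wrap an API call (e.g. GPT-4); stubbed here.
def fake_teacher(prompt):
    return f"Teacher answer to: {prompt}"

dataset = build_distillation_dataset(
    ["Summarize X", "Translate Y"], fake_teacher
)
# `dataset` can then be fed to any fine-tuning pipeline: a HuggingFace
# Trainer for an open model, or a hosted fine-tuning endpoint.
```

A few hundred such pairs is often enough for a narrow task, since the student only has to imitate the teacher on that task rather than match it in general.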


bangaladore | 1 year ago

> OpenAI GPT-4 => OpenAI GPT-3.5

I'm confused why you are mentioning 3.5 here. The weights aren't public, so you aren't actually running any derivative of GPT-3.5.

Or am I mistaken? Can you clarify?

Tiberium | 1 year ago

> distillation from larger models to smaller open-source models and smaller-closed source models

They don't limit it to open-source models. And you can fine-tune GPT-3.5 Turbo via the OpenAI API.
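For reference, OpenAI's fine-tuning endpoint takes a JSONL file of chat-format training examples. A minimal sketch of shaping one record (the example strings are made up, and the upload/job-creation calls are shown only as comments since they need an API key):

```python
import json

# Each line of the training .jsonl file is a JSON object with a
# "messages" list in the standard chat format.
def to_finetune_record(prompt, completion):
    return json.dumps({
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": completion},
        ]
    })

record = to_finetune_record("Summarize X", "X, briefly.")

# The resulting file is then uploaded and a fine-tuning job started,
# roughly like:
#   f = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
#   client.fine_tuning.jobs.create(training_file=f.id, model="gpt-3.5-turbo")
```

So "GPT-4 => GPT-3.5" here means generating the training records with GPT-4 and handing them to OpenAI's hosted fine-tuning of 3.5 Turbo, not running 3.5's weights locally.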