top | item 35289717


rnosov | 2 years ago

This has nothing to do with Facebook. The foundational model here is GPT-J, which is open source and safe to use. Sadly, it is inferior to state-of-the-art models such as LLaMA.


Mizza | 2 years ago

But they're "using data from Alpaca". I don't know what that means. Isn't Alpaca using data generated by ChatGPT, which isn't "clean" to use? Or data from Facebook, which isn't "clean" to use? I'm drowning.

rnosov | 2 years ago

They are instruction tuning it using the dataset released by the stanford-alpaca team. The dataset itself is synthetic (created using GPT-3) and somewhat noisy, and in my view could easily be recreated if OpenAI ever tried to go after it (which is very unlikely). Anyway, Facebook has nothing to do with anything used by this project.
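For context, instruction tuning on the Alpaca data means formatting each record (the dataset uses `instruction`/`input`/`output` fields) into a prompt string and training the model to complete it. A minimal sketch of that formatting step, following the prompt template published in the stanford-alpaca repo:

```python
def build_prompt(record: dict) -> str:
    """Turn one Alpaca-style record into a training prompt string."""
    if record.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{record['instruction']}\n\n"
            f"### Input:\n{record['input']}\n\n"
            "### Response:\n"
        )
    # Records with an empty "input" field use a shorter template.
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{record['instruction']}\n\n"
        "### Response:\n"
    )

# Hypothetical record in the Alpaca dataset's shape:
example = {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"}
prompt = build_prompt(example)
# The training target is the prompt followed by the expected output.
target = prompt + example["output"]
```

The fine-tuning step itself (tokenizing `target` and running standard causal-LM training on GPT-J) is orthogonal to the dataset format shown here.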

bilekas | 2 years ago

I don't know the full details, but Alpaca is from Stanford and only based on LLaMA (not a derivative work, afaik). That said:

Also see Meta's license here: https://github.com/facebookresearch/llama/blob/main/LICENSE

I can't be sure what that license actually refers to: the language model itself, or just the tooling in the Git repo.

I agree it's a minefield, but with Meta I would err on the side of caution.