top | item 37227807

jron | 2 years ago

Are there major advantages of GPT-3.5 Turbo tuning over PEFT/LoRA with Llama2?
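For context on what the PEFT/LoRA side of the question involves: instead of updating all of a frozen weight matrix W, LoRA trains two small low-rank matrices A and B and adds (alpha/r)·BA on top of W, so only a tiny fraction of the parameters is trainable. A minimal NumPy sketch of the idea (toy dimensions and made-up r/alpha values, just for illustration; a real Llama2 run would use a framework like Hugging Face's peft, not NumPy):

```python
import numpy as np

# Toy dimensions for illustration -- real Llama2 layers are wider
# (e.g. 4096), and r/alpha here are arbitrary example values.
d_in, d_out, r, alpha = 1024, 1024, 8, 16

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, r))                   # trainable, zero-init so the
                                           # LoRA update starts as a no-op

def forward(x):
    # Base path plus scaled low-rank update: (W + (alpha / r) * B @ A) @ x
    return W @ x + (alpha / r) * (B @ (A @ x))

full_params = W.size
lora_params = A.size + B.size
print(f"trainable: {lora_params:,} of {full_params:,} "
      f"({100 * lora_params / full_params:.2f}%)")
```

With these toy shapes the trainable matrices are well under 2% of the layer's parameters, which is why LoRA fine-tuning fits on much cheaper GPUs than full fine-tuning.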

minimaxir | 2 years ago

Latency and cost. GPT-3.5-Turbo is very, very fast (for reasons I still don't understand) and very, very cheap, even with the fine-tuning premium.

Llama2 is still slow even with all the LLM inference tricks in the book, and you need to pay for expensive GPUs to get it to production-worthy latency, along with scaling infra in case of a spike in usage.

eldenring | 2 years ago

GPT-3.5 is much, much smarter than Llama2. It's not nearly as close as the benchmarks make it seem.

Tostino | 2 years ago

So, I'm somebody who has fine-tuned Llama2 (13B) on a new prompt template / chat format, as well as instruction following, summarization, knowledge graph creation, traversing a knowledge graph for information, describing relationships in the knowledge graph, etc.

It is able to use the knowledge graph to write coherent text that is well structured, lengthy, and follows the connections outlined in the graph to their logical conclusions, while deriving non-explicit insights from the graph in its writing.

Just to say, I've seen a giant improvement in Llama2's performance from fine-tuning. And like I said, that's just 13B... I am perfecting the dataset with 13B before moving to 70B.

3.5-turbo is sometimes okay; I've tested it moderately on the same tasks I've been training/testing Llama2 on, and it's just a bit behind. Honestly, my fine-tune is more consistent than GPT-4 for a good number of the tasks I've trained on.

intellectronica | 2 years ago

Indeed, and this is really missing from the public discourse. People are talking about Llama 70B as if it were a drop-in replacement for GPT-3.5, but you only have to play with both for half an hour to figure out that's not generally the case; it only looks true in cherry-picked examples.