trjordan | 5 months ago
Everything past GPT-5 has been ... fine. It's better at chat (sort of, depending on your tone preferences) and way better at coding/tool use. But in our product (planning out migrations with AI), the models have gotten worse, because they want to chat or write code instead. I'd have expected the coding knowledge to generalize, but no! Claude in particular really wants to change our code or explain the existing plan back to me.
We're getting around it with examples and dynamic prompts, but it's pretty clear that fine-tuning is in our future. I suspect most broad-based AI success is going to look like that over the next couple of years.
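To make the "examples and dynamic prompts" workaround concrete, here is a minimal sketch: few-shot pairs are assembled into each request so the model produces a plan instead of chatting or editing code. All names here (plan_examples, build_prompt, the model string) are hypothetical illustrations, not the commenter's actual product code.

    # Sketch of the "examples + dynamic prompts" workaround. All names
    # (plan_examples, build_prompt, the model string) are hypothetical.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Few-shot pairs demonstrating the desired behavior: emit a migration
    # plan; do not rewrite code or re-explain the plan.
    plan_examples = [
        {"user": "We're moving the orders table to Postgres 16.",
         "assistant": "1) Snapshot schema. 2) Dual-write. 3) Backfill. 4) Cut over."},
    ]

    def build_prompt(task: str) -> list[dict]:
        """Assemble the prompt dynamically: system rule + examples + task."""
        messages = [{
            "role": "system",
            "content": "You output migration plans only. Never modify code "
                       "and never explain the existing plan.",
        }]
        for ex in plan_examples:
            messages.append({"role": "user", "content": ex["user"]})
            messages.append({"role": "assistant", "content": ex["assistant"]})
        messages.append({"role": "user", "content": task})
        return messages

    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=build_prompt("Plan moving our auth service to its own database."),
    )
    print(resp.choices[0].message.content)

The system rule plus the few-shot pair does the steering; "dynamic" just means the example set can be swapped per request rather than baked into one static prompt.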
selim-now | 5 months ago
If you want to follow the approach in this paper and synthetically augment a dataset, using an LLM for that (instead of a smaller model) just makes sense, but then the entire process can no longer be run easily on your local machine.
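As a hedged illustration of that augmentation loop (the prompt wording, model name, and fan-out count are assumptions, not the paper's actual setup): each labeled example is sent to a hosted LLM for paraphrases, which is exactly the step that stops fitting on a local machine.

    # Sketch of LLM-based synthetic augmentation. The prompt wording, model
    # name, and fan-out count n are assumptions, not the paper's setup.
    from openai import OpenAI

    client = OpenAI()  # hosted API call: the part that won't run locally

    def augment(text: str, label: str, n: int = 3) -> list[tuple[str, str]]:
        """Ask the LLM for n label-preserving paraphrases of one example."""
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{
                "role": "user",
                "content": f"Rewrite this sentence {n} different ways, one "
                           f"per line, keeping its meaning:\n{text}",
            }],
        )
        lines = resp.choices[0].message.content.strip().splitlines()
        return [(ln.strip(), label) for ln in lines if ln.strip()]

    seed = [("The battery dies within an hour.", "negative")]
    augmented = seed + [pair for text, label in seed
                        for pair in augment(text, label)]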
derac | 5 months ago