
(no title)

sgt101 | 1 month ago

How do you know whether to fine-tune/pretrain or to do RL / reasoning training, given some dataset?


galsapir|1 month ago

I honestly don't think there's a simple yes/no answer there. I think the main considerations are things like how costly it is to do, how often you think you'll need it, and so on. Traces are not as "ephemeral" as fine-tuned models, since you can reuse them to guide agent behaviour when a newer model is released. Still, they're not as evergreen as other assets: traces generated using, say, GPT4 would seem pale and outdated compared to ones created on the same dataset using Opus4.5, I reckon.