PhunkyPhil | 10 days ago
I'm out of the loop on training LLMs, but to me it's just pure data input. Are they choosing to include more code rather than, say, fiction books?
refulgentis | 10 days ago
I desperately want there to be differentiation, but reality has shown over and over again that it doesn't matter. Even if you run the same query across N models and then apply some form of consensus, the improvements on benchmarks are marginal and the UX is worse: more time, more expense, and a final answer that is muddied and still bounded by the quality of the best model.
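The consensus setup described here can be sketched as a simple majority vote across models. This is a toy illustration, not a real API: the `models` callables are hypothetical stand-ins for calls to different LLM backends.

```python
from collections import Counter

def consensus_answer(query, models):
    """Ask every model the same query and majority-vote the answers.

    `models` is a list of callables mapping a prompt string to an
    answer string (toy stand-ins for real model API calls).
    Returns the winning answer and the fraction of models that agreed.
    """
    answers = [m(query) for m in models]
    winner, count = Counter(answers).most_common(1)[0]
    return winner, count / len(answers)

# Three toy "models"; one disagrees with the other two.
models = [lambda q: "4", lambda q: "4", lambda q: "5"]
answer, agreement = consensus_answer("What is 2 + 2?", models)
```

Note the UX cost the parent mentions falls out of the structure: every model must be queried before any answer can be returned, so latency and cost scale with the number of models.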
stephenbez | 8 days ago
I did some Googling, and there are papers reporting that combining multiple models, or multiple runs of the same model, leads to improvements: https://www.sciencedirect.com/science/article/abs/pii/S00104... https://arxiv.org/abs/2203.11171
But presumably people are less likely to publish a paper when an approach doesn’t work.
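The second link (arXiv:2203.11171) is the self-consistency paper: sample several reasoning chains from the *same* model, extract each chain's final answer, and majority-vote. A minimal sketch, with hypothetical chain-of-thought strings in the "The answer is N." format rather than real model output:

```python
import re
from collections import Counter

def extract_answer(completion):
    """Pull the final number out of a chain-of-thought completion
    (assumes the hypothetical format '... The answer is N.')."""
    m = re.search(r"answer is (\d+)", completion)
    return m.group(1) if m else None

def self_consistency(completions):
    """Self-consistency: majority-vote over the final answers
    extracted from several sampled reasoning chains."""
    answers = [a for a in map(extract_answer, completions) if a]
    return Counter(answers).most_common(1)[0][0]

# Three sampled chains for "3 bags of 4 apples each":
chains = [
    "3 bags times 4 apples is 12. The answer is 12.",
    "4 + 4 + 4 = 12. The answer is 12.",
    "3 * 4 = 7. The answer is 7.",  # a faulty chain gets outvoted
]
result = self_consistency(chains)  # "12"
```

The intuition is that independent samples make correlated reasoning errors less often than they reach the correct answer, so the vote filters out faulty chains.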
jmalicki | 10 days ago
From there you go to RL training, where humans grade model responses, or the model writes code and learns how to make tests pass. The RL phase is important because it isn't passive: it can focus on the model's weaker areas, so you can effectively train on a larger dataset than the sum of recorded human knowledge.
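The "learning to make tests pass" signal above can be sketched as a reward function: score a model-written candidate program by the fraction of held-out tests it passes. In real RL training this scalar would drive a policy update; here we only compute it, and the candidate and test cases are hypothetical examples.

```python
def reward_from_tests(candidate_fn, test_cases):
    """Score a candidate program by the fraction of tests it passes.

    `test_cases` is a list of (args, expected) pairs. A crash counts
    as a failed test, so broken candidates get low reward rather
    than aborting training.
    """
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate_fn(*args) == expected:
                passed += 1
        except Exception:
            pass  # failed test
    return passed / len(test_cases)

# Hypothetical task: implement absolute value.
tests = [((3,), 3), ((-3,), 3), ((0,), 0)]
buggy = lambda x: x          # model's attempt: forgets negatives
reward = reward_from_tests(buggy, tests)  # 2/3
```

A dense, automatically checkable reward like this is what makes coding such a good RL domain: the grader is a test runner, not a human, so the loop can generate unlimited fresh training signal.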