holomorphiclabs's comments

holomorphiclabs | 1 year ago | on: Ask HN: Most efficient way to fine-tune an LLM in 2024?

Our finding is that RAG does not generalize well when the critical understanding is spread across a large corpus of information. We do not think it is a question of either context length or retrieval quality; in our case the win very clearly comes from capturing that understanding within the model itself.

holomorphiclabs | 1 year ago | on: Ask HN: Most efficient way to fine-tune an LLM in 2024?

We are finding there is a trade-off between model performance and hosting costs post-training. The optimal outcome is a model that performs well on next-token prediction (and some other in-house tasks we've defined) and that we can ultimately host on the lowest-cost hosting provider rather than being locked in. I think we'd only go the proprietary model route if the model really were that much better. We're just trying to save ourselves weeks/months of benchmarking time/costs if there is already an established option in this space.
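For context, "performs well on next-token prediction" is usually scored as perplexity over a held-out corpus: the exponential of the average negative log-probability the model assigns to each true next token. A minimal sketch of that metric (the function and example values are illustrative, not from the comment):

```python
import math

def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities.

    perplexity = exp(-mean(logprobs)); lower means better
    next-token prediction on the evaluation text.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# A model that assigns every token probability 0.25 has perplexity 4:
# it is, on average, as uncertain as a uniform choice over 4 tokens.
print(perplexity([math.log(0.25)] * 10))
```

Comparing this number for a fine-tuned model against a base model on the same held-out text is one common way to decide whether the extra training (and any resulting hosting lock-in) paid off.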