phillipcarter|2 years ago
Hosting your own LLM is anything but free. Aside from the constant operational expense of people monitoring and fixing issues, you need to provision enough resources and run your own inference server, which is both nontrivial and likely to perform far worse than OpenAI. There are legitimate reasons to host an LLM yourself, but it's not a "make this cheaper" button.

kuchenbecker|2 years ago
There may be a tipping point where you're burning XXM/year in API costs and the maintenance cost of rolling your own can be justified.

rimeice|2 years ago
In the short term I agree, and one thing to consider is how rapidly the space is evolving and whether your team can even keep up with the latest advancements. However, there will come a time when the bill comes due after launch, and it will be very tempting to hire people to reduce the ongoing API spend.
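The tipping-point argument above can be sketched as a break-even calculation. Every number here is a hypothetical assumption for illustration (team size, salaries, GPU node cost, amortization period, and the fraction of the API bill self-hosting actually offsets are not from the thread):

```python
# Hypothetical break-even sketch: at what annual API spend might
# self-hosting pay off? All figures are illustrative assumptions.

def self_host_annual_cost(engineers=4, salary=250_000,
                          gpu_nodes=8, node_cost=120_000):
    """Rough annual cost of rolling your own inference stack:
    on-call/ML engineering headcount plus GPU hardware amortized
    over an assumed 3-year lifetime."""
    staffing = engineers * salary
    hardware = gpu_nodes * node_cost / 3  # 3-year amortization
    return staffing + hardware

def break_even(api_annual_spend, savings_fraction=0.7):
    """Assume self-hosting only offsets a fraction of the API bill
    (you still pay for power, networking, and failed experiments)."""
    return api_annual_spend * savings_fraction > self_host_annual_cost()

for spend_m in (1, 2, 5, 10):
    spend = spend_m * 1_000_000
    print(f"${spend_m}M/year API spend -> self-host pays off: {break_even(spend)}")
```

Under these made-up numbers the crossover sits around $2M/year, which is the shape of kuchenbecker's point: below some threshold the API is simply cheaper, and the threshold moves with your team's actual costs.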