I'd like to hear about your experiences with cost across the various LLM API providers. My use case, for example, doesn't need a real-time response. Are there any services that exploit spot instances or similar, where I can submit a set of queries and get the responses a few hours later, at a much lower cost?
dartos|1 year ago
You can generate 100s of responses per intent, so the user may not ever get the same response twice.
Ofc it depends on your use case, but smoke and mirrors are your friend.
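
A rough sketch of what that could look like, assuming the responses were already generated offline and grouped by intent. `ResponsePool` and `next_response` are made-up names for illustration, not from any library. Each intent gets a shuffled queue, so nothing repeats until every response for that intent has been served once:

```python
import random


class ResponsePool:
    """Serve pre-generated responses per intent without repeats until exhausted.

    Hypothetical helper: assumes the responses were produced ahead of time
    (e.g. by an offline batch job) and are passed in as a plain dict.
    """

    def __init__(self, responses_by_intent, seed=None):
        self._rng = random.Random(seed)
        # Copy the inputs so later changes by the caller don't affect the pool.
        self._all = {k: list(v) for k, v in responses_by_intent.items()}
        self._remaining = {}

    def next_response(self, intent):
        if not self._all.get(intent):
            raise KeyError(f"no pre-generated responses for intent {intent!r}")
        queue = self._remaining.get(intent)
        if not queue:
            # First use, or everything has been served once: reshuffle a fresh copy.
            queue = self._all[intent][:]
            self._rng.shuffle(queue)
            self._remaining[intent] = queue
        return queue.pop()


pool = ResponsePool({"greeting": ["Hi!", "Hello there.", "Hey, good to see you."]})
first_three = [pool.next_response("greeting") for _ in range(3)]
```

Here `first_three` contains each greeting exactly once, in random order. With hundreds of responses per intent, a repeat only becomes possible after the whole pool has been cycled through.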
yangikan|1 year ago