Some anecdotal data, but we recently estimated the cost of running an LLM at $WORK by looking at power usage over a bursty period of requests from our internal users, and it was on the order of tens of dollars per million tokens. And we aren't a big place, nor were our servers at max load, so I can see the cost being much lower at scale.
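
For anyone who wants to repeat this kind of back-of-the-envelope estimate, here is a rough sketch of the arithmetic. It counts electricity only (no hardware, cooling, or staff), and every name and number in it is a made-up placeholder, not our actual measurements:

```python
def power_cost_per_million_tokens(avg_power_watts, duration_hours,
                                  price_per_kwh, tokens_generated):
    """Electricity-only cost per million tokens over a measurement window.

    All parameters are hypothetical inputs you would measure yourself:
    average draw of the serving hardware, length of the window,
    your electricity price, and tokens served in that window.
    """
    if tokens_generated <= 0:
        raise ValueError("tokens_generated must be positive")
    energy_kwh = avg_power_watts * duration_hours / 1000.0  # W*h -> kWh
    total_cost = energy_kwh * price_per_kwh
    return total_cost / tokens_generated * 1_000_000


# Placeholder values for illustration only.
estimate = power_cost_per_million_tokens(
    avg_power_watts=3000,      # assumed average draw during the window
    duration_hours=2,          # assumed window length
    price_per_kwh=0.15,        # assumed electricity price in dollars
    tokens_generated=100_000,  # assumed tokens served in the window
)
print(f"~${estimate:.2f} per million tokens")
```

Since the power draw doesn't fall much when the servers sit idle, serving more tokens in the same window pushes the per-token figure down, which is why I'd expect it to be lower at scale.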
exceptione|8 months ago
theOGognf|8 months ago
dist-epoch|8 months ago