top | item 44223944


theOGognf | 8 months ago

Some anecdotal data, but we recently estimated the cost of running an LLM at $WORK by looking at power usage over a bursty period of requests from our internal users, and it was on the order of $10s per million tokens. And we aren't a big place, nor were our servers at max load, so I can see the cost being much lower at scale
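The power-only estimate above can be sketched as a short calculation. All of the numbers below (power draw, throughput, electricity price) are illustrative assumptions, not the commenter's actual figures:

```python
# Back-of-the-envelope: dollars per million tokens from power usage alone.
# Every input value here is an assumption for illustration.

def power_cost_per_million_tokens(
    power_kw: float,           # average server draw while serving, in kW
    tokens_per_second: float,  # aggregate generation throughput
    usd_per_kwh: float,        # electricity price
) -> float:
    seconds_per_million = 1_000_000 / tokens_per_second
    kwh_per_million = power_kw * seconds_per_million / 3600
    return kwh_per_million * usd_per_kwh

# e.g. a node drawing ~5 kW at a modest ~10 tok/s aggregate, $0.12/kWh
cost = power_cost_per_million_tokens(power_kw=5.0, tokens_per_second=10, usd_per_kwh=0.12)
print(f"${cost:.2f} per million tokens")
```

With these made-up inputs the figure lands in the tens of dollars; higher throughput per watt drives it down fast, which is consistent with the "much lower at scale" point.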


exceptione | 8 months ago

This is only the power usage?

theOGognf | 8 months ago

Right, this is only power usage. Factoring in labor and all that would make it more expensive, for sure. However, it's not like it's a complex system to maintain. We use a popular inference server and just run it with some modest rate limits. It's been hands-off for close to a year at this point

dist-epoch | 8 months ago

Hardware spend also needs to be amortized (over 1 year? 2 years?), unless you rent from a cloud.
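Folding amortized hardware into the per-token figure looks something like the sketch below. The upfront cost, write-off period, utilization, and throughput are all hypothetical placeholders:

```python
# Sketch: add amortized hardware cost to the power-only $/million-token figure.
# All inputs are illustrative assumptions.

def amortized_cost_per_million_tokens(
    hardware_usd: float,            # upfront server cost
    amortization_years: float,      # write-off period (1? 2?)
    utilization: float,             # fraction of wall-clock time spent serving
    tokens_per_second: float,       # throughput while serving
    power_cost_per_million: float,  # the power-only figure from above
) -> float:
    lifetime_seconds = amortization_years * 365 * 24 * 3600
    lifetime_tokens = lifetime_seconds * utilization * tokens_per_second
    hardware_per_million = hardware_usd / (lifetime_tokens / 1_000_000)
    return hardware_per_million + power_cost_per_million

cost = amortized_cost_per_million_tokens(
    hardware_usd=250_000, amortization_years=2,
    utilization=0.3, tokens_per_second=10,
    power_cost_per_million=16.7,
)
print(f"${cost:.2f} per million tokens")
```

At low utilization the amortized hardware term dwarfs the electricity cost by orders of magnitude, which is why bursty internal workloads and power-only accounting can make self-hosting look cheaper than it is.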