A casual Twitter reply that turned into a news article that turned into an "X wants Y" headline is exactly why I stopped trusting most of social media as a source of information.
From the article: "the impacts of generating a 100-word email. They found that just one email requires .14 kilowatt-hours worth of electricity, or enough to power 14 LED lights for an hour"
That seems completely off the charts. A 70B model on my M3 Max laptop does it for about 0.001 kWh... 140x less than stated in the article. Maybe OpenAI's Nvidia clusters are less energy-efficient than my MacBook... but I'm not even sure about that.
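(A minimal sanity check of that 0.001 kWh figure, sketched in Python; the laptop's incremental power draw and generation speed are assumptions, not measurements:)

    # Energy for a ~100-word email from a local 70B model.
    power_w = 60        # extra draw while generating, watts (assumption)
    tokens = 133        # ~100 words
    tok_per_s = 10      # 70B generation speed on an M3 Max (assumption)

    kwh = power_w * (tokens / tok_per_s) / 3_600_000  # joules -> kWh
    print(f"{kwh:.5f} kWh")  # ~0.00022 kWh, same order as the 0.001 kWh claim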
One can also work backwards to see what kind of compute hardware they think the models must need, or how much they think OpenAI pays for electricity.
100 words is ~133 tokens, so 0.14 kWh/133 tokens is about 1 kWh per kilo-token. If the electricity all came from record-cheap PV at $0.01/kWh, that alone would set a price floor of $10/mega-token; at a more realistic (but still cheap) $0.05/kWh, it's $50/mega-token. Here's the current price sheet: https://platform.openai.com/docs/pricing
Generating a 133-token email in, say, 5 seconds while consuming 0.14 kWh works out to a continuous draw of about 101 kW. That does not seem like a plausible number (caveat: I don't work in a data centre, and what I think is implausible may just be wrong): https://www.wolframalpha.com/input?i=0.14+kWh+%2F+5+seconds+...
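(The same arithmetic as a Python sketch, using only the numbers quoted above:)

    kwh_per_email = 0.14
    tokens = 133                        # ~100 words

    kwh_per_token = kwh_per_email / tokens
    print(kwh_per_token * 1000)         # ~1.05 kWh per kilo-token

    # Electricity-only price floor at two power prices.
    for usd_per_kwh in (0.01, 0.05):    # record-cheap PV vs. still-cheap grid
        floor = kwh_per_token * usd_per_kwh * 1_000_000
        print(f"${floor:.1f} per mega-token at ${usd_per_kwh}/kWh")
    # -> $10.5/Mtok and $52.6/Mtok: the ~$10 and ~$50 floors above

    # Implied continuous power draw for a 5-second generation.
    print(f"{kwh_per_email / (5 / 3600):.0f} kW")  # ~101 kW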
For reference, a single NVIDIA H200 card has a TDP of 700 watts. Considering all the middlemen you put between you and the model, 0.14 kWh doesn't look too outrageous to me, because you add processors, high-speed interconnects, tons of cooling, etc. into the mix. Plus the models run in these datacenters are way bigger.
For "fathomability" case, the network cables (fibers in fact) you use in that datacenters carries 800gbps, and the fiber-copper interface converters at each end heats up to uncomfortable levels. You have thousands of these just converting packets to light and vice versa. I'm not adding the power consumption of the switches, servers, cooling infra, etc. into the mix.
Yes, water cooling is more efficient than air cooling, but when a server is drawing 6 kW of power (8x Tesla cards, plus processors, plus the rest of the system), nothing is as efficient as a local model you hit on your own computer.
Disclosure: Sitting on top of a datacenter.
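(To put that 6 kW figure against the article's 0.14 kWh/email claim: a server's draw is shared across concurrent requests, so per-request energy depends heavily on batching. A rough sketch; the PUE and batch size below are pure assumptions:)

    server_kw = 6.0      # whole 8x-GPU box, from the comment above
    pue = 1.5            # datacenter overhead multiplier (assumption)
    batch = 32           # concurrent requests per server (assumption)
    seconds = 5          # time to generate the 133-token email

    kwh_per_email = server_kw * pue * (seconds / 3600) / batch
    print(f"{kwh_per_email:.5f} kWh")  # ~0.00039 kWh per email

    # Even unbatched, 9 kW needs ~56 s of exclusive use to reach 0.14 kWh:
    print(0.14 / (server_kw * pue) * 3600)  # ~56 seconds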