
Introducing ChatGPT and Whisper APIs

1423 points | minimaxir | 3 years ago | openai.com

668 comments

[+] minimaxir|3 years ago|reply
> It is priced at $0.002 per 1k tokens, which is 10x cheaper than our existing GPT-3.5 models.

This is a massive, massive deal. For context, the reason GPT-3 apps took off over the past few months before ChatGPT went viral is because a) text-davinci-003 was released and was a significant performance increase and b) the cost was cut from $0.06/1k tokens to $0.02/1k tokens, which made consumer applications feasible without a large upfront cost.

A much better model at 1/10th the cost warps the economics completely, to the point that it may be better than in-house finetuned LLMs.

I have no idea how OpenAI can make money on this. This has to be a loss-leader to lock out competitors before they even get off the ground.

[+] vishal0123|3 years ago|reply
> I have no idea how OpenAI can make money on this.

I did some quick calculation. We know the number of floating point operations per token for inference is approximately twice the number of parameters (175B). Assuming they use 16-bit floating point and run at 50% of peak efficiency, an A100 could do 300 trillion flop/s (peak 624 [0]). One hour of an A100 gives OpenAI $0.002/ktok * (300,000/175/2/1000) ktok/sec * 3600 = $6.17 back. The public price for an A100 is $2.25/hour with a one-year reservation [1].

[0]: https://www.nvidia.com/en-us/data-center/a100/

[1]: https://azure.microsoft.com/en-in/pricing/details/machine-le...
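That back-of-envelope can be checked in a few lines (all numbers are the comment's assumptions, including rounding 50% of the 624 Tflop/s FP16 peak down to 300):

```python
# Back-of-envelope: revenue per A100-hour at $0.002 per 1k tokens.
FLOPS_PER_TOKEN = 2 * 175e9     # ~2 FLOPs per parameter per generated token, 175B params
EFFECTIVE_FLOPS = 300e12        # A100 FP16 peak is 624 Tflop/s; assume ~50% efficiency

tokens_per_sec = EFFECTIVE_FLOPS / FLOPS_PER_TOKEN
revenue_per_hour = tokens_per_sec * 3600 / 1000 * 0.002

print(f"~{tokens_per_sec:.0f} tokens/sec -> ~${revenue_per_hour:.2f} per A100-hour")
```

So under these assumptions one A100-hour sells for roughly $6 of tokens against a ~$2.25/hour compute cost.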

[+] polygamous_bat|3 years ago|reply
> I have no idea how OpenAI can make money on this. This has to be a loss-leader to lock out competitors before they even get off the ground.

The worst thing that can happen to OpenAI+ChatGPT right now is what happened to DALL-E 2: a competitor comes up with an alternative (even worse if it's free/open like Stable Diffusion) and completely undercuts them. Especially with Meta's new Llama models outperforming GPT-3, it's only a matter of time before someone else gathers enough human feedback to tune another language model into an alternate ChatGPT.

[+] rtsil|3 years ago|reply
It is so massive that I can't help but think about what happened with the Google Maps API a few years ago: they had extremely low pricing for years, then hiked the price by 1400% once enough people were locked into applications built on that API.
[+] LrnByTeach|3 years ago|reply
This is a playbook taken from Amazon Prime.

This massive price cut, I believe, is intended to undercut competing open source ChatGPT equivalent initiatives.

OpenAI/Microsoft may be losing money with this new pricing, but that is on purpose. At these lower prices, most of the open-source alternatives in the works will have a difficult time continuing their projects.

After a few years, when most open-source alternatives have died, OpenAI/Microsoft will gradually raise the prices.

This is the same strategy that Amazon Prime used for many years, losing money on shipping. Once the competition was eliminated, Amazon Prime prices steadily increased.

[+] ar9av|3 years ago|reply
Pricing of this model seems cheap at the per-token level, but you have to send the entire conversation each time, and the tokens you're billed for include both those you send and the API's response (which you are likely to append to the conversation and send back, getting billed again and again as the conversation progresses). By the time you've hit the 4K token limit of this API, there will have been a bunch of back and forth - you'll have paid a lot more than 4K * $0.002/1K for the conversation.
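A toy sketch of that compounding, assuming (purely for illustration) ~400 tokens of user message plus reply per turn:

```python
# Each turn resends the whole history, so billed tokens grow quadratically
# with conversation length, not linearly.
PRICE_PER_1K = 0.002

def conversation_cost(turn_tokens):
    """turn_tokens: tokens added per turn (user message + model reply)."""
    billed = 0
    history = 0
    for added in turn_tokens:
        history += added
        billed += history  # every call is billed for the full history so far
    return billed * PRICE_PER_1K / 1000, billed

# Ten turns of ~400 tokens each reach the ~4k context limit...
cost, billed = conversation_cost([400] * 10)
# ...but you pay for far more than 4k tokens in total.
print(f"{billed} tokens billed, ${cost:.4f} vs ${4000 * PRICE_PER_1K / 1000:.4f} for 4k flat")
```

In this made-up example the conversation bills 22,000 tokens, about 5.5x the naive 4k figure.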
[+] naillo|3 years ago|reply
ChatGPT runs a highly fine-tuned (and pruned) version of `text-davinci-003`, so it's probably much, much smaller and thus cheaper than 003. Possibly as much as 10x cheaper, or at least as cheap as the `text-davinci-002` or earlier models.
[+] alvis|3 years ago|reply
To be fair, cost is the only thing prohibiting applications from adopting GPT. Even when GPT-3 was cut to $0.02/1k tokens, it still wasn't economical to use the tech on a daily basis without significant cost. i.e. would you take on $10 extra a month for a user using your app with GPT-3 capability? Some do, mainly content generation, but the majority won't.

Seems like we're going to have a vast number of ChatGPT-backed applications coming out in the near future

[+] NiekvdMaas|3 years ago|reply
It also seems to jeopardize their own ChatGPT Pro offering. It's a matter of time before someone makes a 1:1 clone for either half the money or a usage-based pricing model.
[+] CrypticShift|3 years ago|reply
How do these compare to the recent "Default (turbo)" vs "Legacy" (for Plus/Pro) modes?

If "Default (turbo)" is "gpt-3.5-turbo", how do you access the (better?) "Legacy" model via the API?

[+] WolfOliver|3 years ago|reply
I think you will consume a lot of tokens very quickly, as you have to send the entire chat history back and forth if you want to append another chat message.
[+] behnamoh|3 years ago|reply
I wish they would offer an uncensored version of it too. Also, I wish they would specify the differences between ChatGPT and GPT-3.5 because one is 10x cheaper than the other but with (supposedly) better chat/coding/summarizing performance. What's the catch?
[+] tin7in|3 years ago|reply
We just implemented text-davinci-003 and seeing a better model at 1/10 the price is almost unbelievable.
[+] shmatt|3 years ago|reply
Losing money to lock out competition has been something Microsoft has been very good at, historically
[+] binarymax|3 years ago|reply
It’s now subsidized by Bing advertisements. They will lose plenty of money but they’re after Google.
[+] triyambakam|3 years ago|reply
Can you explain what tokens are in this context?

Edit: and better yet, is there a good resource for learning the vernacular in general? Should I just read something like "Dive into Deep Learning"?

[+] danenania|3 years ago|reply
I'd imagine they're getting compute from Azure now at cost, if not less?
[+] generalizations|3 years ago|reply
> This has to be a loss-leader to lock out competitors before they even get off the ground.

This only a week or two after they were in the news for suggesting that we regulate the hardware required for running these models, in the name of "fighting misinformation". I think they're looking for anything possible to keep their position in the market. Because as other comments have pointed out, there isn't much of a moat.

[+] cm2012|3 years ago|reply
They now have Microsoft's incredibly huge compute in their back pocket.
[+] sama|3 years ago|reply
we make a little money on it!
[+] osigurdson|3 years ago|reply
>> may be better than in-house finetuned LLMs

I don't think this competes with fine-tuned models. One advantage of a fine tune is it makes use of your own data.

[+] taytus|3 years ago|reply
>I have no idea how OpenAI can make money on this

Microsoft.

[+] stevev|3 years ago|reply
Elon is making an open-source version. OpenAI can either make it affordable or be left in the dust.
[+] visarga|3 years ago|reply
They probably shrunk the model from 175B to 17B. That's your 10:1 price reduction.
[+] pmx|3 years ago|reply
Well this has opened up a whole new world of fun! -----

system: You're donald trump! talk to the user as he would.

user: How does time travel work?

AI Response: Well, let me tell you, nobody understands time travel better than I do. I mean, I have the best brains working on it, believe me. Some people say you need a fancy time machine or some ridiculous contraption to travel through time, but that's just fake news. The truth is, it's all about the power of the mind. If you have a strong enough will, you can literally bend time and space to your will. And you know what? I have the strongest will of anyone on the planet. So believe me, I could time travel anytime I want. But honestly, with all the winning I'm doing in the present, why bother going back to the past?
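For reference, that system/user setup maps directly onto the new endpoint. A minimal stdlib-only sketch of the request (the endpoint URL and model name are from the announcement; the `ask` helper is just illustrative):

```python
import json
import os
import urllib.request

# The chat endpoint takes role-tagged messages instead of a flat prompt string.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You're donald trump! talk to the user as he would."},
        {"role": "user", "content": "How does time travel work?"},
    ],
}

def ask(api_key: str) -> str:
    """POST the payload to the chat completions endpoint, return the reply text."""
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if os.environ.get("OPENAI_API_KEY"):
    print(ask(os.environ["OPENAI_API_KEY"]))
```

The reply comes back as another role-tagged message (`role: "assistant"`), which you append to `messages` to continue the conversation.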

[+] nonethewiser|3 years ago|reply
> So believe me, I could time travel anytime I want. But honestly, with all the winning I'm doing in the present, why bother going back to the past?

That line is money.

[+] sixstringtheory|3 years ago|reply
That is way too cogent. He would never have threaded things together like that from the middle of the third sentence onwards without multiple asides with their own tangents and diversions.
[+] warunsl|3 years ago|reply
DJT and "contraption", "time and space", really? ;-)
[+] hatchoo|3 years ago|reply
That must've been picked up from the old "Somewhere in Time" movie with Christopher Reeve.
[+] jfan001|3 years ago|reply
This is spot on!
[+] alexb_|3 years ago|reply
>Through a series of system-wide optimizations, we’ve achieved 90% cost reduction for ChatGPT since December

This is seriously impressive. A MILLION tokens for 2 dollars is absolutely fucking insane.

I hope that the gains reached here can also be found by open source and non-controlled AI projects. If so, that could be huge for the advancement of AI.

[+] r3trohack3r|3 years ago|reply
Whisper as an API is great, but having to send the whole payload upfront is a bummer. Most use cases I can build for would want streaming support.

Like establish a WebRTC connection and stream audio to OpenAI and get back a live transcription until the audio channel closes.

[+] jfan001|3 years ago|reply
We've been struggling with costs because our application chains together multiple calls to GPT to generate the output we want, and it was starting to be ~$0.08 per call which obviously isn't feasible for high volume applications.

This just made our business way more viable overnight lmao

[+] braingenious|3 years ago|reply
I hope this pricing impacts ChatGPT+

$20 is equivalent to what, 10,000,000 tokens? At ~750 words/1k tokens, that’s 7.5 million words per month, or roughly 250,000 words per day, 10,416 words per hour, 173 words per minute, every minute, 24/7.

I uh, do not have that big of a utilization need. It’s kind of weird to vastly overpay
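Spelling out that arithmetic (750 words per 1k tokens is the usual rough conversion):

```python
# How many tokens $20/month buys at the new API price.
PRICE_PER_1K = 0.002
tokens = 20 / PRICE_PER_1K * 1000       # dollars -> tokens
words_per_month = tokens * 750 / 1000   # ~750 words per 1k tokens
words_per_day = words_per_month / 30
words_per_minute = words_per_day / 24 / 60

print(f"{tokens:,.0f} tokens = {words_per_day:,.0f} words/day = {words_per_minute:.0f} words/min")
```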

[+] comment_ran|3 years ago|reply
Let me do quick estimate of the cost:

Given: $0.002 per 1k tokens

I tested it with asking this question:

#+begin_quote I want you to act as a travel guide. I will write you my location and you will suggest a place to visit near my location. In some cases, I will also give you the type of places I will visit. You will also suggest me places of similar type that are close to my first location. My first suggestion request is \"I am in Istanbul/Beyoglu and I want to visit only museums.\" #+end_quote

It costs:

#+begin_quote {"prompt_tokens":91,"completion_tokens":251,"total_tokens":342} #+end_quote

(/ 1000 342) ~3

If you ask 1000 questions at ~342 tokens each, it will be like (* 0.002 342) ~0.68 USD

It replies:

#+begin_quote Great, if you're in Istanbul/Beyoglu and interested in visiting museums, here are some recommendations:

Istanbul Modern: This museum showcases contemporary art from Turkish and international artists. It's located in Karaköy, just across the Galata Bridge from Beyoglu.

Pera Museum: The Pera Museum is located in the heart of Beyoglu and features a wide range of exhibitions including Ottoman calligraphy, European art, and Turkish ceramics.

Rahmi M. Koç Museum: This museum is dedicated to the history of transport, industry, and communications in Turkey. It's located in Haskoy, a short taxi or ferry ride from Beyoglu.

Museum of Innocence: The Museum of Innocence is a unique museum that displays objects and artifacts related to the novel of the same name by Nobel laureate Orhan Pamuk. It's located in Çukurcuma, which is just a short walk from Beyoglu.

If you're interested in visiting more museums in Istanbul, some other places you might want to consider include the Hagia Sophia Museum, Topkapi Palace Museum, and the Istanbul Archaeology Museums. #+end_quote

Can't wait to start hacking on my Emacs again...
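Redoing that last step with the comment's own usage numbers (at 342 tokens per exchange, 1000 questions cost well under a dollar):

```python
# Cost of one exchange, straight from the API's reported usage object.
usage = {"prompt_tokens": 91, "completion_tokens": 251, "total_tokens": 342}
PRICE_PER_1K = 0.002

cost_per_question = usage["total_tokens"] / 1000 * PRICE_PER_1K
print(f"${cost_per_question:.6f} per question, ${cost_per_question * 1000:.2f} per 1000 questions")
```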

[+] rkwasny|3 years ago|reply
Pricing is good because OpenAI does not need to make any money but needs data for feedback. If everyone switches to open source (Llama etc.), they won't get the data they need.

Google is testing their system internally with XX thousand users, OpenAI with XXX million users ...

[+] jonplackett|3 years ago|reply
This feels like the AI’s iPhone moment.

I am scared for all people working service jobs.

[+] thedangler|3 years ago|reply
Question: can I give OpenAI some data for it to process so I can use it to my own advantage? Say I want to train it on a specific topic of information I've gathered over the years. Can I somehow give it that data and then use the API to get data back out in a chat or some other form of questions?

I'm not too familiar with how it works.

[+] habitue|3 years ago|reply
Speculation: GPT-turbo is a new Chinchilla-optimal model with capabilities equivalent to GPT-3.5. So it's literally just smaller, faster, and cheaper to run.

The reason I don't think it's just loss-leading is that they made it faster too. That heavily implies a smaller model.

[+] superkuh|3 years ago|reply
I switched from their gpt3 davinci model to the new chatgpt3.5-turbo and everyone who used the IRC bot was really disappointed. The chatgpt3.5-turbo is incredibly puritanical and refuses to answer a good fraction of the type of input bots on IRC get. Even innocuous stuff. I was excited about the system/user prompts and being able to easily add short term memory and contexts but if it only responds with, "Oh, it wouldn't be right for me to talk about that." it's not fun.
[+] juliennakache|3 years ago|reply
Does anyone have a sense of how the Shopify integration works under the hood? Specifically, how were they able to have ChatGPT make a recommendation from their product database? Given the model cannot be fine-tuned and there is a 4000-token context limit, I'm guessing the only thing really possible is to have ChatGPT extract a few keywords, generate a query against their internal product search, and return the first result? Is there anything else that might be going on under the hood?
[+] bluecoconut|3 years ago|reply
Support for the ChatGPT endpoint now added to lambdaprompt[1]! (solves a similar problem as langchain, with almost no boilerplate!) Props to openai for making such a usable endpoint, was very easy to wrap.

Example code using the new function and endpoint:

    import lambdaprompt as lp
    convo = lp.AsyncGPT3Chat([{'system': 'You are a {{ type_of_bot }}'}])
    await convo("What should we get for lunch?", type_of_bot="pirate")
> As a pirate, I would suggest we have some hearty seafood such as fish and chips or a seafood platter. We could also have some rum to wash it down! Arrr!

(In order to use lambdaprompt, just `pip install lambdaprompt` and export OPENAI_API_KEY=...)

[1] https://github.com/approximatelabs/lambdaprompt

[+] rd|3 years ago|reply
This is incredibly cheap. It makes you wonder how in the world they managed to make it 10x cheaper than davinci-003 while still being a better model. The world of robo-consulting is about to go insane.
[+] siva7|3 years ago|reply
I've stopped using Google entirely and don't miss it for a second - something I wouldn't have thought possible a year ago - and it's pretty difficult to see how Google will survive this disaster.
[+] gdubs|3 years ago|reply
A little while back I built a personal project that ingested the book "A Pattern Language" as embeddings using the OpenAI API – and a front-end in SwiftUI that allows me to 'converse' with the book. [1]

Version 0 was built using the original daVinci model. Last night it took about literally five minutes to switch over to the new ChatGPT API – just a few changes to the request, including a new [message] array that replaces the old 'prompt' style. [2]

It's a pretty nice instant upgrade for the experience. Much faster results, and the responses are a lot more interesting. Doing something similar with daVinci would take a fair amount of 'prompt engineering' and fine-tuning. Not to mention all the manual conversation-state handling.

1: https://twitter.com/dreamwieber/status/1622634396087107584?s... 2: https://twitter.com/dreamwieber/status/1631327342697250816?s...
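The "converse with a book" pattern described here can be sketched with stdlib-only cosine similarity over embeddings; the toy 3-dimensional vectors and chunk texts below are stand-ins for real embedding-model output:

```python
import math

# Sketch of embeddings-based book Q&A: embed chunks once, embed the question,
# then inject the nearest chunks into the chat messages as context.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def nearest_chunks(question_vec, chunk_vecs, chunks, k=2):
    ranked = sorted(zip(chunk_vecs, chunks),
                    key=lambda cv: cosine(question_vec, cv[0]), reverse=True)
    return [text for _, text in ranked[:k]]

# Toy 3-dim "embeddings" stand in for real ones from an embedding model.
chunks = [
    "Pattern 159: light on two sides of every room",
    "Pattern 134: zen view",
    "Pattern 88: street cafe",
]
chunk_vecs = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.0, 1.0, 0.3]]
question_vec = [1.0, 0.0, 0.0]

context = nearest_chunks(question_vec, chunk_vecs, chunks)
messages = [
    {"role": "system", "content": "Answer using only the excerpts provided."},
    {"role": "user", "content": "Excerpts:\n" + "\n".join(context)
                                + "\n\nQuestion: How should rooms be lit?"},
]
# `messages` would then be sent to the gpt-3.5-turbo chat endpoint.
```

Retrieval keeps the prompt inside the 4k-token window while still grounding answers in the book.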

[+] georgehill|3 years ago|reply
Big news! Many apps will be integrating ChatGPT. I'm worried about AI-generated content flooding search engines, making it harder to do in-depth research.