> It is priced at $0.002 per 1k tokens, which is 10x cheaper than our existing GPT-3.5 models.
This is a massive, massive deal. For context, the reason GPT-3 apps took off over the past few months before ChatGPT went viral is because a) text-davinci-003 was released and was a significant performance increase and b) the cost was cut from $0.06/1k tokens to $0.02/1k tokens, which made consumer applications feasible without a large upfront cost.
A much better model at 1/10th the cost warps the economics completely, to the point that it may be better than in-house fine-tuned LLMs.
I have no idea how OpenAI can make money on this. This has to be a loss-leader to lock out competitors before they even get off the ground.
> I have no idea how OpenAI can make money on this.
I did some quick calculation. The number of floating point operations per token for inference is approximately twice the number of parameters (175B). Assuming they use 16-bit floating point and run at 50% of peak efficiency, an A100 could do ~300 trillion FLOP/s (peak 624 [0]). One hour of A100 time earns OpenAI $0.002/ktok * (300,000/175/2/1000) ktok/sec * 3600 ≈ $6.1. The public price per A100 is $2.25/hour with a one-year reservation [1].
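That back-of-envelope calculation, written out as a short script (same assumptions as above, none of them measured: 175B parameters, ~2 FLOPs per parameter per token, roughly 50% of the A100's 624 TFLOP/s peak):

```python
# Back-of-envelope revenue per A100-hour at $0.002 per 1k tokens,
# using the assumptions from the comment above (not measured numbers).
PARAMS = 175e9                      # GPT-3 parameter count
FLOPS_PER_TOKEN = 2 * PARAMS        # ~350 GFLOP per generated token
EFFECTIVE_FLOPS = 300e12            # ~50% of the A100's 624 TFLOP/s peak

tokens_per_sec = EFFECTIVE_FLOPS / FLOPS_PER_TOKEN       # ~857 tokens/s
revenue_per_hour = tokens_per_sec / 1000 * 0.002 * 3600  # ~$6.17

print(f"{tokens_per_sec:.0f} tok/s -> ${revenue_per_hour:.2f}/A100-hour vs $2.25/hour rental")
```

Under these assumptions, one A100-hour of saturated inference brings in roughly $6.17 against a $2.25 rental price, so the pricing is not obviously below hardware cost.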
> I have no idea how OpenAI can make money on this. This has to be a loss-leader to lock out competitors before they even get off the ground.
The worst thing that can happen to OpenAI+ChatGPT right now is what happened to DALL·E 2: a competitor comes up with an alternative (even worse if it's free/open like Stable Diffusion) and completely undercuts them. Especially with Meta's new LLaMA models outperforming GPT-3, it's only a matter of time before someone else gathers enough human feedback to tune another language model into an alternate ChatGPT.
It is so massive that I can't help but think about what happened with Google Maps API a few years ago where they had extremely low pricing for years then hiked the price by 1400% once enough people were locked into applications based on that API.
This massive price cut, I believe, is intended to undercut competing open source ChatGPT equivalent initiatives.
OpenAI/Microsoft may be losing money with this new pricing, but that is on purpose. At these lower prices, most of the open-source alternatives in the works will have a difficult time continuing as projects.
After a few years, when most open-source alternatives have died, OpenAI/Microsoft will gradually raise prices.
This is the same strategy that Amazon Prime used for many years, losing money on shipping. Once the competition was eliminated, Amazon Prime prices steadily increased.
The pricing of this model seems low at the per-token level, but you have to send the entire conversation each time, and the tokens you are billed for include both those you send and the API's response (which you are likely to append to the conversation and send back, getting billed again and again as the conversation progresses). By the time you've hit the 4K-token limit of this API there will have been a bunch of back and forth; you'll have paid a lot more than 4K * $0.002/1K for the conversation.
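That compounding can be sketched with illustrative token counts (the only real number here is the published $0.002/1k price):

```python
# Each API call is billed for the full history plus the new exchange,
# so a long conversation pays for its early messages many times over.
PRICE_PER_1K = 0.002  # $ per 1k tokens

def conversation_cost(turns):
    """turns: (user_tokens, reply_tokens) per exchange. Returns (billed, cost)."""
    billed = history = 0
    for user_toks, reply_toks in turns:
        billed += history + user_toks + reply_toks  # prompt includes history
        history += user_toks + reply_toks           # the reply gets appended too
    return billed, billed * PRICE_PER_1K / 1000

# Ten exchanges of ~200 tokens each way: only 4k tokens of actual text...
billed, cost = conversation_cost([(200, 200)] * 10)
# ...but 22k tokens billed, ~5.5x what 4k * $0.002/1k would suggest.
```

Even so, at this price point the ten-turn conversation above still costs only about four cents.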
ChatGPT runs a highly fine-tuned (and pruned) version of `text-davinci-003`, so it's probably much smaller and thus cheaper to serve than 003. Possibly as much as 10x cheaper, or at least in the range of `text-davinci-002` or earlier models anyway.
To be fair, cost is the main thing that has prevented applications from adopting GPT. Even when GPT-3 was cut to $0.02/1k tokens, it still wasn't economical to use the tech on a daily basis without significant cost. I.e., would you pay $10 extra a month per user to give your app GPT-3 capability? Some do, mainly content generation, but the majority won't.
Seems like we're going to have a vast number of ChatGPT-backed applications coming out in a short period of time.
It also seems to jeopardize their own ChatGPT Pro offering. It's a matter of time before someone makes a 1:1 clone for either half the money or a usage-based pricing model.
I think you will consume a lot of tokens very quickly as you have to send the entire chat history back and forth if you want to append another chat message.
I wish they would offer an uncensored version of it too. Also, I wish they would specify the differences between ChatGPT and GPT-3.5 because one is 10x cheaper than the other but with (supposedly) better chat/coding/summarizing performance. What's the catch?
> This has to be a loss-leader to lock out competitors before they even get off the ground.
This comes only a week or two after they were in the news for suggesting that we regulate the hardware required for running these models, in the name of "fighting misinformation". I think they're looking for anything possible to keep their position in the market, because as other comments have pointed out, there isn't much of a moat.
Well this has opened up a whole new world of fun!
-----
system: You're donald trump! talk to the user as he would.
user: How does time travel work?
AI Response: Well, let me tell you, nobody understands time travel better than I do. I mean, I have the best brains working on it, believe me. Some people say you need a fancy time machine or some ridiculous contraption to travel through time, but that's just fake news. The truth is, it's all about the power of the mind. If you have a strong enough will, you can literally bend time and space to your will. And you know what? I have the strongest will of anyone on the planet. So believe me, I could time travel anytime I want. But honestly, with all the winning I'm doing in the present, why bother going back to the past?
That is way too cogent. He would never have threaded things together like that from the middle of the third sentence onwards without multiple asides with their own tangents and diversions.
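For reference, an exchange like the one above maps onto the new chat endpoint roughly as follows. This is a sketch using the `openai` Python package's ChatCompletion interface; the actual network call is commented out since it needs an API key.

```python
# How the system/user exchange above maps onto the new chat message format.
def build_messages(system_prompt, user_prompt):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages(
    "You're donald trump! talk to the user as he would.",
    "How does time travel work?",
)

# import openai  # requires OPENAI_API_KEY in the environment
# resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
# print(resp["choices"][0]["message"]["content"])
```

The system message replaces the old freeform prompt preamble; the model treats it as standing instructions for the rest of the conversation.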
>Through a series of system-wide optimizations, we’ve achieved 90% cost reduction for ChatGPT since December
This is seriously impressive. A MILLION tokens for 2 dollars is absolutely fucking insane.
I hope that the gains reached here can also be found by open source and non-controlled AI projects. If so, that could be huge for the advancement of AI.
We've been struggling with costs because our application chains together multiple calls to GPT to generate the output we want, and it was starting to be ~$0.08 per call which obviously isn't feasible for high volume applications.
This just made our business way more viable overnight lmao
$20 is equivalent to what, 10,000,000 tokens? At ~750 words/1k tokens, that’s 7.5 million words per month, or roughly 250,000 words per day, 10,416 words per hour, 173 words per minute, every minute, 24/7.
I uh, do not have that big of a utilization need. It’s kind of weird to vastly overpay
Pricing is good because OpenAI does not need to make any money but needs data for feedback, if everyone switches to open source ( Llama etc. ) they won't get the data they need.
Google is testing their system internally with XX thousand users, OpenAI with XXX million users ...
Question: can I give OpenAI some data for it to process so I can use it to my own advantage? Say I want to train it on a specific topic using information I've gathered over the years. Can I somehow give it that data and then use the API to get the data back out in a chat or some other form of questions? I'm not too familiar with how it works.
Speculation: GPT-turbo is a new chinchilla optimal model with the equivalent capabilities as GPT-3.5. So it's literally just smaller, faster and cheaper to run.
The reason I don't think it's just loss-leading is that they made it faster too. That heavily implies a smaller model.
I switched from their gpt3 davinci model to the new chatgpt3.5-turbo and everyone who used the IRC bot was really disappointed. The chatgpt3.5-turbo is incredibly puritanical and refuses to answer a good fraction of the type of input bots on IRC get. Even innocuous stuff. I was excited about the system/user prompts and being able to easily add short term memory and contexts but if it only responds with, "Oh, it wouldn't be right for me to talk about that." it's not fun.
You can run Whisper in WASM (locally) so no need to pay for the API, plus the bandwidth. It actually works surprisingly well: https://github.com/ggerganov/whisper.cpp
Does anyone have a sense of how the Shopify integration works under the hood? Specifically, how were they able to have ChatGPT make recommendations from their product database? Given that the model cannot be fine-tuned and there is a 4,000-token context limit, I'm guessing the only thing really possible is to have ChatGPT extract a few keywords, run a query against their internal product search, and return the first result? Is there anything else that might be going on under the hood?
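As a purely hypothetical sketch of that guess: the model would only supply the keywords, and ordinary search does the rest. `search_products` and the tiny catalog below are stand-ins, not anything Shopify actually exposes.

```python
# Hypothetical keyword-extraction-then-search flow. In the real integration,
# the keywords would come from a chat completion with a prompt like:
# "Extract product search keywords from the user's message."
def search_products(catalog, keywords):
    """Rank products by how many extracted keywords appear in their text."""
    scored = []
    for product in catalog:
        text = (product["title"] + " " + product["description"]).lower()
        score = sum(1 for kw in keywords if kw.lower() in text)
        if score:
            scored.append((score, product))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [product for _, product in scored]

catalog = [
    {"title": "Trail Runner 2", "description": "Lightweight running shoe"},
    {"title": "City Loafer", "description": "Leather office shoe"},
]
hits = search_products(catalog, ["lightweight", "running"])
```

The appeal of this split is that the 4K context only ever has to hold the user's message plus a handful of search results, never the whole product database.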
Support for the ChatGPT endpoint is now added to lambdaprompt [1]! (It solves a similar problem as langchain, with almost no boilerplate.) Props to OpenAI for making such a usable endpoint; it was very easy to wrap.
Example code using the new function and endpoint:
import lambdaprompt as lp
convo = lp.AsyncGPT3Chat([{'system': 'You are a {{ type_of_bot }}'}])
await convo("What should we get for lunch?", type_of_bot="pirate")
> As a pirate, I would suggest we have some hearty seafood such as fish and chips or a seafood platter. We could also have some rum to wash it down! Arrr!
(In order to use lambdaprompt, just `pip install lambdaprompt` and export OPENAI_API_KEY=...)
This is incredibly cheap. It makes you wonder how in the world they managed to make it 10x cheaper than davinci-003 and still a better model. The world of robo-consulting is about to go insane.
I've stopped using Google entirely and don't miss it for a second, something I wouldn't have thought possible a year ago, and it's pretty difficult to see how Google will survive this disaster.
A little while back I built a personal project that ingested the book "A Pattern Language" as embeddings using the OpenAI API – and a front-end in SwiftUI that allows me to 'converse' with the book. [1]
Version 0 was built using the original daVinci model. Last night it took about literally five minutes to switch over to the new ChatGPT API – just a few changes to the request, including a new [message] array that replaces the old 'prompt' style. [2]
It's a pretty nice instant upgrade for the experience. Much faster results, and the responses are a lot more interesting. Doing something similar with daVinci would take a fair amount of 'prompt engineering' and fine-tuning. Not to mention all the manual conversation-state handling.
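That retrieval pattern can be sketched roughly like this: stub embedding vectors stand in for calls to the embeddings endpoint, and `top_passages` is a hypothetical helper, not part of any library.

```python
# Minimal retrieval sketch: embed the question, find the closest book
# passages by cosine similarity, and stuff them into the chat prompt as
# context. Real apps would get these vectors from the embeddings endpoint.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_passages(query_vec, passages, k=2):
    """passages: list of (text, embedding) pairs; returns the k closest texts."""
    ranked = sorted(passages, key=lambda p: cosine(query_vec, p[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

passages = [
    ("A courtyard should feel like a room.", [0.9, 0.1]),
    ("Paths should follow natural desire lines.", [0.1, 0.9]),
]
context = top_passages([0.8, 0.2], passages, k=1)
messages = [
    {"role": "system", "content": "Answer using only this context:\n" + "\n".join(context)},
    {"role": "user", "content": "How should a courtyard feel?"},
]
```

Because only the retrieved passages go into the prompt, the book itself never has to fit inside the model's context window.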
[0]: https://www.nvidia.com/en-us/data-center/a100/
[1]: https://azure.microsoft.com/en-in/pricing/details/machine-le...
CrypticShift | 3 years ago
If "turbo" is "gpt-3.5-turbo", how to access the (better?) "legacy" by API?
triyambakam | 3 years ago
Edit: and better yet, is there a good resource for learning the vernacular in general? Should I just read something like "Dive into Deep Learning"?
achow | 3 years ago
100 tokens ≈ 75 words [1]
$0.002 buys 1k tokens ≈ 750 words
1 novel ≈ 70K words [2]
So $1 worth of ChatGPT tokens buys about five novels' worth of words.
[1] https://help.openai.com/en/articles/4936856-what-are-tokens-...
[2] https://self-publishingschool.com/how-many-words-in-a-novel/
osigurdson | 3 years ago
I don't think this competes with fine-tuned models. One advantage of a fine tune is it makes use of your own data.
taytus | 3 years ago
Microsoft.
nonethewiser | 3 years ago
That line is money.
r3trohack3r | 3 years ago
Like establishing a WebRTC connection, streaming audio to OpenAI, and getting back a live transcription until the audio channel closes.
comment_ran | 3 years ago
Given: $0.002 per 1k tokens.
I tested it by asking this question:
#+begin_quote
I want you to act as a travel guide. I will write you my location and you will suggest a place to visit near my location. In some cases, I will also give you the type of places I will visit. You will also suggest me places of similar type that are close to my first location. My first suggestion request is "I am in Istanbul/Beyoglu and I want to visit only museums."
#+end_quote
It costs:
#+begin_quote
{"prompt_tokens":91,"completion_tokens":251,"total_tokens":342}
#+end_quote
(/ 1000 342) ~3, so roughly three questions of this size fit in 1k tokens. If you ask 1000 questions of up to 1k tokens each, it will be like (* 0.002 1000) ~2.0 USD.
It replies:
#+begin_quote
Great, if you're in Istanbul/Beyoglu and interested in visiting museums, here are some recommendations:
Istanbul Modern: This museum showcases contemporary art from Turkish and international artists. It's located in Karaköy, just across the Galata Bridge from Beyoglu.
Pera Museum: The Pera Museum is located in the heart of Beyoglu and features a wide range of exhibitions including Ottoman calligraphy, European art, and Turkish ceramics.
Rahmi M. Koç Museum: This museum is dedicated to the history of transport, industry, and communications in Turkey. It's located in Haskoy, a short taxi or ferry ride from Beyoglu.
Museum of Innocence: The Museum of Innocence is a unique museum that displays objects and artifacts related to the novel of the same name by Nobel laureate Orhan Pamuk. It's located in Çukurcuma, which is just a short walk from Beyoglu.
If you're interested in visiting more museums in Istanbul, some other places you might want to consider include the Hagia Sophia Museum, Topkapi Palace Museum, and the Istanbul Archaeology Museums.
#+end_quote
Can't wait to start hacking my Emacs again...
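As a quick sketch, the usage report quoted above converts to dollars like this:

```python
# Turning the API's usage report into dollars at $0.002 per 1k tokens.
PRICE_PER_1K = 0.002
usage = {"prompt_tokens": 91, "completion_tokens": 251, "total_tokens": 342}

cost = usage["total_tokens"] / 1000 * PRICE_PER_1K  # ~$0.000684 per exchange
questions_per_dollar = 1 / cost                     # ~1,462 exchanges per $1
```

Note that both the prompt and the completion count toward the bill; only the total matters at this flat rate.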
jonplackett | 3 years ago
I am scared for all people working service jobs.
[1] https://github.com/approximatelabs/lambdaprompt
[1] https://twitter.com/dreamwieber/status/1622634396087107584?s...
[2] https://twitter.com/dreamwieber/status/1631327342697250816?s...