top | item 35575731

iliane5 | 2 years ago

I bet they’re not saying how big of a model GPT-4 is because it’s actually much smaller than we would expect.

ChatGPT is IMO a heavily fine-tuned Curie-sized model (same price via API + less cognitive capacity than even text-davinci-003), so it would make sense that a heavily fine-tuned Davinci-sized model would yield similar results to GPT-4.
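The "same price as Curie" argument can be sketched with the early-2023 OpenAI list prices (from memory of the public pricing page; treat the exact figures as approximate assumptions, not a statement of OpenAI's costs):

```python
# Early-2023 OpenAI list prices, expressed as integer USD per 1M tokens
# (converted from the published $/1K-token rates). Approximate, from memory.
price_per_m = {
    "text-curie-001": 2,      # $0.002 / 1K tokens
    "text-davinci-003": 20,   # $0.020 / 1K tokens
    "gpt-3.5-turbo": 2,       # $0.002 / 1K tokens (the ChatGPT API model)
}

# The ChatGPT API matches Curie's price exactly and undercuts Davinci 10x,
# which is what fuels the "it's really a Curie-sized model" guess.
ratio = price_per_m["text-davinci-003"] / price_per_m["gpt-3.5-turbo"]
print(f"ChatGPT API is {ratio:.0f}x cheaper than Davinci")  # 10x
```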

KeplerBoy|2 years ago

I wouldn't bet on their pricing being indicative of their costs. If MSFT wants the ChatGPT-API to be a success and is willing to subsidize it, that's just how it is.

iliane5|2 years ago

It’s not only 10x cheaper, it’s also way faster at inference and not as smart as Davinci. IMO the only logical answer is that the model is just smaller.
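The speed argument has a simple physical basis: autoregressive decoding is typically memory-bandwidth bound, since every weight is read once per generated token, so tokens/sec scales roughly inversely with parameter count. A minimal sketch, assuming fp16 weights, a hypothetical ~2 TB/s of HBM bandwidth, and the community-guessed sizes (~6.7B for Curie, 175B for Davinci), none of which are confirmed deployment details:

```python
def decode_tokens_per_sec(params_billion: float,
                          bandwidth_tb_s: float = 2.0,
                          bytes_per_param: int = 2) -> float:
    """Bandwidth-bound upper limit on decode speed: every parameter's
    bytes must be streamed from memory once per generated token."""
    weight_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_tb_s * 1e12 / weight_bytes

# Under these assumptions a ~6.7B "Curie-sized" model decodes roughly
# 26x faster than a 175B "Davinci-sized" one -- consistent with the
# "faster and cheaper, therefore smaller" reasoning above.
speedup = decode_tokens_per_sec(6.7) / decode_tokens_per_sec(175)
print(f"{speedup:.1f}x")  # roughly 26x
```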

qumpis|2 years ago

I wonder why GPT-4 is slower at inference time then (for subscribers using the web UI). Or rather, if it's similar in size to GPT-3, how is GPT-3 optimized in a way that GPT-4 isn't or can't be?

I'd expect that by now we would enjoy similar speeds but this hasn't yet happened.

MacsHeadroom|2 years ago

GPT-4 is the same speed as legacy GPT-3 ChatGPT for me. It's only occasionally slower, which I expect is due to load and not it being larger.