top | item 35151567

tottenval | 3 years ago

OpenAI hasn't published any information about the size or hardware requirements for running ChatGPT. Reading between the lines, the default ChatGPT Turbo model seems to be significantly smaller than GPT-3 (it appears to be a distilled model), but probably still heavier than the Alpaca and LLaMA 7B models people are running (very slowly) on their single-GPU machines this week. You'd probably need multiple A100s to get performance comparable to the ChatGPT API.

noduerme | 3 years ago

Does the LLaMA code that dropped leverage the GPU at all? On an M1 it appears to just run on as many CPU cores as you want to throw at it. The 65B model heats up 8 cores real nicely, and it's slow, but I imagine it would be a lot faster on the GPU.

Tostino | 3 years ago

I've seen people say that limiting it to 4 cores out of the 8 total can actually improve performance. Have you seen that?
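For context, the CPU LLaMA ports discussed here take a thread-count flag on the command line. A minimal sketch of the kind of invocation being tuned, with the binary name and model path as assumptions (a llama.cpp-style `./main` with a quantized 65B model):

```shell
# Sketch of a llama.cpp-style run (binary name and model path are
# assumptions, not from the thread). -t sets the worker-thread count.
# On an 8-core M1 (4 performance + 4 efficiency cores), -t 4 keeps the
# work on the performance cores, which matches the tuning described above.
THREADS=4
CMD="./main -m ./models/65B/ggml-model-q4_0.bin -t ${THREADS} -p 'Hello'"
echo "${CMD}"
```

The intuition for why fewer threads can help: inference here is largely memory-bandwidth bound, and scheduling threads onto the M1's slower efficiency cores makes the fast cores wait at synchronization points, so 4 well-placed threads can beat 8.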

brianjking | 3 years ago

All of the LLaMA implementations for Apple hardware are CPU-only, AFAIK.