PenguinRevolver | 2 years ago

I feel as if the cheapest way of running these kinds of models would be to have the whole cache/memory live on the hard drive rather than in RAM. Then you could just use CPU power instead of splurging thousands on RAM & a GPU with enough VRAM.
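
What I'm describing is basically memory-mapping the weights, which is roughly what mmap-based loaders already do. Here's a minimal sketch of the idea in Python, assuming numpy; the file name and matrix sizes are made up for illustration:

    # Keep the weight matrix on disk and let the OS page it in on
    # demand, so only the working set ever occupies RAM.
    import numpy as np

    rows, cols = 4096, 4096

    # One-time setup: write some fake float32 "weights" to disk.
    np.random.rand(rows, cols).astype(np.float32).tofile("weights.bin")

    # Map the file instead of loading it; this uses ~no RAM up front.
    W = np.memmap("weights.bin", dtype=np.float32, mode="r",
                  shape=(rows, cols))

    x = np.random.rand(cols).astype(np.float32)

    # The matmul streams pages from disk as they're touched. On an HDD
    # this is orders of magnitude slower than RAM, which is the trade-off.
    y = W @ x
    print(y.shape)  # (4096,)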

The speeds might or might not be reasonable, but I'd argue it could avoid "sunk cost irony": deciding, at any point, that ChatGPT would have sufficed for your task. It's rare, but it can happen.

If you want to take this silly logic further, you can theoretically run a model of any size on any computer. You could even attempt this dumb idea on a computer running Windows 95. I don't care how long it would take; if it takes seven and a half million years for 42 tokens, I would still call it a success!

pocketarc | 2 years ago

You are right about that being the cheapest, of course, in the sense that 64GB of HDD space is always going to be cheaper than RAM. But when you say

> thousands for RAM

I wonder if your perspective might be a little off: you can get 64GB of DDR4 RAM for ~$100; it's really not a big deal these days.

It's a big deal on Mac, of course, where 64GB means a big, kitted-out high-end model that costs thousands, but elsewhere RAM really is that cheap.

PenguinRevolver | 2 years ago

Understandable; the reason I said "thousands for RAM" is that when I wrote that sentence, I was adding the theoretical RAM and GPU prices together. Oh well.