
ikt | 7 months ago

Over time, you would assume the models will get more efficient and the hardware will get better, to the point that buying a massive new GPU with boatloads of VRAM is just not necessary.

Maybe 128GB of VRAM becomes the new mid-tier, and most LLMs can fit into this nicely and do everything one wants in an LLM (see the sizing sketch below).

Given how fast LLMs are progressing, it wouldn't surprise me if we reach this point by 2030.

Discuss.
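For a rough sense of what "128GB of VRAM" would actually buy, here is a back-of-envelope sketch in Python. The parameter counts, quantization widths, and the 70B-class KV-cache configuration are illustrative assumptions (not anything stated in the thread), and activation/framework overhead is ignored:

```python
# Back-of-envelope check of the "most LLMs fit in 128GB" idea.
# Assumptions: weights dominate memory, plus a rough KV-cache term for a
# llama-style dense model; activations and runtime overhead are ignored.

GB = 1024**3

def model_vram_gb(params_b, bytes_per_weight):
    """Weight memory for a model with params_b billion parameters."""
    return params_b * 1e9 * bytes_per_weight / GB

def kv_cache_gb(layers, kv_heads, head_dim, context, bytes_per_elem=2):
    """KV cache: 2 tensors (K and V) * layers * kv_heads * head_dim * context."""
    return 2 * layers * kv_heads * head_dim * context * bytes_per_elem / GB

budget = 128  # the hypothetical mid-tier VRAM from the comment above

for name, params_b in [("8B", 8), ("70B", 70), ("405B", 405)]:
    for label, bpw in [("fp16", 2.0), ("q8", 1.0), ("q4", 0.5)]:
        need = model_vram_gb(params_b, bpw)
        verdict = "fits" if need < budget else "does not fit"
        print(f"{name} @ {label}: {need:6.1f} GB  {verdict} in {budget} GB")

# KV-cache cost at 128k context for an illustrative 70B-class config
# (80 layers, 8 KV heads, head_dim 128 -- assumed numbers):
print(f"KV cache @ 128k ctx: {kv_cache_gb(80, 8, 128, 128_000):.1f} GB")
```

Under these assumptions, a 70B model at 4-bit (~33GB) fits comfortably with a large context, while a 405B model does not fit at any common quantization, which is roughly what "most LLMs fit nicely" would mean in practice.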


oxcidized | 7 months ago

Considering there were two generations (around 4.5 years) of top-tier consumer GPUs (3090/4090) stuck at a 24GB VRAM max, and the current one (5090) "only" bumped it to 32GB, I think you'll be waiting more than 5 years before 128GB of VRAM comes to a mid-tier GPU. 12-16GB is currently mid-tier and has been since LLMs became "a thing".

I hope I'm wrong though, and we see a large bump soon. Even just 32GB in the mid-tier would be huge.

I'm really tempted to try out a Mac Studio with 256+ GB of unified memory (~192 GB usable as VRAM), but it is sadly out of my budget at the moment. I know there is a bandwidth penalty compared to a discrete GPU, but being able to run huge models with huge contexts locally would be quite nice.
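On the bandwidth point, a rough upper-bound sketch: single-stream decode is usually memory-bandwidth-bound, so tokens/s is capped by bandwidth divided by bytes read per token. The bandwidth figures below are approximate public specs and the 70B/4-bit workload is an illustrative assumption, not something from the thread:

```python
# Rough feel for the "bandwidth penalty": a dense model reads every weight
# once per generated token (batch=1, no speculative decoding, all other
# overheads ignored), so bandwidth / model_bytes bounds decode speed.

def decode_tps_upper_bound(bandwidth_gb_s, params_b, bytes_per_weight):
    """Upper bound on decode tokens/s for a dense model, single stream."""
    model_bytes = params_b * 1e9 * bytes_per_weight
    return bandwidth_gb_s * 1e9 / model_bytes

# Approximate public bandwidth specs (illustrative; verify before relying on them):
systems = {
    "Mac Studio (M-series Ultra, ~800 GB/s)": 800,
    "RTX 5090 (~1800 GB/s)": 1800,
}

for name, bw in systems.items():
    # Hypothetical workload: a 70B dense model at ~4-bit quantization.
    tps = decode_tps_upper_bound(bw, 70, 0.5)
    print(f"{name}: ~{tps:.0f} tokens/s ceiling for a 70B q4 model")
```

The flip side of those ceilings is capacity: the 5090's 32GB cannot even hold that 70B q4 model (~35GB of weights before any KV cache) without offloading, while the Mac's unified memory holds far larger models at its lower bandwidth.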