turmeric_root's comments

turmeric_root | 2 years ago | on: MiniGPT-4

Windows reserves a certain percentage of VRAM for some reason. So I'd recommend Linux, or find a way to disable the desktop/UI in Windows.

turmeric_root | 2 years ago | on: The Coming of Local LLMs

The model weights were only shared by FB with people who applied for research access. GitHub repos containing links to the model weights have been taken down by FB.

turmeric_root | 2 years ago | on: The Coming of Local LLMs

More VRAM => larger models. IME it is absolutely worth maxing out VRAM for the significant improvement in quality, especially with LLaMA (though even with a 4090, you won't be able to run the largest 65-billion-parameter model, even with 4-bit quantization).

That said, I recommend renting a cloud GPU for a few hours and trying the larger models on them before buying a GPU of your own, just to see if the models meet your requirements.
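The VRAM claim above is easy to sanity-check with back-of-the-envelope arithmetic. A minimal sketch (weight sizes only; the actual footprint is higher once activations and the KV cache are included):

```python
# Rough VRAM estimate for LLaMA-style models at different quantization
# levels. These are weight sizes only -- real usage adds several GB for
# activations and the KV cache.

def weights_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate size of the model weights in GB."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

for params in (13, 30, 65):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit: ~{weights_gb(params, bits):.1f} GB")
```

Even at 4-bit, the 65B model's weights alone come to ~32.5 GB, which is why they don't fit in a 24 GB RTX 4090.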

turmeric_root | 2 years ago | on: Unpredictable black boxes are terrible interfaces

A lot of the 'look what I made with AI' images that get shared around also don't include the creator's workflow. There's usually lots of trial-and-error, manual painting/inpainting, multiple models involved etc. and explaining all that is a lot harder than just saying 'I used stable diffusion'.

turmeric_root | 2 years ago | on: Revert for jart’s llama.cpp MMAP miracles

ugh, that's so shitty. so many people in this space seem absurdly demanding and angry at devs, and every text-AI project Discord I've hung out in has this sleazy, obsessive 4chan /g/ vibe hiding somewhere in it.

turmeric_root | 2 years ago | on: Llama.cpp 30B runs with only 6GB of RAM now

> the "number B" stands for "number of billions" of parameters... trained on?

No, it's just the size of the network (i.e. number of learnable parameters). The 13/30/65B models were each trained on ~1.4 trillion tokens of training data (each token is around half a word).
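To make the parameters-vs-tokens distinction concrete, here's a quick estimate using the figures above (the "half a word per token" conversion is the rough rule of thumb from the comment, not an exact tokenizer property):

```python
# Parameters are the network's size; tokens are the amount of training
# data it saw. These are independent quantities.

TOKENS_TRAINED = 1.4e12  # ~1.4 trillion tokens for the larger LLaMA models
PARAMS_65B = 65e9        # learnable parameters in the largest model

# Rough word count of the training corpus (1 token ~ half a word):
print(f"~{TOKENS_TRAINED * 0.5 / 1e12:.1f} trillion words of training text")

# Tokens seen per parameter for the 65B model:
print(f"~{TOKENS_TRAINED / PARAMS_65B:.0f} tokens per parameter")
```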
