compumetrika | 1 year ago

Ollama can pull directly from HF: just provide the repo URL and append :Q8_0 (or whichever quant you want) to the end. Bonus: use the short-form domain `hf.co` instead of `huggingface.co` to shorten the model name a little in the `ollama list` table.

Edit: so, for example, if you want the unsloth "debugged" version of Phi4, you would run:

`$ ollama pull hf.co/unsloth/phi-4-GGUF:Q8_0`

(check on the right side of the hf.co/unsloth/phi-4-GGUF page for the available quants)
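
If you pull several repos or quants this way, a tiny shell helper can assemble the reference for you. This is only a sketch: `hf_ref` is a hypothetical function name, not part of Ollama, and it does nothing more than join the repo and quant tag in the `hf.co/<repo>:<quant>` form shown above.

```shell
# Hypothetical helper (not an Ollama feature): builds the model reference
# from a Hugging Face repo name and a quant tag.
hf_ref() {
  repo="$1"    # e.g. unsloth/phi-4-GGUF
  quant="$2"   # e.g. Q8_0
  printf 'hf.co/%s:%s\n' "$repo" "$quant"
}

# Usage (assumes Ollama is installed and its server is running):
#   ollama pull "$(hf_ref unsloth/phi-4-GGUF Q8_0)"
#   ollama list
```

The quant tag still has to be one the repo actually publishes, so check the repo page first.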

jimmySixDOF|1 year ago

You still need to make sure the Modelfile works, so this method will not run out of the box on a vision GGUF or anything with special schemas. That's why it's mostly a good idea to pull from Ollama directly.

wruza|1 year ago

Is it true that non-GGUF models are basically all Q4-equivalent? I'm never sure which one to download to get the "default score".