veber-alex | 4 months ago

The llama.cpp issues are strange.

There are official benchmarks of the Spark running multiple models just fine on llama.cpp

https://github.com/ggml-org/llama.cpp/discussions/16578

CaptainOfCoit|4 months ago

There weren't any instructions on how the author got ollama/llama.cpp. Could it be something NVIDIA shipped with the DGX Spark, and an old version?

moffkalast|4 months ago

Llama.cpp's main branch doesn't run on Orins, so it's actually weird that it does run on the Spark.

RyeCatcher|4 months ago

Cool, I'll have a look. All the reflections I made were first-pass stuff.