top | item 47073784 (no title)

terhechte | 10 days ago
Did you try an MLX version of this model? In theory it should run a bit faster. I'm hesitant to download multiple versions, though.
tarruda | 10 days ago
Haven't tried. I'm too used to llama.cpp at this point to switch to something else. I like being able to just run a model and automatically get:
- OpenAI completions endpoint
- Anthropic messages endpoint
- OpenAI responses endpoint
- A slick-looking web UI
without having to install anything else.
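For reference, llama.cpp's bundled `llama-server` speaks the OpenAI chat completions protocol, so any OpenAI-style client can talk to it. A minimal sketch, assuming a server running on `localhost:8080` (the default port) and using only the standard library; the network call is left commented out since it needs a live server:

```python
import json

# Hypothetical local address; llama-server exposes an
# OpenAI-compatible endpoint at /v1/chat/completions.
BASE_URL = "http://localhost:8080"

# Build an OpenAI-style chat completion request body.
payload = {
    "model": "local-model",  # llama-server serves whatever model it was started with
    "messages": [
        {"role": "user", "content": "Hello!"},
    ],
    "max_tokens": 64,
}
body = json.dumps(payload)
print(body)

# To actually send it (requires a running server):
# import urllib.request
# req = urllib.request.Request(
#     BASE_URL + "/v1/chat/completions",
#     data=body.encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the endpoint follows the OpenAI wire format, official OpenAI client libraries also work by pointing their base URL at the local server.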
KerrAvon | 10 days ago
Is there a reliable way to run MLX models? On my M1 Max, LM Studio sometimes outputs garbage through its API server even when the LM Studio chat with the same model is perfectly fine. llama.cpp variants generally just work.