yobanate | 2 years ago | on: Many options for running Mistral models in your terminal using LLM
Can confirm.
My M3 Max gets about 22 tokens/s, which puts the bottleneck BKAC (between keyboard and chair).
yobanate | 2 years ago | on: MLX: An array framework for Apple Silicon