Ship_Star_1010 | 1 year ago | on: Run Llama locally with only PyTorch on CPU
PyTorch has a native LLM solution.
It supports all the Llama models, and it runs on CPU, MPS, and CUDA.
https://github.com/pytorch/torchchat
I'm getting 4.5 tokens per second with Llama 3.1 8B at full precision, CPU only, on my M1.
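If I remember the repo's README right, basic usage is roughly the following (commands may have changed, so check the README; the model alias and prompt here are just examples):

    # download model weights (requires a Hugging Face token for Llama)
    python3 torchchat.py download llama3.1

    # interactive chat in the terminal
    python3 torchchat.py chat llama3.1

    # one-off generation from a prompt
    python3 torchchat.py generate llama3.1 --prompt "Write a haiku about CPUs"

It picks a sensible device by default, which is how I ended up on CPU-only on the M1.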