junrushao1994 | 2 years ago

Yeah, we tried out popular solutions like exllama and llama.cpp, among others, that support inference of 4-bit quantized models.