a_e_k | 1 month ago When the Unsloth quant of the flash model does appear, it should show up as unsloth/... on this page: https://huggingface.co/models?other=base_model:quantized:zai... Probably as: https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF
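If you want to watch for that repo appearing rather than refreshing the page, a minimal stdlib sketch (the `unsloth/GLM-4.7-Flash-GGUF` id is the guess from the comment above; whether or when it exists is not assumed here):

```python
# Check whether a Hugging Face repo page is live yet (network required).
# Repo id is the guess from the thread; treat it as a placeholder.
import urllib.error
import urllib.request


def hub_repo_url(repo_id: str) -> str:
    """Build the huggingface.co page URL for a repo id like 'owner/name'."""
    return f"https://huggingface.co/{repo_id}"


def repo_page_exists(repo_id: str, timeout: float = 10.0) -> bool:
    """True if the repo page answers a HEAD request with HTTP 200."""
    req = urllib.request.Request(hub_repo_url(repo_id), method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.HTTPError, urllib.error.URLError):
        return False


if __name__ == "__main__":
    print(repo_page_exists("unsloth/GLM-4.7-Flash-GGUF"))
```

The `huggingface_hub` package offers richer queries (e.g. listing a user's models), but a plain HEAD request avoids any dependency for a simple existence check.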
homarp|1 month ago It's a new architecture, not yet implemented in llama.cpp. Issue to follow: https://github.com/ggml-org/llama.cpp/issues/18931
dumbmrblah|1 month ago One thing to consider is that this version is a new architecture, so it'll take time for llama.cpp to get updated, similar to how it was with Qwen Next.
cristoperb|1 month ago Apparently it is the same as the DeepseekV3 architecture and already supported by llama.cpp once the new name is added. Here's the PR: https://github.com/ggml-org/llama.cpp/pull/18936