beAroundHere | 6 days ago

After the huge model releases from GLM and Z.ai, thanks to the Qwen team we now have models that can be run on low-end devices.

Especially Qwen3.5-35B-A3 looks great for cheaper GPUs, since a quantized version of it would need less than 32 GB of RAM.
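A rough back-of-envelope check (my own sketch; assumes ~4-bit quantization of all 35B weights plus a few GB of runtime/KV-cache overhead, not official numbers):

    # Rough estimate of RAM for a ~35B-parameter model at 4-bit quantization.
    # All figures below are assumptions for illustration, not official requirements.
    total_params = 35e9        # "35B" total parameters (only ~3B active per token)
    bytes_per_param = 0.5      # ~4-bit quant (Q4-style)
    overhead_gb = 4            # assumed KV cache / activations / runtime overhead

    weights_gb = total_params * bytes_per_param / 1e9
    print(f"weights ~{weights_gb:.1f} GB, total ~{weights_gb + overhead_gb:.1f} GB")
    # -> weights ~17.5 GB, total ~21.5 GB, comfortably under 32 GB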

ColonelPhantom | 6 days ago

It's not just Qwen; we also recently had GLM-4.7-Flash in roughly the same 30B-A3 range. Seems to me like there's no shortage of competition for good old GPT-OSS 20B (not just Qwen3.5-35B and GLM-4.7-Flash, but also Qwen3(-Coder)-30B or Granite 4 Small).