dajonker | 1 month ago
Codex is notably higher quality but also has me waiting forever. Hopefully these small models get better and better, not just at benchmarks.
dajonker | 1 month ago
- Tool calling doesn't work properly with OpenCode
- It repeats itself very quickly. This is addressed in the Unsloth guide and can be "fixed" by setting --dry-multiplier to 1.1 or higher
- It makes a lot of spelling errors, such as replacing characters in class/file names with "1". And when I asked it to check AGENTS.md, it tried to open AGANTS.md
I tried both the Q4_K_XL and Q5_K_XL quantizations and they both suffer from these issues.
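For reference, the repetition workaround mentioned above can be applied when serving the GGUF locally with llama.cpp's `llama-server`. This is a sketch, not a definitive recipe: the model filename and port below are placeholders, and the 1.1 value is just the floor the Unsloth guide suggests.

```shell
# Sketch of serving a GGUF with DRY repetition sampling enabled.
# The model path and port are placeholders -- substitute your own.
# --dry-multiplier 0 disables DRY; 1.1 or higher is the suggested
# starting point to suppress the looping described above.
llama-server \
  --model ./models/model-Q4_K_XL.gguf \
  --dry-multiplier 1.1 \
  --port 8080
```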
eblanshey | 1 month ago
> Jan 21 update: llama.cpp fixed a bug that caused looping and poor outputs. We updated the GGUFs - please re-download the model for much better outputs.
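One way to re-fetch the updated files, assuming you originally pulled them from the Hugging Face Hub: the `huggingface-cli download` command re-resolves the latest revision. The repo id and filename below are placeholders, not the actual Unsloth repo.

```shell
# Placeholder repo id and filename -- substitute the actual
# Unsloth GGUF repo and quant you are using.
huggingface-cli download unsloth/SOME-MODEL-GGUF \
  SOME-MODEL-Q4_K_XL.gguf \
  --local-dir ./models
```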
latchkey | 1 month ago
This user has also done a bunch of good quants:
https://huggingface.co/0xSero
WanderPanda | 1 month ago
dajonker | 1 month ago
The flash model in this thread is more than 10x smaller (30B).
behnamoh | 1 month ago
And while it usually leads to higher-quality output, sometimes it doesn't, and I'm left with BS AI slop that would have taken Opus just a couple of minutes to generate anyway.