top | item 43826563

(no title)

tandr | 10 months ago

The larger model (235B) in the chat interface produced a rather impressive answer to a small coding task I gave it. But Qwen-30B-A3B gave a worse result on the same task than Qwen 2.5 does.

The task was: "Write a Golang program that merges huge presorted text files, just like sort -m does". Quite often models need "use heap" as guidance, but this time the big model figured it out by itself.


sirnonw|10 months ago

[deleted]