jzapletal | 5 days ago

I ran this comparison while deciding which model to use in our own tool. The task was to analyze around 12,000 screenshots and find recurring manual workflows worth automating.

Results:

- Claude Sonnet 4.6: 8/10, $0.53/run — wins on quality

- Kimi K2.5: 7/10, $0.09/run — roughly 6x cheaper than Sonnet, now my production pick

- GPT-5.2: 6/10, $0.41/run — oddly, missed the most obvious patterns

- DeepSeek V3.2: 0/10 — gave me a garbled XML...

Models that flagged a one-time DKIM setup as "recurring automation candidate" got penalized.
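
For anyone who wants to check the cost arithmetic, here is a minimal sketch using only the scores and per-run costs listed above. The function names and the "points per dollar" metric are my own illustration, not part of the original evaluation, and DeepSeek V3.2 is left out because no cost was reported for it.

```python
# Scores (out of 10) and cost per run in USD, copied from the results above.
# DeepSeek V3.2 is omitted: the post gives no per-run cost for it.
RESULTS = {
    "Claude Sonnet 4.6": (8, 0.53),
    "Kimi K2.5": (7, 0.09),
    "GPT-5.2": (6, 0.41),
}


def cost_ratio(results, expensive, cheap):
    """How many times more one model costs per run than another."""
    return results[expensive][1] / results[cheap][1]


def points_per_dollar(results):
    """Hypothetical value metric: quality score divided by cost per run."""
    return {name: score / cost for name, (score, cost) in results.items()}


def rank_by_value(results):
    """Model names sorted from best to worst points per dollar."""
    value = points_per_dollar(results)
    return sorted(value, key=value.get, reverse=True)


if __name__ == "__main__":
    ratio = cost_ratio(RESULTS, "Claude Sonnet 4.6", "Kimi K2.5")
    print(f"Sonnet costs {ratio:.1f}x as much per run as Kimi")
    for name in rank_by_value(RESULTS):
        print(f"{name}: {points_per_dollar(RESULTS)[name]:.1f} points per dollar")
```

The ratio comes out at about 5.9, which is where the "roughly 6x" figure comes from. Points per dollar is a crude metric, since a one-point quality gap may matter more than the price difference depending on the use case.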

Happy to share more if folks find this interesting.
