kachapopopow | 17 days ago
Well yeah, that's what I mean: how much better is it versus cat + grep + manual line counting? Agents tend to perform worse with niche tools.

jahala | 15 days ago
It was really helpful to make and run a benchmark - it led to some important changes and improvements, so thanks again for your question kp!
The result is a ~17% reduction in raw cost. Calculated per correct answer, it's a ~25% reduction.
Just posted the update -> https://news.ycombinator.com/item?id=47016959

jahala | 17 days ago
Thank you for this question - I'm building out a benchmark now. Initial results are very promising, will update you once it's done!
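
For anyone wondering how the ~17% and ~25% figures in the benchmark reply relate: cost per correct answer is total cost divided by the number of correct answers, so the gap between the two percentages implies the tool also got more answers right. A rough sketch of that arithmetic follows. It assumes both runs covered the same set of tasks, which the thread does not state, and the function name is made up for illustration. The inputs are the approximate figures from the reply, so the output is only a ballpark.

```python
def implied_correct_ratio(raw_cost_reduction: float,
                          per_correct_reduction: float) -> float:
    """Return n_correct_new / n_correct_old implied by the two reductions.

    Assumption (not stated in the thread): both runs use the same task set.
    With cost per correct answer = cost / n_correct:
        cost_new / n_new = (1 - per_correct_reduction) * cost_old / n_old
        cost_new         = (1 - raw_cost_reduction) * cost_old
    so  n_new / n_old    = (1 - raw_cost_reduction) / (1 - per_correct_reduction)
    """
    cost_ratio = 1.0 - raw_cost_reduction
    per_correct_ratio = 1.0 - per_correct_reduction
    return cost_ratio / per_correct_ratio


# Approximate figures quoted in the reply: ~17% raw, ~25% per correct answer.
ratio = implied_correct_ratio(0.17, 0.25)
print(f"{ratio:.3f}")  # prints 1.107
```

Under that assumption, the quoted numbers would correspond to roughly 10 to 11% more correct answers relative to the baseline, on top of the lower raw cost.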