josu|1 month ago
I don't think you've seen the full potential. I'm currently #1 on 5 different very complex computer engineering problems, and I can't even write a "hello world" in Rust or C++. You no longer need to know how to write code; you just need to understand the task at a high level and nudge the agents in the right direction. The game has changed.
- https://highload.fun/tasks/3/leaderboard
- https://highload.fun/tasks/12/leaderboard
- https://highload.fun/tasks/15/leaderboard
johndough|1 month ago
josu|1 month ago
But I don't want to spoil the fun. The agents are really good at searching the web now, so posting the tricks here is basically breaking the challenge.
For example, ChatGPT was able to find Matt's blog post regarding Task 1, and that's what gave me the largest jump: https://blog.mattstuchlik.com/2024/07/12/summing-integers-fa...
Interestingly, it seems that Matt's post is not in the training data of any of the major LLMs.
zarzavat|1 month ago
Yes, LLMs are very good at writing code. They are so good at it that they often generate reams of unmaintainable spaghetti.
When you submit to an informatics contest you don't have paying customers who depend on your code working every day. You can just throw away yesterday's code and start afresh.
Claude is very useful but it's not yet anywhere near as good as a human software developer. Like an excitable puppy it needs to be kept on a short leash.
josu|1 month ago
I used highload as an example because it seems like an objective rebuttal to the claim that "it can't tackle those complex problems by itself."
And regarding this:
"Claude is very useful but it's not yet anywhere near as good as a human software developer. Like an excitable puppy it needs to be kept on a short leash"
Again, a combination of LLMs/agents with some guidance (from someone with no prior experience in this type of high-performance architecture) was able to beat all human software developers who have taken these challenges.
VMG|1 month ago
The skill of "a human software developer" is in fact a very wide distribution, and your statement is true for an ever-shrinking tail end of that distribution.
FeepingCreature|1 month ago
The ultimate test of all software is "run it and see if it's useful for you." You do not need to be a programmer at all to be qualified to test this.
throw1235435|1 month ago
I'm not sure yet what this means for the future of SWEs, though. I don't see higher-level staff in large businesses bothering to do this, and at some scale I don't see founders still wanting to manage all of these agents and processes (they have better things to do at higher levels). But I do see the barrier of learning to code gone, meaning it probably becomes just like any other job.
ModernMech|1 month ago
josu|1 month ago
dajoh|1 month ago
Ah yes, well-known, very complex computer engineering problems such as:
* Parsing JSON objects, summing a single field
* Matrix multiplication
* Parsing and evaluating basic integer arithmetic expressions
And you're telling me all you needed to do to get the best solution in the world to these problems was talk to an LLM?
josu|1 month ago
If you think you can beat an LLM, the leaderboard is right there.