phamilton | 14 days ago

As an experiment, I set it up with a z.ai $3/month subscription and told it to do a tedious technical task. I said to stay busy and that I expect no more than 30 minutes of inactivity, ever.

The task is to decompile Wave Race 64, integrate it with libultraship, and eventually produce a runnable native port of the game (the same approach as Ship of Harkinian, the Zelda OoT port).

It set up a timer every 30 minutes to check in on itself and see if it gave up. It reviews progress every 4 hours and revisits prioritization. I hadn't checked on it in days and when I looked today it was still going, a few functions at a time.

It set up those timers itself and creates new ones as needed.
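
For illustration, the two timers described above could be modeled roughly like this. This is only a sketch, not what the agent actually wrote: the names `AgentState`, `due_actions`, `resume_work`, and `review_priorities` are hypothetical, and the only values taken from the comment are the 30-minute and 4-hour intervals.

```python
from dataclasses import dataclass

IDLE_LIMIT_S = 30 * 60        # "no more than 30 minutes of inactivity"
REVIEW_EVERY_S = 4 * 60 * 60  # "reviews progress every 4 hours"


@dataclass
class AgentState:
    # Hypothetical bookkeeping; the real agent's state is not described.
    last_activity_s: float  # time of the agent's last recorded action
    last_review_s: float    # time of the last prioritization review


def due_actions(state: AgentState, now_s: float) -> list[str]:
    """Return the check-ins that are due at time now_s (in seconds)."""
    actions = []
    if now_s - state.last_activity_s >= IDLE_LIMIT_S:
        actions.append("resume_work")        # the agent stalled or gave up
    if now_s - state.last_review_s >= REVIEW_EVERY_S:
        actions.append("review_priorities")  # periodic re-planning
    return actions
```

A scheduler (cron or similar) would call `due_actions` periodically and feed the result back to the agent as a prompt.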

It's not any one particular thing that is novel, but it's just more independent because of all the little bits.

hobofan|14 days ago

So, you don't know if it has produced anything valuable yet?

beaker52|13 days ago

It's the same story with these people running 12 parallel agents that automatically implement issues managed in Linear by an AI product team that has conducted automated market and user research.

Instead of making things, people are making things that appear busy making things. And as you point out, "but to what end?" is a really important question, often unanswered.

"It's the future, you're going to be left behind", is a common cry. The trouble is, I'm not sure I've seen anything compelling come back from that direction yet, so I'm not sure I've really been left behind at all. I'm quite happy standing where I am.

And the moment I do see something compelling come from that direction, I'll be sure to catch up, using the energy I haven't spent beating down the brush. In the meantime, I'll keep an eye on the other directions too.

decidu0us9034|13 days ago

Yeah, I'm not sure I understand what the goal here is. Ship of Harkinian is a rewrite, not just a decompilation. As a human reverse engineer I've gotten a lot of false positives. This seems like one of those areas where hallucinations could be really insidious and hard to identify, especially for a non-expert. I've found MCP to be helpful with a lot of drudgery, but I think you would have to review the LLM output, do extensive debugging/dynamic analysis, and triage all potential false positives before attempting to embark on a rewrite based on decompiled assembly... I think OoT took a team of experts collectively thousands of person-hours to fully document; it seems a bit too hopeful to want that and a rewrite just from being pushy to an agent...

phamilton|13 days ago

Not yet. But what's the actual goal here? It's not to have a native Wave Race 64. It's to improve my intuition around what sort of tasks can be worked on 24/7 without supervision.

I have a hypothesis that I can verify the result against the original ROM. With that as the goal, I believe the agent can continue to grind on the problem until it passes that verification. I've seen it do that in other areas, but this is something larger and more tedious and I wanted to see how far it could go.
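
The comment doesn't say how the verification works. One plausible reading, and it is only an assumption, is a byte-for-byte comparison between the original ROM and a ROM rebuilt from the decompiled source. A minimal sketch of such a check, with hypothetical function names, might look like this:

```python
import hashlib
from typing import Optional


def sha1_hex(data: bytes) -> str:
    """SHA-1 digest of a ROM image as a hex string."""
    return hashlib.sha1(data).hexdigest()


def roms_match(original: bytes, rebuilt: bytes) -> bool:
    """True if the rebuilt image hashes identically to the original."""
    return sha1_hex(original) == sha1_hex(rebuilt)


def first_mismatch(original: bytes, rebuilt: bytes) -> Optional[int]:
    """Offset of the first differing byte, or None if the images are identical."""
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            return offset
    if len(original) != len(rebuilt):
        # One image is a strict prefix of the other.
        return min(len(original), len(rebuilt))
    return None
```

In practice the two byte strings would be read from files with `open(path, "rb").read()`. Note that a check like this can only confirm the decompilation stage; it says nothing about whether a native port behaves correctly.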

Aperocky|13 days ago

That sounds like being a manager IRL.

hirako2000|13 days ago

$3 z.ai subscription? Sounds like it already burned $3k

I find those toys in perfect alignment with what LLM providers strive for: a widespread explosion in token consumption to show investors, "See, we told you we were right to invest, let's open more gigafactories."

phamilton|13 days ago

It's using about 100M input tokens a day on glm 4.7 (glm 5 isn't available on my plan). It's sticking pretty close to the throttling limits that reset every 5 hours.

100M input tokens is $40 and anywhere from 2-6 kWh.

Certainly excessive for my $3/month.
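
A back-of-the-envelope sketch of what these figures imply, using only the numbers quoted above. The 30-day month and the assumption that usage stays at this rate are mine, not the commenter's.

```python
TOKENS_PER_DAY = 100_000_000       # "about 100M input tokens a day"
COST_PER_DAY_USD = 40.0            # "100M input tokens is $40"
ENERGY_PER_DAY_KWH = (2.0, 6.0)    # "anywhere from 2-6 kWh"
SUBSCRIPTION_USD_PER_MONTH = 3.0   # "$3/month"
DAYS_PER_MONTH = 30                # assumption: sustained use for 30 days

millions_per_day = TOKENS_PER_DAY / 1_000_000

# Implied price and energy per million input tokens.
usd_per_million = COST_PER_DAY_USD / millions_per_day
wh_per_million = tuple(kwh * 1000 / millions_per_day for kwh in ENERGY_PER_DAY_KWH)

# Implied monthly cost versus the subscription price.
usd_per_month = COST_PER_DAY_USD * DAYS_PER_MONTH
cost_ratio = usd_per_month / SUBSCRIPTION_USD_PER_MONTH

print(f"${usd_per_million:.2f} per 1M input tokens")
print(f"{wh_per_million[0]:.0f}-{wh_per_million[1]:.0f} Wh per 1M input tokens")
print(f"${usd_per_month:.0f}/month, about {cost_ratio:.0f}x the subscription price")
```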

OJFord|13 days ago

How's it burned $3k on a $3/month subscription running for a few days?

thinkingtoilet|13 days ago

What a great use of humanity's and the earth's resources.

someperson|14 days ago

Keep us posted, this sounds great!