jumploops | 10 hours ago

I've been experimenting with a few ways to keep the "historical context" of the codebase relevant to future agent sessions.

First, I tried using simple inline comments, but the agents happily (and silently) removed them, even when prompted not to.

The next attempt was to have a parallel markdown file for every code file. This worked OK, but suffered from a few issues:

1. Understanding context beyond the current session

2. Tracking related files/invocations

3. The cold start problem on an existing codebase
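For the "parallel markdown file" approach, a minimal sketch of one possible naming convention (the post doesn't specify the actual scheme, so `.md` suffixing is an assumption):

```python
from pathlib import Path

def doc_path_for(source: Path, doc_suffix: str = ".md") -> Path:
    """Map a code file to its parallel markdown doc.

    Hypothetical convention: append `.md` to the full filename,
    e.g. src/parser.rs -> src/parser.rs.md, so the pairing survives
    files that share a stem (parser.rs vs parser.ts).
    """
    return source.with_name(source.name + doc_suffix)
```

Keeping the doc next to the code (rather than in a separate `docs/` tree) means a rename or move is more likely to carry the doc along with it.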

To solve 1 and 3, I built a simple "doc agent" that does a poor man's tree traversal of the codebase, noting any unknowns/TODOs, and running until "done."
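The traversal itself can be very simple. A sketch of the "run until done" loop, with the LLM call stubbed out as a comment (the file-extension filter and TODO format are hypothetical, not from the post):

```python
from pathlib import Path

def crawl(root: Path, visited: set[Path], todos: list[str]) -> None:
    """Poor man's depth-first walk of a codebase.

    Each unvisited source file gets queued for documentation; anything
    the (stubbed-out) doc agent can't explain would land in `todos`
    for a later pass, and the loop ends when nothing new is found.
    """
    for path in sorted(root.iterdir()):
        if path.name.startswith("."):
            continue  # skip .git, .venv, etc.
        if path.is_dir():
            crawl(path, visited, todos)
        elif path.suffix in {".rs", ".py", ".ts"} and path not in visited:
            visited.add(path)
            # In the real agent, an LLM call would summarize `path` here
            # and write the parallel markdown file; unknowns become TODOs.
            todos.append(f"TODO: document {path}")
```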

To solve 2, I explored using the AST directly, but this made the human aspect of the codebase even less pronounced (not to mention introducing a variety of complex edge cases), and I found the "doc agent" approach good enough for outlining related files/uses.
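For reference, the AST idea in its simplest form: walk a file's syntax tree and pull out its imports as a rough proxy for "related files." This is a Python sketch of the concept (the post's repo is Rust, where the equivalent edge cases around re-exports, macros, and dynamic dispatch are exactly what makes the approach hairy):

```python
import ast

def imported_modules(source: str) -> set[str]:
    """Collect the module names a Python file imports."""
    mods: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            mods.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            mods.add(node.module)
    return mods
```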

To improve the "doc agent" cold start flow, I also added a folder-level spec/markdown file, which in retrospect seems obvious.
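One way such a folder-level spec could be assembled is by rolling up the first line of each per-file doc; the `SPEC.md`-style output and summary format here are assumptions, not the post's actual layout:

```python
from pathlib import Path

def build_folder_spec(folder: Path) -> str:
    """Aggregate one-line summaries of per-file docs into a folder spec body."""
    lines = [f"# {folder.name}", ""]
    for doc in sorted(folder.glob("*.md")):
        summary = doc.read_text().strip().splitlines()
        first = summary[0] if summary else "(empty)"
        lines.append(f"- `{doc.name}`: {first}")
    return "\n".join(lines)
```

Giving the agent this rollup first means a cold start can orient at the folder level before opening any individual file.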

The main benefit of this system is that when the agent is working, it not only has to change the source code, it also has to reckon with the explanation/rationale behind said source code. I haven't done any rigorous testing, but in my anecdotal experience, the models make fewer mistakes and cause fewer regressions overall.

I'm currently toying around with a more formal way to mark something as a human decision vs. an agent decision (i.e. "this is very important" vs. "this was just the path of least resistance"); that said, the current approach seems to work well enough.
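A sketch of what a more formal marker might look like: tagged lines in the doc files that a tool (or the agent's prompt) can filter on. The `[HUMAN DECISION]` / `[AGENT DECISION]` syntax is entirely hypothetical, since the post hasn't settled on a format:

```python
import re

# Hypothetical marker syntax, e.g.:
#   [HUMAN DECISION] keep the sync API; downstream callers depend on it
#   [AGENT DECISION] used a regex here as the path of least resistance
DECISION = re.compile(r"\[(HUMAN|AGENT) DECISION\]\s*(.+)")

def extract_decisions(doc_text: str) -> list[tuple[str, str]]:
    """Pull (author, rationale) pairs out of a markdown doc file."""
    return [(m.group(1), m.group(2).strip())
            for line in doc_text.splitlines()
            if (m := DECISION.search(line))]
```

The point of the split is priority: human decisions are load-bearing and should survive refactors, while agent decisions are fair game to revisit.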

If anyone is curious what this looks like, I ran the cold start on OpenAI's Codex repo[0].

[0]https://github.com/jumploops/codex/blob/file-specs/codex-rs/...
