Show HN: Detail, a Bug Finder
67 points| drob | 2 months ago |detail.dev
Long story below.
--------------------------
We originally set out to work on technical debt. We had all seen codebases with a lot of debt, so we had personal grudges about the problem, and AI seemed to be making it a lot worse.
Tech debt also seemed like a great problem for AI because: 1) a small portion of the work is thinky and strategic, and then the bulk of the execution is pretty mechanical, and 2) when you're solving technical debt, you're usually trying to preserve existing behavior, just change the implementation. That means you can treat it as a closed-loop problem if you figure out good ways to detect unintended behavior changes due to a code change. And we know how to do that – that's what tests are for!
So we started with writing tests. Tests create the guardrails that make future code changes safer. Our thinking was: if we can test well enough, we can automate a lot of other tech debt work at very high quality.
We built an agent that could write thousands of new tests for a typical codebase, most "merge-quality". Some early users merged hundreds of PRs generated this way, but intuitively the tool always felt "good but not great". We used it sporadically ourselves, and it usually felt like a chore.
Around this point we realized: while we had set out to write good tests, we had built a system that, with a few tweaks, might be very good at finding bugs. When we tested it out on some friends' codebases, we discovered that almost every repo has tons of bugs lurking in it that we were able to flag. Serious bugs, interesting enough that people dropped what they were doing to fix them. Sitting right there in peoples codebases, already merged, running in prod.
We also found a lot of vulns, even in mature codebases, and sometimes even right after someone had gotten a pentest.
Under the hood: - We check out a codebase and figure out how to build it for local dev and exercise it with tests. - We take snapshots of the built local dev state. (We use Runloop for this and are big fans.) - We spin up hundreds of copies of the local dev environment to exercise the codebase in thousands of ways and flag behaviors that seem wrong. - We pick the most salient, scary examples and deliver them as linear tickets, github issues, or emails.
In practice, it's working pretty well. We've been able to find bugs in everything from compilers to trading platforms (even in rust code), but the sweet spot is app backends.
Our approach trades compute for quality. Our codebase scans take hours, far beyond what would be practical for a code review bot. But the result is that we can make more judicious use of engineers’ attention, and we think that’s going to be the most important variable.
Longer term, we think compute is cheap, engineer attention is expensive. Wielded properly, the newest models can execute complicated changes, even in large codebases. That means the limiting reagent in building software is human attention. It still takes time and focus for an engineer to ingest information, e.g. existing code, organizational context, and product requirements. These are all necessary before an engineer can articulate what they want in precise terms and do a competent job reviewing the resulting diff.
For now we're finding bugs, but the techniques we're developing extend to a lot of other background, semi-proactive work to improve codebases.
Try it out and tell us what you think. Free first scan, no credit card required: https://detail.dev/
We're also scanning on OSS repos, if you have any requests. The system is pretty high signal-to-noise, but we don't want to risk annoying maintainers by automatically opening issues, so if you request a scan for an OSS repo the results will go to you personally. https://detail.dev/oss
munchler|2 months ago
It would make a lot more sense to me if you provided a lighter "intro" version, even if that means it can only run on public repos.
drob|2 months ago
bflesch|2 months ago
Also unfortunately the animation on the landing page makes the whole website quite slow.
drob|2 months ago
I'm the founder. Previously I was at Heap for nine years. There's a company LinkedIn with the rest of the team: https://www.linkedin.com/company/detail-dev/
We're located in SF. The About Us page lists some of our angel investors at the bottom.
Regarding security in particular, there's a lot more info in our Trust Center: https://trust.detail.dev/
If anything else seems conspicuously missing, please flag. In all likelihood it's omitted without intent.
howinator|2 months ago
Waxing philosophical a bit, I think tools like these are going to be super helpful as our collective understanding of the codebases we own decreases over time due to the proliferation of AI generated code. I'm not making a value judgement here, just pointing out that as we understand codebases less, tools that help us track down the root causes of bugs will be more important.
solatic|2 months ago
Please consider a pricing model that's closer to bug bounties. There's clearly a working pricing model where companies are willing to pay bounties for discovered vulnerabilities. Your tool finds vulnerabilities (among other classes of bugs). Why not a pricing model where customers agree up-front to pay per bug your model finds? There are definitely some tricky parts to that model - you need an automated way of grading/scoring the bugs you find, since critical-severity bugs will be worth more (and be more interesting to customers) compared to low-severity bugs, and some customers will surely appeal some of the automatic scores - but could you make it work? Customers could then have more control over scaling up usage of Detail (adding slowly to more repositories), including capping how many bugs of each severity they would like reports for (to limit their spend), allowing customers to slowly add more repositories and run scans more frequently to find more bugs as they get more proven value from the tool.
drob|2 months ago
valmarelox|2 months ago
eikenberry|2 months ago
sbruchmann|2 months ago
https://app.detail.dev/onboarding
drob|2 months ago
hiesenbug|2 months ago
drob|2 months ago
We should be able to handle cross-compilation. Want to try it? Ping me in any direct channel (dan@detail.dev / @danlovesproofs) and we can keep an eye on your repo.
chrsw|2 months ago
drob|2 months ago
We should be able to find something interesting in most codebases, as long as there's some plausible way to build and test the code and the codebase is big enough. (Below ~250 files the results get iffy.) We've just tested it a lot more thoroughly on app backends, because that's what we know best.
StrangeSound|2 months ago
drob|2 months ago
cloudhead|2 months ago
ZeroConcerns|2 months ago
drob|2 months ago
We support java, c/c++, kotlin, ruby, and swift as well. Did you have something specific in mind?
suprnurd|2 months ago
dbworku|2 months ago