top | item 45312184


swaptr | 5 months ago

AI-generated code can be useful in the early stages of a project, but it raises concerns in mature ones. Recently, a 280+ kLOC Postgres parser was merged into Multigres (https://github.com/multigres/multigres/pull/109) with no public code review. In open source, this is worrying. Many people rely on these projects for learning and reference. Without proper review, AI-generated code weakens their value as teaching tools and, more importantly, the trust people need to pull them in as dependencies. Code review isn’t just about catching bugs; it’s how contributors learn, understand design choices, and build shared knowledge. The issue isn’t the speed of building software (although corporations may seem to disagree), but how knowledge is passed on.

Edit: Reference to the time it took to open the PR: https://www.linkedin.com/posts/sougou_the-largest-multigres-...


sougou | 5 months ago

I oversaw this work, and I'm open to feedback on how things can be improved. There are some factors that make this particular situation different:

This was an LLM-assisted translation of the C parser from Postgres, not something written from the ground up.

For work of this magnitude, you cannot review line by line. The only thing we could do was to establish a process to ensure correctness.

We did control the process carefully. It was a daily toil. This is why it took two months.

We've ported most of the tests from Postgres. Enough to be confident that it works correctly.

Also, we are in the early stages for Multigres. We intend to do more bulk copies and bulk translations like this from other projects, especially Vitess. We'll incorporate any possible improvements here.

The author is working on a blog post explaining the entire process and its pitfalls. Please be on the lookout.

I was personally amazed at how much we could achieve using an LLM. Of course, this wouldn't have been possible without a certain level of skill. This person exceeds all expectations listed here: https://github.com/multigres/multigres/discussions/78.

wg002 | 5 months ago

"We intend to do more bulk copies and bulk translations like this from other projects"

Supabase’s playbook is to replicate existing products and open source projects, release them under open source, and monetize the adoption. They’ve repeated this approach across multiple offerings. With AI, the replication process becomes even faster, though it risks producing low-quality imitations that alienate the broader community, and people will resent having their work stolen.