kwhkim | 6 days ago

I’d like to share my work with you: https://news.ycombinator.com/item?id=47137452

It shows that I posted just a little later than you.

I agree that the chances of something going wrong with timestamps are low, but I still think it’s worth considering as a potential security risk or injection vector — although I’m not sure how realistic that threat actually is.

Regarding UUIDs, you can get the best of both worlds: sequential IDs are convenient, and adding a small number of random characters can help avoid name collisions. Since the random component only needs to ensure uniqueness across local repositories, it doesn’t require many characters.
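A minimal sketch of that ID scheme in Python. The helper names (`make_issue_id`, `next_sequence_number`), the alphabet, and the three-character suffix are all assumptions for illustration, not something taken from either project:

```python
import secrets
import string

# Hypothetical alphabet and suffix length; neither comes from an existing tool.
ALPHABET = string.ascii_lowercase + string.digits


def make_issue_id(sequence_number: int, suffix_length: int = 3) -> str:
    """Return an ID such as '42-k3x': a sequential number plus a random suffix."""
    if sequence_number < 1:
        raise ValueError("sequence_number must be positive")
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(suffix_length))
    return f"{sequence_number}-{suffix}"


def next_sequence_number(existing_ids: list[str]) -> int:
    """Next local number: one more than the highest number seen so far."""
    numbers = [int(issue_id.split("-", 1)[0]) for issue_id in existing_ids]
    return max(numbers, default=0) + 1
```

With 36 characters and 3 positions there are 36^3 = 46,656 possible suffixes, so two clones that both allocate number 42 while offline would collide with a probability of roughly 1 in 46,656.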

I can see that you’ve put a great deal of effort into this, especially with the various bridging components. I built a simple bridge myself and ran some tests using the pandas project (https://github.com/pandas-dev/pandas), which has more than 30,000 issues. Even storing only metadata (such as title and type, excluding the body) as plain text takes more than 100 MB, which seems quite large. In comparison, storing the same data in SQL takes only about 10 MB, and packed Git objects are comparable.

So while storing issues as Git commits certainly has some benefits, I don't see much advantage beyond that. It also seems that most users would not be able to make practical use of this approach easily — for example, for batch processing — unless they are already quite comfortable working directly with Git commits.

I'm curious about what considerations led you to decide to store issues as empty Git commits. I would appreciate it if you could share your reasoning with me.

remenoscodes|6 days ago

Cool, just looked at git-pad. Same day, different data models for the same problem. Independent convergence is a good signal.

On why empty commits: this started with Linus's 2007 rant about wanting "a git for bugs." I took it literally: how far can Git's existing primitives go without introducing anything new? No files, no JSON, no database. Just commits, refs, and trailers.

The mapping: issues are append-only event logs (create -> comment -> edit -> close). Git is an append-only content-addressable store. Each commit is an event. The ref tip is current state. Trailers carry structured metadata in the same format as Signed-off-by:. Merge commits handle divergence. The entire Git toolchain works out of the box: log, rev-list, interpret-trailers, GPG signing, refspecs.
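To make the "each commit is an event" idea concrete, here is a rough Python sketch of what one event's commit message could look like and how its trailers could be read back. Only `Format-Version` appears in this thread; the `State` trailer and both function names are hypothetical, and the parser is a simplification of what `git interpret-trailers` does, not a replacement for it:

```python
def build_event_message(subject: str, body: str, trailers: dict[str, str]) -> str:
    """Format a commit message whose last paragraph is a block of 'Key: value' trailers."""
    parts = [subject]
    if body:
        parts.append(body)
    parts.append("\n".join(f"{key}: {value}" for key, value in trailers.items()))
    return "\n\n".join(parts) + "\n"


def parse_trailers(message: str) -> dict[str, str]:
    """Read 'Key: value' lines from the last paragraph of a message.

    Simplified: assumes the message has a subject plus a trailer block,
    and does not handle folded (multi-line) trailer values.
    """
    last_paragraph = message.strip().split("\n\n")[-1]
    trailers = {}
    for line in last_paragraph.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            trailers[key] = value
    return trailers


# Example event: closing an issue. "State" is an illustrative trailer name.
message = build_event_message(
    "Close issue",
    "Fixed in latest release.",
    {"Format-Version": "1", "State": "closed"},
)
```

A message like this could then be recorded without touching the working tree, for instance with `git commit --allow-empty -m "$message"`; how the real tool writes its commits and refs may differ.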

The implementation is a proof of concept though. What I really care about is ISSUE-FORMAT.md as a standalone format spec. Most of the internet runs on community-agreed specifications where the spec is the contract and implementations are details. If we have a canonical issue format, Forgejo or GitKraken or whoever can build a proper UI around it. Different implementations emerge — shell, C, Rust — until we find the optimal one. The spec is the deliverable, not the CLI.

Storage: packed Git objects are comparable to SQL for metadata. The shell implementation won't scale to 30K issues; a C implementation with libgit2 would. That's a known limitation of v1.

Timestamps: fair concern. The format is versioned (Format-Version: 1), so logical clocks can be added in a future version without breaking existing data. For v1, last-writer-wins (LWW) was the pragmatic choice: it keeps the spec implementable by any tool that can read Git commits.
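For readers unfamiliar with LWW, a small Python sketch of the idea, under stated assumptions: `FieldUpdate` and `resolve_lww` are hypothetical names, and breaking ties on the commit ID is my own choice for determinism, not something the spec is quoted as saying here:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldUpdate:
    field: str       # e.g. "title" or "state"
    value: str
    timestamp: int   # seconds since the epoch, like a commit's author date
    commit_id: str   # used only as a deterministic tie-breaker (an assumption)


def resolve_lww(updates: list[FieldUpdate]) -> dict[str, str]:
    """For each field, keep the value from the update with the latest timestamp."""
    winners: dict[str, FieldUpdate] = {}
    for update in updates:
        current = winners.get(update.field)
        if current is None or (update.timestamp, update.commit_id) > (
            current.timestamp,
            current.commit_id,
        ):
            winners[update.field] = update
    return {field: update.value for field, update in winners.items()}
```

The result does not depend on the order in which updates are visited, which is what makes it easy to implement anywhere. The flip side is the concern raised above: a writer whose clock runs ahead always wins.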

The bridges solved a specific problem I kept hitting: migrating projects between GitLab and GitHub or Gitea (and now Azure DevOps) while keeping issues intact. That alone justified the effort.

Curious about git-pad's file-based approach: what happens when two contributors edit the same issue file offline and then push? Standard Git merge conflict, or do you handle it at a higher level? (Haven't had time to look at the implementation code yet)

kwhkim|6 days ago

As for merge conflicts, you resolve them the same way you would with any other file in Git. I think a custom merge driver needs to be developed eventually (for example, one that automatically picks `type: bug or feature` instead of leaving raw conflict markers like the following), but that's not implemented yet.

<<<<<<<
type: bug
=======
type: feature
>>>>>>>
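A rough Python sketch of the field-level logic such a merge driver might use. None of this exists yet: `parse_fields`, `merge_fields`, and the `TYPE_PRIORITY` rule (prefer `bug` over `feature` when both sides changed the type) are assumptions made up for illustration:

```python
def parse_fields(text: str) -> dict[str, str]:
    """Parse 'key: value' lines into a dict, ignoring lines without a colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Hypothetical policy: when both sides changed `type`, the earlier entry wins.
TYPE_PRIORITY = ["bug", "feature"]


def merge_fields(base: str, ours: str, theirs: str) -> dict[str, str]:
    """Three-way merge of metadata fields, one key at a time."""
    b, o, t = parse_fields(base), parse_fields(ours), parse_fields(theirs)
    merged = {}
    for key in sorted(set(o) | set(t)):
        ours_value, theirs_value, base_value = o.get(key), t.get(key), b.get(key)
        if ours_value == theirs_value or theirs_value == base_value:
            chosen = ours_value    # same on both sides, or only ours changed
        elif ours_value == base_value:
            chosen = theirs_value  # only theirs changed
        elif key == "type" and {ours_value, theirs_value} <= set(TYPE_PRIORITY):
            chosen = min(ours_value, theirs_value, key=TYPE_PRIORITY.index)
        else:
            raise ValueError(f"unresolved conflict on {key!r}")
        if chosen is not None:     # None means the key was deleted
            merged[key] = chosen
    return merged
```

Git would call a driver like this through a `merge.<name>.driver` setting, passing the base, ours, and theirs files via the `%O`, `%A`, and `%B` placeholders; the driver writes its result over the `%A` file and exits non-zero if it cannot resolve the conflict.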