Pijul – The Mathematically Sound Version Control System Written in Rust

[+] oxinabox|5 years ago|reply

I really want to use Pijul, but programming is a team sport, especially in open source.

The Pijul equivalent of GitHub is the Nest, and it's just not there yet. Firstly, there are basically no potential collaborators for the areas I work on. Secondly, it is much less mature as a platform. In 5 minutes I couldn't find any way to "browse repositories" in any sense, to discover what is out there. Discussions are not fleshed out as an issue tracker (no tags, for a start), though that could be dealt with by using a 3rd-party issue tracker.

Pijul's whole setup seems like it would solve a bunch of problems I have with Git.

1. A commit maintains its own identity when cherry-picked onto another branch, and the two remain intrinsically linked. In Git, the cherry-picked commit is an unrelated (but identical) set of changes.

2. The whole merge/rebase dichotomy. In Git, most people try to avoid merging the main branch into the feature branch, because it leads to a messy history where it is hard to understand where changes come from, and prefer to rebase instead. But for a merge you only need to resolve conflicts once. For a rebase you potentially need to resolve them again and again, even if those conflicts would never matter to the final version (e.g. if the conflicting file has been deleted before HEAD). Pijul doesn't have separate merge and rebase, because changes in Pijul commute. Which is a mind twist of an idea. But they say they solved it, so nice.

[+] pmeunier|5 years ago|reply

> Firstly, there are basically no potential collaborators for the areas I work on

That's correct. On the other hand, this is a network effect, and the Nest, in its current incarnation, is only 3 weeks old. It's getting better every day, though.

The two main features I'm working on at the moment are:

- CI/CD. This is almost ready to go, it is "sort of" working. One issue that I fixed recently is that Pijul versions are tricky to compute: change A, followed by B, is the same state (guaranteed by design) as B, followed by A. Our version identifiers handle that now (try `pijul log --state`).

- Social features, like discoverability, recommendation of collaborations. We're really just getting started at this, and figuring out what works best.
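To illustrate the order-independence mentioned in the CI/CD item: a minimal sketch of a state identifier that is the same for "A then B" and "B then A". This is not Pijul's actual scheme; the names `change_hash` and `state_id` are made up here, and combining hashes by modular addition is only an assumption chosen because addition commutes.

```python
import hashlib

MODULUS = 2 ** 256


def change_hash(change: bytes) -> int:
    """Hash one change to a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(change).digest(), "big")


def state_id(changes) -> str:
    """Combine change hashes by addition modulo 2**256.

    Addition is commutative and associative, so the identifier depends
    only on which changes were applied, not on their order.
    """
    acc = 0
    for change in changes:
        acc = (acc + change_hash(change)) % MODULUS
    return f"{acc:064x}"


a, b = b"change A", b"change B"
assert state_id([a, b]) == state_id([b, a])  # same state either way
assert state_id([a]) != state_id([a, b])     # a different set of changes
```

A plain hash of the concatenated log would distinguish the two orders, which is exactly what a system with commuting changes has to avoid.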

I should add that until two weeks ago, the Nest would stop accepting connections at random times, or crash. It's much more stable now, and we can finally think of adding features.

> Secondly, it is much less mature as a platform

There are tags for issues. You can configure them in the "Admin" page of your repositories. They're not used very much for Pijul itself, because most issues so far have been closed within minutes or hours after their creation.

> Which is a mind twist of an idea. But they say they solved it, so nice.

As one of the authors, I can tell you it wasn't easy to get everything to work together, especially for conflicts. But the theory is clear and sound now (and I have good confidence that the implementation is, too): any two patches that could have been created independently can be pushed to a remote repository independently from each other.

[+] auggierose|5 years ago|reply

Programming can be a team sport, but doesn't have to be. It doesn't even have to be a sport.

[+] wocram|5 years ago|reply

You can use git rerere to avoid the repeated rebase conflicts.

[+] gnufx|5 years ago|reply

Pijul would presumably fit well with the development style encouraged by Sourcehut, if you need such a thing. I've never had time to consider integrating Darcs with it.

[+] arka2147483647|5 years ago|reply

It would be nice if the authors could provide practical examples of their features, instead of a theoretical talk about mathematical soundness.

I.e., how does this make my life, as a programmer, easier?

[+] gjulianm|5 years ago|reply

As far as I understand, the main change with respect to Git is that commit hashes don't change depending on previous history, so if you rebase, the same change IDs are retained.

Personally I've never had an issue with conflicts after cherry-picking; if Git sees the same change in two branches, it manages it correctly most of the time. Furthermore, I like that the commit ID changes when previous history is changed: a commit ID refers to a certain state of the repository. I don't like that two repositories in different states could have the same commit ID.
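The two kinds of identifier being compared here can be sketched side by side. This is a toy model under stated assumptions: `snapshot_id` and `change_id` are hypothetical names, and a real system would hash more than the diff text (metadata, for example).

```python
import hashlib


def digest(*parts: str) -> str:
    """Short hex digest over several strings, separated by NUL bytes."""
    hasher = hashlib.sha256()
    for part in parts:
        hasher.update(part.encode())
        hasher.update(b"\0")
    return hasher.hexdigest()[:12]


def snapshot_id(parent_id: str, diff: str) -> str:
    """Git-like: the ID covers the parent, so it names a whole history."""
    return digest(parent_id, diff)


def change_id(diff: str) -> str:
    """Patch-like: the ID covers only the change itself."""
    return digest(diff)


fix = "+ check for null before use"
main_tip = digest("history of main")
release_tip = digest("history of release")

# The same fix placed on two different histories gets two different IDs...
assert snapshot_id(main_tip, fix) != snapshot_id(release_tip, fix)
# ...whereas the patch-like ID never looks at the history at all.
assert change_id(fix) == digest(fix)
```

Both behaviours are defensible: the first ID answers "which state is this?", the second answers "which change is this?".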

[+] rstarast|5 years ago|reply

The one potential effect of this that I'd love over git is being able to check in local changes that I never intend to upstream. E.g. I could have a local change adding some editor config files, or modifying some compiler flags in the Makefile to suit my local config.

With git my typical solution for this is to basically keep rebasing them as uncommitted changes with `git stash`, until I inevitably lose them. Or keep a separate branch, and keep explicitly rebasing (and stay on my toes not to accidentally push the commits).

[+] rcthompson|5 years ago|reply

Not having to worry about the order of patches (except where they explicitly depend on each other) is a really nice property to have. One consequence of this property, alluded to in the article, is that if I understand correctly, rebasing and merging effectively become the same operation. In Git, the difference between rebasing and merging is the resulting dependency graph of the commits. However, in Pijul, there is no dependency graph, at least not based on the order in which the commits are applied, so rebase and merge are equivalent.
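A toy model of why independent patches can commute, assuming a file is stored as a set of lines that each name the line they come after, rather than as numbered lines. This is a simplification for illustration, not Pijul's real data structure; `ROOT`, `apply` and `render` are made-up names.

```python
ROOT = "root"


def apply(state: frozenset, change: frozenset) -> frozenset:
    """Applying a change adds its lines to the state; set union commutes."""
    return state | change


def render(state: frozenset) -> list:
    """Walk the lines depth-first from ROOT.

    Siblings are ordered by line ID, an arbitrary but deterministic tie-break.
    """
    children = {}
    for line_id, parent, text in state:
        children.setdefault(parent, []).append((line_id, text))
    out = []

    def visit(node):
        for line_id, text in sorted(children.get(node, [])):
            out.append(text)
            visit(line_id)

    visit(ROOT)
    return out


# Each line is (line_id, id_of_the_line_it_follows, text).
base = frozenset({("m1", ROOT, "fn main() {"), ("m2", "m1", "}")})
alice = frozenset({("a1", "m1", '    println!("hello");')})  # inserted after m1
bob = frozenset({("b1", "m2", "// end of file")})            # appended after m2

alice_first = apply(apply(base, alice), bob)
bob_first = apply(apply(base, bob), alice)
assert alice_first == bob_first
assert render(alice_first) == [
    "fn main() {",
    '    println!("hello");',
    "}",
    "// end of file",
]
```

Because each patch refers to its context by identity instead of by position, "merge" and "rebase" collapse into the same thing here: taking the union of the two sets of changes.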

For example, have you ever been doing code-review on a Github pull request and made an inline comment on a line of code, only to have that comment later invalidated by a rebase? That happens because the commit hashes were changed by the rebase, so the commit hash referenced by that comment is no longer contained in the pull request. There's probably a new commit with a new commit hash representing the same change, but there doesn't have to be. So Github can no longer be certain of where to place that comment, and it can't tell you whether that comment is still relevant to the pull request in its current state. (IIRC, Github continues to show such comments but marks them as associated with commits that are no longer in the branch, which makes it hard to know if they're still applicable.)

In Pijul, I'm pretty sure this problem simply doesn't exist. A hypothetical "Pijulhub" that implemented the same "comment on changes" feature could certainly have comments on old changes that subsequently get removed from the pull request, but if that happens, then "Pijulhub" can confidently declare that those comments are no longer relevant to the current state. On the other hand, if some version of the same change is still present in the current state of the pull request, then "Pijulhub" can confidently continue to include that comment, still associated with that same change, because even after modifying the pull request, that change still has the same hash, because it still represents the exact same change.

[+] zeotroph|5 years ago|reply

I'd especially be interested in examples regarding multiple channels / branches (one development and several production branches), and then backporting a fix into older versions.

In the conflict-free use case I expect to just notify Pijul that this fix now belongs not only to the dev-master, but also to rel.22, rel.21 etc. - in contrast to switching to every release branch and cherry-picking it with git.

What is the workflow if this applies cleanly to rel.22, but not to rel.21 anymore? What if the same conflict happens in rel.20, rel.19 and rel.18?
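The conflict-free case can be pictured with a small sketch, assuming a channel can be modelled as a set of change identifiers and that a change only applies where its dependencies are already present. `Change` and `Channel` are hypothetical classes for illustration, not Pijul's API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    id: str
    deps: frozenset = frozenset()


@dataclass
class Channel:
    name: str
    applied: set = field(default_factory=set)

    def apply(self, change: Change) -> bool:
        """Apply the change if its dependencies are present; report success."""
        if not change.deps <= self.applied:
            return False
        self.applied.add(change.id)
        return True


init = Change("init")
refactor = Change("refactor", frozenset({"init"}))
fix = Change("fix", frozenset({"init"}))
followup = Change("followup", frozenset({"refactor"}))

channels = {name: Channel(name) for name in ("dev", "rel.22", "rel.21")}
for channel in channels.values():
    channel.apply(init)
channels["dev"].apply(refactor)
channels["dev"].apply(fix)

# Backport: the very same change object goes to every release channel.
results = {name: channels[name].apply(fix) for name in ("rel.22", "rel.21")}
assert results == {"rel.22": True, "rel.21": True}

# A change that depends on dev-only work is refused instead of half-applied.
assert channels["rel.21"].apply(followup) is False
```

In this picture the fix keeps one identity everywhere, so "is this fix in rel.21?" becomes a set-membership question. Conflicts are outside the scope of the sketch.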

If I (in git speak) "--amend" the original fix, could I automatically include this in the release branches?

What if the conflict resolution changes every single line, but I still want to record that "Yes, this fix is also in rel.17, even though it does not look like it"?

[+] junon|5 years ago|reply

Just reading the "abstract", this sounds way more complicated than anyone needs a VCS to be. I also question the "mathematically sound" bit - the "commutative changes/diff" thing sounds really far fetched.

Also, the fact it is written in rust means nothing to me. The whole "but rust doesn't segfault!" thing is, and has always been, a ridiculous reason why rust is somehow inherently better suited for such projects. The article keeps leaning on that fact as though it makes the project superior somehow, but it just feels like grandstanding.

The architecture section is just buzzwords. It doesn't explain the technical aspects, it doesn't allow me to understand the system, and then I'm immediately dropped into "how to use the CLI". But wait, you haven't sold me yet.

This was painful to read.

[+] withtypes|5 years ago|reply

I don't read it like that at all. Rust is mentioned only once. Nevertheless, I do care whether a tool I am considering using is written in JavaScript, Rust or some esoteric language. It means a lot, not only for "safety", which is not that important to you, but also for maintainability, community, and the future of the project. Of course, what probably triggers you is the mention of Rust in the title of the post. Still, it's hard to imagine being so annoyed by it: if the authors think it's important or that it will grab attention, why not use it...

[+] pmeunier|5 years ago|reply

We might not have read the same article.

> Also, the fact it is written in rust means nothing to me.

I'm one of the authors of Pijul, and the author of that post asked me a few questions when preparing that post, and that is addressed there: https://initialcommit.com/blog/pijul-creator (see "Why did you choose to write Pijul in Rust?").

> It doesn't explain the technical aspects, it doesn't allow me to understand the system

You can have plenty of that there: https://pijul.org/manual/theory.html.

> This was painful to read.

I strongly disagree. Explaining the goals of this project isn't always easy, because both large complex projects and complete beginners can benefit from rigorous mathematical modeling. I guess this blog post leans more towards the "beginners" side, whereas the "Theory" page linked above would be more appealing to power users. Both are useful and important, they just appeal to different people.

[+] nilkn|5 years ago|reply

The Rust part I think is meaningful. It means we can (with reasonable probability) expect the software to be fast and lightweight in terms of memory usage, and it means it’s probably a lot easier for new contributors to jump in than an equivalent C or C++ codebase. That last bit is about a lot more than memory safety. Rust is basically designed from the ground up to make it relatively easy to onboard a new engineer safely onto a project in comparison to legacy systems languages.

[+] mhh__|5 years ago|reply

Honestly, as long as it's not written in JavaScript (I like not having npm and node on my machine) or C (not abstract enough) I'm happy to use it. For a small tool, at least.

[+] whateveracct|5 years ago|reply

> Just reading the "abstract", this sounds way more complicated than anyone needs a VCS to be. I also question the "mathematically sound" bit - the "commutative changes/diff" thing sounds really far fetched.

Care to say anything more quantitative and less feel-y? Because you're refuting pretty much the entire premise of pijul (which clearly exists, works, and has been somewhat rigorously designed) with nothing but adjectives. So having a concrete counterpoint would be helpful because pijul is just sitting there with very concrete counterpoints to your entire comment :)

[+] jedisct1|5 years ago|reply

Yes, but it is written in Rust.

[+] kadoban|5 years ago|reply

Is there any hope in Pijul of _any_ kind of interoperability with git at all?

Back when there used to be some projects still stuck on svn that I wanted to work on, I was able to use git locally and just kind of "publish" via svn when I was done.

Would anything like that be possible? That'd be the killer thing for me to be able to give Pijul a real try.

[+] pmeunier|5 years ago|reply

Yes! The `pijul git` command allows you to import Git repositories by replaying their history. It also works incrementally, meaning that if you already imported a repository, you'll be able to continue with new commits.

This uses a somewhat naive way of doing things, and can take a very long time on large repositories. One other way of doing it would be to add an "initial change" explicitly saying "I'm coming from Git commit number #SHA1HASH".
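For readers wondering what "works incrementally" means in practice, here is a rough sketch of the idea only, not the actual `pijul git` implementation: replay commits oldest-first and skip the ones already seen. The function `import_commits`, the `record` callback and the shape of the log are all assumptions for illustration.

```python
def import_commits(git_log, imported, record):
    """Replay commits oldest-first, skipping those already imported.

    git_log  -- iterable of (sha, diff) pairs, oldest first (hypothetical shape)
    imported -- set of SHAs handled by earlier runs; updated in place
    record   -- callback that turns one commit into a change
    Returns the number of commits imported by this run.
    """
    new = 0
    for sha, diff in git_log:
        if sha in imported:
            continue
        record(sha, diff)
        imported.add(sha)
        new += 1
    return new


recorded = []
seen = set()
log = [("c1", "add README"), ("c2", "fix typo")]

assert import_commits(log, seen, lambda sha, diff: recorded.append(sha)) == 2

# Later, upstream gained one commit; a second run only replays that one.
log.append(("c3", "add tests"))
assert import_commits(log, seen, lambda sha, diff: recorded.append(sha)) == 1
assert recorded == ["c1", "c2", "c3"]
```

The cost mentioned above comes from the replay itself: every commit has to be turned into a change once, however the bookkeeping is done.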

[+] gnufx|5 years ago|reply

For what it's worth, you can do that with Darcs import/export. I've kept parallel git and hg repos synced that way with a cron job; it could be done with a hook, but that slowed down pushes too much in my case. (Doing that basically assumes one branch, i.e. linear development in Darcs.)

[+] onetimertwo|5 years ago|reply

This.

[+] nieve|5 years ago|reply

With this latest release does Pijul now have an extensive test harness to prevent the corruption & data loss in repositories that people have been experiencing for years? I know pre-1.0 versions had low visibility warnings that you shouldn't actually use Pijul for any code you cared about, but I'd assume that is no longer true for a 1.0 version. I've been very interested as an on-again off-again Darcs user, but when people were reporting that it destroyed their data it was a kind of scary step to take.

[+] pmeunier|5 years ago|reply

> low visibility warnings

The versions were called 0.x, each blog post said "this is experimental", and there was even a blinking line on the front page of nest.pijul.com. Maybe we should have printed a notice on the command line tool itself ;-)

More seriously, one major issue with the 0.x versions was that bad performance meant we couldn't really test it massively. But now we can, and actually one of my first tests was to try and import the history of Nixpkgs, and run massive checks for data loss after every operation. It works now.

[+] gnufx|5 years ago|reply

I asked before, but didn't get an answer. Darcs is criticized for lack of soundness and potential exponential merge complexity -- which I'm not sure I've actually run into -- but that's Darcs2. I haven't actually followed along, but I thought Darcs3, now in development, is an answer. This may not be the place to ask, but can anyone compare the new Pijul and Darcs3?

[+] notmars|5 years ago|reply

Shall we talk about the name? Or does nobody care about the impact of subconscious bias in adoption of dev tools and I should shut up? :-)

[+] kroltan|5 years ago|reply

What about it? Searching for "pijul -vcs" shows images of birds, which is reasonable?

[+] dxdm|5 years ago|reply

Even if we disregard the riddle of its meaning, I have no idea how to pronounce the name.

Pee? Pye? Then, a "jay" sound, or a Spanish "junta", or a French "Jules"? And then how is the "u" pronounced?

I'm confused. Pidgel?

[+] gnusty_gnurc|5 years ago|reply

Yea I was searching to see if anyone else would mention this, especially seeing the uproar about GIMP, etc. I don’t care about the naming dispute stuff, but this is comically close to pee-hole.

[+] songqin|5 years ago|reply

what's wrong with the name?

[+] Jyaif|5 years ago|reply

"One of Pijul's goals is to minimize the number of commands" Shut up and take my money.

[+] jgalt212|5 years ago|reply

This write-up is very kind to Git (vis-à-vis one of its pros):

> Intuitive method and interface for version tracking

[+] pmeunier|5 years ago|reply

https://www.reddit.com/r/programming/comments/k39td1/pijul_t...

[+] Technically|5 years ago|reply

Does this allow for semantic, non-text diffing? This would allow for code formats without syntax errors (i.e. storing the valid AST in a structured binary form, separate from the editor representation).

Diffs would then be truly semantic, representing semantic refactorings (e.g. "extract method to function", "rename symbol") at the patch level. You could then easily query "when did this variable show up" even post-refactor, where the variable moves modules. Sure, this has its limits, but it'd push the tooling to levels virtually impossible to automate with text-based code now.

A man can dream!

[+] pmeunier|5 years ago|reply

It does allow that, to some extent: by changing the diff algorithm. One thing you can do in a diff algorithm for Pijul (not written yet, but totally possible) is to treat runs of whitespace as their own binary blocks, so that reformatting commutes with other changes.
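A small sketch of the whitespace idea, under the assumption that "blocks" are simply alternating runs of whitespace and non-whitespace. It shows only the tokenizing step that such a diff algorithm might start from; `tokenize` and `content_blocks` are made-up names and no such algorithm is claimed to exist in Pijul.

```python
import re


def tokenize(source: str) -> list:
    """Split text into alternating whitespace and non-whitespace blocks."""
    return re.findall(r"\s+|\S+", source)


def content_blocks(source: str) -> list:
    """Keep only the non-whitespace blocks."""
    return [token for token in tokenize(source) if not token.isspace()]


original = "if (x) {\n  y();\n}"
reformatted = "if (x)\n{\n    y();\n}"

# Tokenizing loses nothing: the blocks join back into the input.
assert "".join(tokenize(original)) == original

# Reformatting touches only whitespace blocks, so the content blocks match.
assert content_blocks(original) == content_blocks(reformatted)
```

If a diff is computed over these blocks, a reformatting patch edits whitespace blocks only, while a logic patch edits content blocks only, which is what would let the two commute.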

[+] gnufx|5 years ago|reply

Darcs has "replace" for that, the only other patch type that's been implemented for it as far as I know. I think Toolpack provided that for Fortran in the 1980s, but I don't remember for sure whether its version control was AST-based latterly (and, of course, it wasn't networked, let alone distributed).

64 comments