gorset | 1 year ago

Patch series come from the Linux kernel workflow, which git was developed to support: https://kernelnewbies.org/PatchSeries

In this workflow you review every commit and not just the branch diff. Each commit is crafted carefully, and a well-crafted series of commits can make even very large changes a breeze to review.

It takes a certain skill to do this well. As the page above says:

> Crafting patches is one of the core activities in contributing code to the kernel, it takes practice and thought.

This is in contrast to using git more as a distributed filesystem where you don't care particularly much about the history, and you typically squash commits before merging etc. It's simpler and easier to work this way, but you lose some of the nice attributes of the linux kernel workflow.

keybored|1 year ago

That’s a nice summary.

What I don’t like about the Git documentation as I’ve read it is that they go between “patch” and “commit” in some places without stopping and explaining what the difference is. It makes sense to them. It’s obvious. But it isn’t necessarily obvious to most people.

A patch is a patch proper plus a commit message encoded in a format that git am understands. That’s fine. And the core developers understand that you cannot transmit a commit snapshot via email (or you shouldn’t). But I prefer to mostly stick to “commit” in the abstract sense, whether that to-be-commit is from a pull or from an email (or: it’s in the form of an email and it could be applied as a commit).
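To make the "diff plus commit message" point concrete, here is a minimal Python sketch that splits a mail of the general shape `git format-patch` produces into its message and diff parts. The sample mail, the author address, and the `split_patch_mail` helper are all made up for illustration, and cutting at the first `---` line is a simplification of what git am really does.

```python
from email import message_from_string

# Hypothetical mail in the general shape produced by `git format-patch`.
SAMPLE_MAIL = """\
From: A Developer <dev@example.com>
Subject: [PATCH 1/2] Fix greeting

Say hello properly.
---
 hello.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hello.txt b/hello.txt
--- a/hello.txt
+++ b/hello.txt
@@ -1 +1 @@
-helo
+hello
"""


def split_patch_mail(raw):
    """Return (subject, commit message body, diff text) for one patch mail."""
    msg = message_from_string(raw)
    payload = msg.get_payload()
    # Simplification: text before the first line that is exactly "---"
    # is treated as the commit message body.
    body, _, rest = payload.partition("\n---\n")
    # The diffstat comes first; the diff proper starts at "diff --git".
    diff_start = rest.find("diff --git")
    diff = rest[diff_start:] if diff_start != -1 else ""
    return msg["Subject"], body.strip(), diff


subject, body, diff = split_patch_mail(SAMPLE_MAIL)
```

The subject and body are what would become the commit message, and the diff is the "patch proper". None of it says which commit it sits on top of; that only gets decided when it is applied.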

git rebase talks about “patch series” I think. Without explaining it. Why not “commit series”?

Sometimes it seems like you are describing your changes by the way they happened to be transmitted. It’s like talking about “attachments” instead of commits because you happened to send them via email as an attachment (instead of inline).

Then you have “stacked diffs” or “stacked commits”, which are just a series of commits. Or a branch of commits (implicitly grounded by a base commit). For a while I was wondering what stacked diffs/stacked PRs/stacked patches were and whether I was missing out. It turned out to be, as you explain, essentially the Linux kernel style of being able to review a commit in isolation, but in a sort of context that pull request inhabitants can understand.

I prefer to mostly talk about these things as “commits”.

(Several times while writing the paragraphs above I wondered if I would be able to string them together in a coherent way.)

travisb|1 year ago

I think part of the confusion is because 'patch' and 'commit' (really snapshot) are duals of each other, but in practice have important technical differences. When speaking abstractly about 'changes' it often doesn't much matter which term is used, but most interactions are with 'commits' so that tends to be the default term to use.

However, sometimes the details matter. For example, a 'patch' (diff + description) tends to be small enough to transfer conveniently, and human friendly. Patches do not describe their relationship to other patches, so it makes sense to talk about a series of patches which must be applied in sequence to accomplish some larger goal. 'patch' doesn't imply any particular storage format, so sometimes saying 'attachment' or 'patch file' or 'email body' is an important distinction. Git documentation assumes you know what you want. It might help to think of patches as "outside time", though of course any particular version of a patch will only apply to a subset of all snapshots.

A 'commit' (snapshot + description), on the other hand, tends to be large in that it describes the entire state of the codebase and contains (implicit or explicit) metadata describing its predecessor(s) -- possibly a large graph of them. People talk about a 'commit series' all the time, but use the terms 'history' and 'branch' instead.
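A toy Python sketch of that duality, assuming a one-file "repository" where a snapshot is just a list of lines. The `Commit` class and the `make_patch` / `apply_patch` helpers are hypothetical names for illustration; git's real object model and diff format are more involved.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

Snapshot = List[str]
# One hunk: (start index in old, lines expected there, replacement lines).
Hunk = Tuple[int, List[str], List[str]]


@dataclass
class Commit:
    snapshot: Snapshot                   # the entire state
    message: str
    parent: Optional["Commit"] = None    # explicit link to the predecessor


def make_patch(old: Snapshot, new: Snapshot) -> List[Hunk]:
    """Derive a patch (list of hunks) from two snapshots."""
    hunks = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=old, b=new).get_opcodes():
        if tag != "equal":
            hunks.append((i1, old[i1:i2], new[j1:j2]))
    return hunks


def apply_patch(patch: List[Hunk], base: Snapshot) -> Snapshot:
    """Apply a patch to a snapshot; fail if the expected lines are missing."""
    result = list(base)
    # Work from the end so earlier indices stay valid.
    for start, old_lines, new_lines in reversed(patch):
        if result[start:start + len(old_lines)] != old_lines:
            raise ValueError("patch does not apply to this snapshot")
        result[start:start + len(old_lines)] = new_lines
    return result


root = Commit(["a", "b", "c"], "initial")
child = Commit(["a", "B", "c", "d"], "capitalize b, add d", parent=root)

patch = make_patch(child.parent.snapshot, child.snapshot)  # commit -> patch
rebuilt = apply_patch(patch, root.snapshot)                # patch + base -> snapshot
```

Going from commit to patch needs the parent link; going from patch to commit needs someone to supply a base. The patch itself carries no parent, and `apply_patch` raises on a base like `["x", "y", "z"]` because the lines it expects are not there.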

Stacked PRs are built on top of commits and branches to gain the advantages of a patch series while retaining the advantages of snapshots. There's no industry consensus on whether they are preferable, let alone how to best implement them (e.g. rebase, rebase+squash, merge-squash, merge+first-parent, etc.), so different people have different ideas about what they look like. It isn't correct to say they are just a series of commits, because sometimes they are implemented as a series of (implicit or anonymous) branches. One of the few agreed-upon features of stacked PRs (under any of their names) is that they form a sequence of 'changes' which is both ordered to satisfy change dependencies and broken into smaller pieces that "tell a story" to reviewers.
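The "ordered to satisfy change dependencies" part can be sketched as a topological sort. This is only an illustration with made-up change names, not how any particular stacked-PR tool works; it uses `graphlib` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical stack: each change maps to the set of changes it depends on.
depends_on = {
    "refactor-parser": set(),
    "add-api": {"refactor-parser"},
    "use-api-in-cli": {"add-api"},
    "document-api": {"add-api"},
}

# Any order in which every change comes after its dependencies is a
# valid review order; static_order() yields one such order.
review_order = list(TopologicalSorter(depends_on).static_order())
```

The last two changes depend only on `add-api`, so either may come first. Picking the order that tells the better story is the part left to the author.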