Literate Commits (2016)

52 points | pcorey | 6 years ago | petecorey.com

14 comments

simonw|6 years ago

I went through a phase a few years ago of writing ESSAYS in my commit messages, on the basis that they were the only form of documentation which I trusted to stay 100% synchronized with the code.

Eventually I realized that there's a better way to do this: keep your documentation in the same repo as your code, and construct commits that update the documentation AND the tests AND the code all in the same unit. Now your commit message can be much shorter, but you still guarantee that the documentation is exactly aligned with the code that it talks about.

It also means you can add unit tests that check that code is covered by your documentation! https://simonwillison.net/2018/Jul/28/documentation-unit-tes...
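As a rough illustration of that idea (a minimal sketch, not the code from the linked post): a test can list a module's public functions and fail if any name never appears in the documentation. The `example` module and the `DOCS` string below are hypothetical stand-ins for a real package and its docs file.

```python
import inspect
import re
import types


def public_functions(module):
    """Return the sorted names of public functions defined in the module."""
    return sorted(
        name
        for name, obj in vars(module).items()
        if inspect.isfunction(obj) and not name.startswith("_")
    )


def undocumented(module, docs_text):
    """Return public function names that never appear in the docs text."""
    return [
        name
        for name in public_functions(module)
        if not re.search(rf"\b{re.escape(name)}\b", docs_text)
    ]


# Hypothetical module with two public functions and one private helper.
example = types.ModuleType("example")
exec(
    "def parse(text): return text.split()\n"
    "def render(tokens): return ' '.join(tokens)\n"
    "def _helper(): pass\n",
    example.__dict__,
)

# Hypothetical documentation that mentions only one of the functions.
DOCS = "Call parse(text) to split the input into tokens."

missing = undocumented(example, DOCS)
print(missing)
```

In a real repo the docs text would presumably be read from the documentation files, and the test would assert that `missing` is empty, so a commit that adds a function without documenting it fails CI.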

lytedev|6 years ago

Just to bring up a neat point, this doesn't guarantee that the documentation and the code are the same, as you could change code without updating documentation.

A very slick way of enforcing this that I've seen is having actual unit tests in your documentation (and of course a CI tool to enforce that tests pass). This is something that I think is extremely cool about Elixir! It forces you to think about testing, documentation (with ACTUAL examples!!), and the problem you're solving all at once. It's a very neat concept.

https://elixir-lang.org/getting-started/mix-otp/docs-tests-a...
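Python's standard `doctest` module offers something similar to the Elixir feature linked above. The sketch below is only an illustration: `slugify` and `run_examples` are made-up names, and a real project would more likely run doctests through its test runner.

```python
import doctest


def slugify(title):
    """Turn a title into a URL-friendly slug.

    >>> slugify("Literate Commits")
    'literate-commits'
    >>> slugify("  Keep it   Simple ")
    'keep-it-simple'
    """
    return "-".join(title.lower().split())


def run_examples(func):
    """Run the examples embedded in func's docstring and return the results."""
    parser = doctest.DocTestParser()
    # The examples only need the function itself in their namespace.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)  # TestResults(failed, attempted)


results = run_examples(slugify)
print(results.failed, results.attempted)
```

If someone changes `slugify` without updating the examples in its docstring, `results.failed` becomes non-zero, which is the property a CI step would check.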

j88439h84|6 years ago

Why put the class docs in a separate .rst instead of just leaving it in the class docstring? I suppose there's syntax highlighting, though that seems like an editor issue.

mannykannot|6 years ago

One of the biggest difficulties in understanding a code base is in identifying the implicit contracts and constraints that exist between the various parts. Each time a feature is added, a fix is made, or some aspect is refactored, some information about the set of things involved in the implementation of that aspect is revealed. If your commits are usually each about one thing only, and their comments simply allow you to identify all the commits needed for a given change, you have already gone a long way towards making that information useful - it's the single-responsibility principle applied in a different area.

shadowfiend|6 years ago

I poked at this concept myself with a short-lived old side project called sophocles: https://github.com/Shadowfiend/sophocles . At the time I was experimenting with displaying the diff and the commit message side-by-side, as well.

This post says:

> I’m not advocating using literary commits in real-world software

But I think variations of the underlying thinking are very important for real-world software, even if it's not the exact same thing. Brain dumped this around the time I was working on sophocles at http://shadowfiend.posthaven.com/using-commit-messages-for-d... , and it's still a core part of how I develop software and how I want the teams around me to as well.

tobr|6 years ago

This seems very sensible, and completely at odds with the common “best practice” of keeping commit messages short. I find that the best way to understand why a piece of code is the way it is is not through comments in the code, but by looking at history and commit messages. They are comments where it’s completely clear what code they describe, even if it’s spread out across the code base, and even if the code has since been changed to the point where the comment is no longer applicable.

UK-AL|6 years ago

Commit titles should be short.

Commit messages are often small essays, especially in the Linux kernel.

JoelMcCracken|6 years ago

I've thought about this a good bit independently, and I want to be able to do a few things:

- Add line-change/diff-specific commentary. This needs to be within the context of a larger commentary/set of changes, as usually the changes are not independent, and in fact relate to simultaneous constraints which inform your chosen solution.

- Add some structure to a list of commits in a branch. I'd like a hierarchy, in fact. I've thought about trying to emulate this by making sub branches and then merging them into a parent branch, but it seemed like it would be a big pain.

Anyway, I'd love to see some software that could help with this!

jostylr|6 years ago

I wonder if the Peter Norvig comment mentioned in that article about the different pathways is what led to my idea of stepping past literate-programming and into a concept I call pieceful-programming; I had just had back surgery when I read the John Cook article and perhaps it filtered deep in the subconscious.

While I am still working on my pieceful implementation, the hope is to very much elevate the role of the pieces and ideally generate graphs relating to all the pieces that might have code, tests, various documentations related to the pieces, all being woven into their separate compilation targets.

stinos|6 years ago

Sort of an interesting idea, and I wholeheartedly agree with

> ensuring that each and every commit serves a singular purpose and adds to the narrative history of the project has done wonders to reduce thrash and the introduction of “stupid mistakes”.

> Read in chronological order, these commits should paint a clear picture of how and why this particular code came into existence and how it has changed over the course of its life.

I'm not sure this is particular to 'literate' commits; to me it sounds more like common sense. But yes, that is how I like commit history.

The example https://github.com/pcorey/delete-occurrences-of-an-element-i..., even though the OP mentions it's a toy, illustrates this pretty well: without knowing much about the code, you can just read the commit messages. The style of the text is a bit too fuzzy for my liking at times (i.e. lots of words just to form nice phrases, making points that shorter, more to-the-point sentences could make), but there is a much larger problem (imo): the first line of each commit makes no sense whatsoever with respect to what actually changed in the code. That makes it impossible to see at first sight what is happening and what a particular commit does, at least in git user interfaces that use the first line in their commit lists (i.e. the vast majority, I think). If that is an artifact of literate commits, then sorry, but that is just not very good. Here's an excerpt:

    # Getting Real
    # Forging Ahead
    # Keep it Simple
    # Our First Test
    # Take What We're Given
    # Laying the Groundwork

Suppose you come back to this knowing you changed something in a recent commit and want to look it up. My strategy is then usually to just glance over the history for a few seconds to see if I can immediately find the commit I'm looking for before starting an actual search. Works very, very often. Not so much in this case. Compare that to what I'm looking at right now in the repository of a team I work with:

    converters: Fix nullable double getting converted into NaN
    plot: Use assembly reference instead of project reference
    plot: Update code style and formatting
    network: Add functions for element acces to Python scripting API

The body of the commit message then goes on to describe more detail, and usually also why the change was made, thereby creating the narrative the OP talks about. The added benefit is that the message header gives pretty good insight into what a commit does, providing a kind of index to the book being written and making things much easier to discover.
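A convention like this can also be checked mechanically. The following is a minimal sketch under assumed rules (a lowercase area prefix, a capitalized summary, at most 72 characters); `SUBJECT_RE`, `MAX_LEN`, `check_subject` and `index_by_area` are hypothetical names, not the team's actual tooling.

```python
import re

# Assumed convention: "<area>: <Capitalized summary>", at most 72 characters.
SUBJECT_RE = re.compile(r"^(?P<area>[a-z][a-z0-9_-]*): (?P<summary>[A-Z].*\S)$")
MAX_LEN = 72


def first_line(message):
    """Return the subject (first line) of a commit message, or ''."""
    lines = message.splitlines()
    return lines[0] if lines else ""


def check_subject(message):
    """Return a list of problems found in the commit subject."""
    subject = first_line(message)
    problems = []
    if len(subject) > MAX_LEN:
        problems.append(f"subject longer than {MAX_LEN} characters")
    if not SUBJECT_RE.match(subject):
        problems.append("subject is not of the form 'area: Summary'")
    return problems


def index_by_area(messages):
    """Group summaries by area prefix, giving a rough index of the history."""
    index = {}
    for message in messages:
        match = SUBJECT_RE.match(first_line(message))
        if match:
            index.setdefault(match["area"], []).append(match["summary"])
    return index


history = [
    "converters: Fix nullable double getting converted into NaN",
    "plot: Use assembly reference instead of project reference",
    "plot: Update code style and formatting\n\nBody with the details and the why.",
    "# Getting Real",
]
for message in history:
    print(first_line(message), "->", check_subject(message) or "ok")
print(index_by_area(history))
```

Run as a commit-msg hook or a CI step, a check like this keeps the headers usable as an index while leaving the body free for the longer narrative.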

pcorey|6 years ago

Yeah, I agree that the subjects of those commits are far from helpful or descriptive. To be honest, they were written as cheeky subheaders for the generated example article, and don't even do a good job there.

prothinker|6 years ago

[deleted]