top | item 30258530

Django: Reformatted code with Black

284 points| tams | 4 years ago |github.com

245 comments

order

samwillis|4 years ago

I believe from memory Django decided to move to using Black back in 2019 [0] but delayed the change until Black exited Beta. Black became none beta at the end of January [1].

This was finally merged to the main branch today [2].

I suspect there are lots of other both open source and private projects that are also making the change now. This is a show of confidence in Black as the standard code formatter for Python.

0: https://github.com/django/deps/blob/main/accepted/0008-black...

1: https://news.ycombinator.com/item?id=30130316

2: https://github.com/django/django/pull/15387

dwightgunning|4 years ago

This is right. Black emerging from beta was discussed on the Django mailing list in the last week or so, and triggered the work.

bwhmather|4 years ago

Shameless plug: For people who like black, I've been working on ssort[0], a python source code sorter that will organize python statements into topological order based on their dependencies. It aims to resolve a similar source of bikeshedding and back and forth commits.

0: https://github.com/bwhmather/ssort

drcongo|4 years ago

This is relevant to my interests. We have an internal code style guide at my company that includes guidelines for order of class statements, roughly matching yours. I have one pet peeve that made me write the style guide in the first place - Django's `class Meta` which we always have at the top of the class because it contains vital information you need to know as a programmer, like whether this class is abstract or not. Whenever I have to work with an external Django codebase and find myself scrolling through enormous classes trying to find the meta my blood pressure rises.

atoav|4 years ago

Some illustrative before-after syntax-highlighted code segments would be a nice addition for the readme.

VWWHFSfQ|4 years ago

Looks cool but it seems like it might still need some work?

I tried it on one of my Django `admin.py` files and it created NameErrors.

    class TestAdmin(admin.ModelAdmin):
      list_filter = ("foo_method",)

      def foo_method(self, obj):
        return "something"

      foo_method.short_description = "Foo method"

    # It turned it into this:

    class TestAdmin(admin.ModelAdmin):
      list_filter = ("foo_method",)

      # NameError
      foo_method.short_description = "Foo method"

      def foo_method(self, obj):
        return "something"

stjohnswarts|4 years ago

This sounds like a living hell if you use git diff a lot to compare for small changes that might introduce a bug? which is what happens at work all the time since our unit test and CI are a joke. Not dumping on your project but the idea of that much of a change up of the code scares the dickens out of me.

aaronchall|4 years ago

Does it put high-level business logic at the top or the implementation details at the top?

Which is preferable, and why?

BiteCode_dev|4 years ago

Very interesting, especially the method order part. I dislike the order you chose, and yet, I would be tempted to use it on my projects anyway, because being congruent is so important to me.

BeFlatXIII|4 years ago

Thanks for sharing this. When solo coding, I tend to dump new classes and functions wherever is physically closest to where I was previously editing. It makes sense in the moment so I don't disrupt my train of thought by jumping all over the file, but then is a confusing ball of mud when I need to return to the project after time off. Was the shortest scroll direction up or down when I implemented it? etc…

jansky|4 years ago

this is great. Imagine i declare global variable which is used in function which is defined AFTER this global variable is declared (filled by value) and then function is executed later. Why does ssort put my declaration/filling of global variable before that function declaration?

def myfunc(): global globalvar str(globalvar)

globalvar='abc'

myfunc()

will be transfered to

globalvar='abc'

def myfunc(): global globalvar str(globalvar)

myfunc()

I understand why is it done but i dont want to have function definition block filled with this declaration of variables (which i do later) since it has no impact to my code and it makes is just a bit "cleaner". Dont tell me to not use global variables :D

mulmboy|4 years ago

Sounds interesting and perhaps novel. Might help if there were an example or two in the readme - as it is I still don't exactly know what this is.

dopeboy|4 years ago

Very cool - I'll be following this.

mrtranscendence|4 years ago

I've been using black at work for over a year now. I don't much care for some of the choices it makes, which can sometimes be quite ugly, but I've grown used to it and can (nearly) always anticipate how it will format code. One nice side effect of encouraging its use is how, at least where I work, it was very common to use the line continuation operator \ instead of encompassing an expression in parentheses. I always hated that and black does away with it.

What I don't much care for is reorder-python-imports, which I think is related to black (but don't quote me). For the sake of reducing merge conflicts it turns the innocuous

from typing import overload, List, Dict, Tuple, Option, Any

into

from typing import overload

from typing import List

from typing import Tuple

from typing import Option

from typing import Any

Ugh. Gross. Maybe I'm just lucky but I've never had a merge conflict due to an import line so the cure seems worse than the disease.

Edit: Just to be 100% clear: this is python-reorder-imports, not black. I thought they were related projects, though maybe I'm wrong. Regardless, black on its own won't reorder imports.

Joeboy|4 years ago

    from typing import (
        overload,
        List,
        etc...,
    )
would seem more sensible to me. I know you can make isort do that, I guess maybe not black.

magnusmundus|4 years ago

Really? I just put that exact line in a file I'm working on, and black didn't change anything. Maybe you mean in case it exceeds the line length limit, rather than that specific example.

In any case, you can wrap those in parentheses, in which case black will just enforce its usual tuple formatting: single line if it fits; one line per item if not, with a trailing comma.

edit: I tried it on a long line with a backslash break, and black wrapped the imports in parentheses like I suggested above. I wonder what causes the behaviour you see on your end.

heavenlyblue|4 years ago

Why do you even care? I never look at that part of the code. If PyCharm automatically removed/added imports without me managing them I would be a happier person.

BeFlatXIII|4 years ago

I have a bad habit of

from typing import *

because I get annoyed at having to change my imports each time I need a new type in my type annotations.

steve_taylor|4 years ago

At least it doesn’t have a dishonest name such as Prettier, which turns perfectly good looking code into digital vomit.

VBprogrammer|4 years ago

Reading some of the comments here it's become clear to me that the next stage in the development of auto-formatters is to have the formatter commit the code as a canonical format but to display the code to each individual contributor in the style of their choosing. Thus removing all kinds of arguments about whether 80 or 120 columns is the one true width.

mulmen|4 years ago

You're absolutely right of course.

But even now I have to adapt on screen shares when my coworkers are using dark themes before sunset. Can't imagine what that would be like with different code formatting. So the next next step is to display their desktop with my theme.

IOW collaboration tools still have a long way to go.

whalesalad|4 years ago

I think this is the most wonderful part of Lisp. Specifically its homoiconicity, or the fact that the syntax of the program is the program, and yet the syntax (as far as linebreaks, indentation, spaces vs tabs, etc) is completely irrelevant to the meaning of the code.

Ostensibly you could craft a future where what is on disk is not what the user is actually editing - a-la the virtual DOM. And on read/save the developer's preference is used to transform the syntax into their ideal shape. This is trivial in a Lisp, but not so easy in other languages.

wyuenho|4 years ago

I'm pretty sure you can already do that with some scripting. Just write a git alias, say git edit, which will run the work tree copy through your favorite formatter to a temp file and send that temp file to your editor, and a commit hook to rename all the temp files back to the original and format them back to whatever the canonical format is. You can also configure your git diff and stash etc to make them aware of the temp file naming convention. You might even be able to write a script to generate all these aliases. There are some annoying details such as needing a separate command to temporarily move the temp files to the work tree to give your IDE a hand, but totally doable. It's going to take maybe a few weeks of work, but doable for a single person.

bredren|4 years ago

This is the way. I had not considered this before reading your comment.

If this system results in syntactically identical code [1] it should not matter if it’s displaying for you differently [2] if it means you can read or write around or in it more comfortably it’s just a hairstyle.

I was asked to familiarize myself with Replit the other day and it seemed the editor defaulted to two spaces for Python. Two spaces?! I changed it to four.

A friend joined my session and began to code with me, their editor was in the default two space indentation. It was madness.

[1] This seems like is a decent sized presumption across many languages and versions.

[2] This seems like an interesting AI problem, showing code structures you’ve never used in your style you’ve never defined.

gfunk911|4 years ago

You brilliant lunatic

dom111|4 years ago

I've been thinking about this for a while too.

I think that making editors do this is within the realms of feasibility. Most support auto-formatting to your preferred style so it doesn't feel like a leap for it to format to your preferred style but keep the file on disk the project owner's preferred style. I haven't looked extensively to see if this already exists though but we chatted about this at work as I was advocating for use of prettier on a front-end project!

TheRealPomax|4 years ago

The reason to use Black is the same as Prettier on the HTML/CSS/JS side: forever stop having an opinion on code style, it's wasted time and effort. Any "it's not exactly what we want" comment with an attempt to customize the style to be closer to "what we were already using" is exactly why these things exist: by all means have that opinion, but that's exactly the kind of opinion you shouldn't ever even need to have, tooling should style the code universally consistently "good enough". Which quotes to use, what indent to use, when to split args over multiple lines, it's all time wasted. Even if you worked on a project for 15 years, once you finally add autoformatting, buy in to it. It's going to give you a new code style, and you will never even actively have to follow it. You just need to be able to read it. Auto-formatting will do the rest.

wraptile|4 years ago

Except Python is a general purpose programming language so it's hard to have 1 shoe fits all solution when style vary based on medium you're working with. Are you making an OOP GUI app? Django? Something that is using loads of long Xpaths?

declnz|4 years ago

Aside: I love a good linter, but as a long-time Python fan I find it sad that Black has so little configuration (yes, I know, but still) and moreover that it often produces code that no human Python dev I know would write...

Python was always meant to look concise / beautiful... (MyPy has also made this trickier too)

Kinrany|4 years ago

People conflate opinionated formats with autoformatting for some reason.

An autoformatter removes 99% effort from formatting code, and that includes code actively being worked on. Autoformatters are incredibly useful.

A standardized format removes effort spent learning to read a new format. That's an hour per format at most.

I don't see any good reasons for an autoformatter to enforce a standard. A standard would work just as well if defined as a specific configuration.

rob74|4 years ago

Well, you'll be surprised to find out that gofmt has exactly zero configuration. Ok, they (wisely in my opinion) decided not to mess with breaking lines automatically, and the job was far easier to do with a new language than with an already-established one where most developers have their long-treasured preferences.

crad|4 years ago

I'll take yapf --style=pep8 formatting over black any day.

alecbz|4 years ago

OOC what are your grips with black's style? I generally find black pretty "beautiful" (concise maybe not as much).

wyuenho|4 years ago

Every time I was tempted to do something like this, I hesitated because I didn't want every other line in every file with my name on a single commit, mostly to avoid making git blame harder than necessary. It would be nice if there was a kind of diffing algorithm that can diff code units *syntactically* across history.

rurp|4 years ago

Not everyone uses PyCharm, but if you do it's really easy to highlight a specific code block and look through the git commit history for that section. I've used it many times for this exact type of problem, trying to find when the last substantive change happened.

To do this just highlight the block, right click, and choose Git > Show History for Selection.

nickysielicki|4 years ago

The best way to do this is to rewrite history with git filter branch / etc and rerun black at every commit. Then everyone nukes their clone and you continue on with the best of both worlds.

The only real downside is you nuke your issue tracker at the same time.

timhh|4 years ago

In my experience it's better to just bite the bullet and do it. Eventually you will do it, so you either screw up git blame for a small codebase with a small amount of history, or wait until it is a large codebase with a large amount of history to screw up.

> It would be nice if there was a kind of diffing algorithm that can diff code units syntactically across history.

There have been quite a few attempts at that though I've only seen them applied to resolving merge conflicts. It would be interesting to try them for blame too.

OJFord|4 years ago

Does the user matter? As long as the commit message is something sensible like 'Autoformat with black' it can be easily ignored when seen, and you can avoid seeing it with blame as simonw suggests.

throwthere|4 years ago

On the flip side you can get an intern to commit. /s.

Probably best to just make a one time git user to do it.

tomp|4 years ago

worst things about Black:

- doesn't respect vertical space - sure, making the code fit on screen might be valuable (though the default width should be at least 120 characters, I mean we're in 2022 after all), but Black does it by blowing up the vertical space used by the code

- spurious changes in commits - if you happen to indent a block, Black will cause lines to break

- Black fails at its most basic premise - "avoiding manual code formatting" - because a trailing comma causes a list/function call to be split over lines regardless of width

throwaway894345|4 years ago

> oesn't respect vertical space - sure, making the code fit on screen might be valuable (though the default width should be at least 120 characters, I mean we're in 2022 after all), but Black does it by blowing up the vertical space used by the code

This is fine with me--I think it makes sense to optimize for readability, and I can read a long vertical list of arguments a lot more readily than a long comma-delineated list.

> spurious changes in commits - if you happen to indent a block, Black will cause lines to break

Is this a generic argument against wrapping lines, or am I misunderstanding something?

> Black fails at its most basic premise - "avoiding manual code formatting" - because a trailing comma causes a list/function call to be split over lines regardless of width

I'm not following this either. If black automatically reformats your code over multiple lines, that doesn't suggest manual formatting. Maybe you're arguing that all code which produces a given AST should be formatted in the same way--this would be cool and I would agree, but black gets us 95% of the way there so to argue that it "fails" is to imply that "0%" and "<100%" are equivalent.

saila|4 years ago

I have to bump up my font size a bit and find 120 characters too wide on a 27" monitor where I need to look at multiple things side by side. It's also harder to read even when viewing a single file.

IMO, < 80 is ideal where possible with an absolute maximum of 99. I think Black's choice of 88 (plus maybe a little more in special cases) is quite good.

skybrian|4 years ago

It's odd that nobody followed Go's formatter in letting developers break lines themselves and mostly fixing indentation and spacing. I thought they made good choices.

wodenokoto|4 years ago

> - Black fails at its most basic premise - "avoiding manual code formatting" - because a trailing comma causes a list/function call to be split over lines regardless of width

Yeah, this one drives me nuts too.

digisign|4 years ago

My monitor is in portrait mode. Even when I used one in landscape, I typically had two windows side by side. So extra-wide lines of code are less readable.

polote|4 years ago

> - Black fails at its most basic premise - "avoiding manual code formatting" - because a trailing comma causes a list/function call to be split over lines regardless of width

Ah that's why `manage.py shell` now split json pasted on several lines, very annoying

codingkev|4 years ago

A little shoutout to a alternative Python formating tool https://github.com/google/yapf (developed by Google).

The built in "facebook style" formating felt by far the most natural to me with the out of the box settings and no extra config.

timhh|4 years ago

I did a blind survey of YAPF vs Black at my work. The results came back as 70% in favour of Black.

Black gives generally nicer output, and also more predictable output because its folding algorithm is simpler. YAPF uses a global optimisation which makes it make very strange decisions sometimes. Black does too, but much less often.

There are also non-style problems with YAPF. It occasionally fails to produce stable output, i.e. yapf(yapf(x)) != yapf(x). In some cases it never stabilises - flip flopping between alternatives forever!

Finally it seems to have very bad worst case performance. On some long files it takes so long that we have to exclude them from formatting. Black has no issue.

In conclusion, don't use YAPF! Black is better in almost every way!

lelandbatey|4 years ago

YAPF is slower than Black for many degenerate cases, a fact I notice most strongly since I use an "auto-format file on file save" extension in my editor. The case I found in particular was editing large JSON schema definitions in Python, as they're represented as deeply nested dictionaries. Black seems to format them in linear time based on the number of bytes in the file, while YAPF seems to get exponentially slower based on the complexity of the hard-coded data structure. It was a niche case, and the maximum slowdown was only ~1-2 seconds, but that editing freeze was quite annoying.

BiteCode_dev|4 years ago

yapf is configurable, and that's why it never won.

daenz|4 years ago

I'm so happy that languages are settling more and more on heavy reformatter usage. I'd like to think it was triggered by Go and gofmt. Working on a team where each engineer has their own personal syntax is not fun.

belval|4 years ago

Indeed, I don't like Black's style, but I prefer working in a Black codebase than one where everyone has their own preference. Having style guidelines in a team is also a great way to remove pointless debates when reviewing PRs.

kaesar14|4 years ago

Go and gofmt definitely pushed a lot of the momentum of the current wave but don't forget to give respect to Ruby / Rubocop where it's due, where the adage of Convention over Configurability has reigned supreme for decades.

MisterTea|4 years ago

> Working on a team where each engineer has their own personal syntax is not fun.

Why did your team not implement a style guide? Not following style is not working as a team and this needs to be addressed.

glacials|4 years ago

Black is slowly creeping into gofmt-level universality in the Python community and it’s great. The next big milestone is a first-party recommendation by python.org itself.

VWWHFSfQ|4 years ago

I'm pretty sure it's a PSF project

spc476|4 years ago

No, the next big milestone is embedding the format style as the syntax of the language. I'm curious as to why Go didn't even do this (they should have, in my opinion, but wimped out and left it to an external tool).

ibejoeb|4 years ago

In general, what are the strategies for large public codebases like this to mitigate supply chain attacks or other source-level attacks?

For clarity, I'm hoping to open us discussion about how we're dealing with massive changesets like this that are difficult to review due chiefly to the breadth of it.

sciurus|4 years ago

For a purely mechanical change like this, someone could run black against the same revision of Django and verify the changes they see locally match the changes in this PR.

fritzo|4 years ago

Interesting! Can you help me imagine attack scenarios? All I can think of is:

- The changeset is authored by a trusted committer but the committer's tools have been locally compromised.

- The public tool itself (e.g. black) has been compromised to automatically create vulnerabilities in difficult-to-review bits of code (a Ken Thompson hack).

jamessb|4 years ago

As a reformatting tool should only change the formatting, you could check that the Abstract Syntax Tree is unchanged. The ast module in the standard library gives access to the AST [1].

[1]: https://docs.python.org/3/library/ast.html

justinmchase|4 years ago

The output does look better but this also just looks like every PR for applying a linter / formatter I've ever seen. Not sure why this is news worthy.

simonw|4 years ago

It's a significant milestone in the adoption of Black by influential projects within the Python ecosystem, which makes it a good hook for discussing the idea that Black, now stable, is becoming established as the standard for code formatting for Python.

owaislone|4 years ago

Using black is not about how the code looks but to eliminate an entire suite of review comments/discussions. Everyone simply runs black over all code before submitting and no one ever comments about how anything is formatted.

VWWHFSfQ|4 years ago

So now when you look at the annotated change history all you're going to see is a bunch of changes by the person that reformatted the code instead of the person that wrote it.

Cthulhu_|4 years ago

There's workarounds that others have mentioned, but indeed, the unfortunate side-effect of deciding to apply a formatter is a 'formatting' commit, causing a lot of code churn and issues if naively using git blame.

But, it's a "rip the plaster off" kinda thing, because it should ensure a lot less churn, inconsistent code style, or arguments and reviews about formatting after this is merged. It frees up a lot of headspace and distractions in code reviews. I don't know about you, but when I did code reviews I'd always end up zooming in on code style issues - ' vs ", things on newlines or no, JS objects with stringed keys, etc.

alecbz|4 years ago

Uh so is your take "don't do broad refactors ever?"

Beyond `.git-blame-ignore-revs` (which is neat and TIL), in GitHub's web viewer, if you find the line you're interested in and see that the most recent PR is a reformat, you click the "view blame prior to this change" button. I think most blame viewers do (or at least should) have a feature like this.

justinmchase|4 years ago

You can see both of course. That's the beauty of history.

rowanseymour|4 years ago

I love this except the use of the default black line length of 88. One of the things I appreciate about gofmt is being trusted with deciding on line breaks.

NAHWheatCracker|4 years ago

I suggested Black to a team I was on a year ago and one developer hemmed and hawed about how he likes to format arrays or something. I didn't win any friends by pointing out that disregarding those personal preferences is part of why I was recommending it.

A year later and it seems to be the default on all projects I'm working on and I'm loving it.

themeiguoren|4 years ago

Autoformatters are hell for 2d arrays of data where the columns have meaning and you want them to be aligned (time series, matrix math). It’s my only real gripe.

jnothing|4 years ago

Why is it impossible to rebase? I didn’t understand the conversation around merging and rebasing

vitorfs|4 years ago

This is such a great news. We've been using Black in the company that I work for the past 3 years or so and it was a game changer for code reviews. Hopefully other open source Python/Django projects will follow the lead.

umvi|4 years ago

What's the point of putting linters into CI? Is the point to fail the build if the code wasn't pre-formatted with i.e. Black? Or is the point to autoformat and autocommit the formatted code?

bckr|4 years ago

> Is the point to fail the build if the code wasn't pre-formatted with i.e. Black?

It's this one

> Or is the point to autoformat

This one is done with pre-commit (which should probably be named pre-push?) hooks

> and autocommit the formatted code?

I don't think this one is done, and I think it's undesirable

selestify|4 years ago

> Is the point to fail the build if the code wasn't pre-formatted with i.e. Black?

It's this. Ensures that anything merged to master keeps the formatting conventions established in the project.

seattle_spring|4 years ago

The former, in my case. Last thing I want is someone merging their own "creative interpretation" of proper formatting.

euler_angles|4 years ago

Had a great experience with black. Only thing I did was change its default line length limit to 120 characters (I was regularly dealing with signal names from source data that were about 90 chars).

wolverine876|4 years ago

Do Black and other autoformatters enable significantly more reusable code and computer-generated code? Formatting is certainly not the only or greatest barrier, but if format is standardized across projects, it's easier to plug and play code from outside.

ReleaseCandidat|4 years ago

I would really appreciate if there would exist exactly _one_ formatter (without any options) per language.

It is way better to deal with ugly formatting as long as it is consistent than with discussions where to put a closing brace/bracket/paren.

MahajanVardhan|4 years ago

I am so sorry, but what is Black? I use django but I have never heard of Black

SoylentOrange|4 years ago

I’ve been using black for about a year and I’m generally a big fan. However my biggest gripe with it is bad VS Code integration.

claytonjy|4 years ago

bad how? i use vscode, I save a file, it reformats on save, that's it.

supreme_berry|4 years ago

“Black” developer refused for a long time to add option to format code with single quotes with very aggressive manners. Now Django devs didn’t see that option for single quotes and code looks unpleasant.

vitorfs|4 years ago

I have always used single quotes for Python code since I start working with it. When I started to adopt Black on my projects it indeed felt weird and the code looked unpleasant. But after a while you get used to it.

Some people make the case that it's easier to write single quotes (well, depending on the keyboard format anyway). For keyboards in the US standard you have to hold the Shift key to write a double quote. But the good thing about Black is that you can still write your code using single quote and when you run the command line utility it will fix/normalize the code to use double quotes.

Nowadays I got so used to it that I even write my Python code using double quotes. And looking at Python code using single quotes looks weird/unpleasant for me.

INTPenis|4 years ago

I reacted to this too, in the changed files tab.

Technically single or double quotes have the exact same meaning in Python. What makes people use single quotes is probably other languages like PHP, Perl and Bash.

I know I've made it a habit to default to single quotes unless I know I need double quotes. So that might be where the habit comes from in the Django project. But it's not actually necessary in python so might as well use the most commonly used type of quote.

pchf|4 years ago

To keep the single quotes, which in my opinion make the code less cluttered and closer to the REPL, I use the pre-commit hook double-quote-string-fixer, in conjunction with black's option skip-string-normalization set to true.

digisign|4 years ago

Use nero or blue instead, which both use single quotes.