Tabs or spaces – Parsing 1B files among 14 programming languages

70 points| nikbackm | 9 years ago |medium.com | reply

122 comments

[+] jasode|9 years ago|reply
After 20+ years of listening to the tabs-vs-spaces debate and considering all the legitimate points that both sides have, many have made the following observation and it's what resonates with me the most:

In an ideal perfect world, _all_ programmers and _all_ text editor tools would use tabs specifically for indentation and spaces specifically for alignment. But we don't live in that perfectly coordinated world, so spaces maintain the most fidelity -- at the expense of programmers not being able to instantly customize the indentation width to 2, 4, 6, or 8.

Therefore, the slight edge goes to spaces even though I find tabs extremely attractive. (That conclusion is in the context of a team of multiple programmers, using multiple languages, multiple text editors. If you're solo and can maintain "tabs discipline", that's a different scenario.)

[+] spinningarrow|9 years ago|reply
Personally, I don't find the value in alignment - indentation is all I do. I especially dislike alignment of this sort:

    var a_variable           = 1;
    var another_variable     = 2;
    var yet_another_variable = 3;
which is just too fiddlesome and in fact makes it _harder_ for me to read.
[+] louthy|9 years ago|reply
I came to this conclusion too after at least 20 years of using tabs; but mostly it was because of moving to white-space significant languages like Haskell, F#, etc. It was just too painful getting everything lined up with tabs and I needed the fidelity of spaces.

Previous to that with C/C++/C# the alignment usually took care of itself through the closing brace and IDE auto-formatting. So tabs was the natural unit of currency there.

Horses for courses I guess.

[+] gepoch|9 years ago|reply
Unless you're using golang, in which case gofmt will correct you. Hence the overwhelming consensus for tabs in the Go row.
[+] CJefferson|9 years ago|reply
My other personal feeling is that it's easy to auto-check we aren't using tabs (just search for tabs, and reject if any are found).

Depending on how you choose to format, it can be arbitrarily hard to check tabs are being used correctly.

For example, many people want to do some space indenting, such as:

    <tab>void my_function(int arg1, int arg2,
    <tab>                 int arg3)
    <tab><tab>{
It's (I believe) impossible for a language-agnostic tool to know this is "correctly tabbed", but (for example)

    <tab>void my_function(int arg1, int arg2,
    <tab>                 int arg3)
    <tab>    {
is not.
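The "search for tabs and reject" check is small enough to sketch. The Python below is only an illustration: the names `find_tabs` and `passes_no_tab_policy` are made up for this example, and it treats any tab anywhere in the file as a violation, which is the strict policy described above rather than any particular project's linter.

```python
def find_tabs(text):
    """Return (line_number, column) pairs, both 1-based, for every tab in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "\t":
                hits.append((lineno, col))
    return hits


def passes_no_tab_policy(text):
    """A file passes the spaces-only policy when it contains no tab at all."""
    return not find_tabs(text)
```

The reverse check ("are tabs used correctly?") has no equally simple form, because the tool would need to know which leading whitespace is indentation and which is alignment.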
[+] Fuxy|9 years ago|reply
Honestly I don't care either way, just as long as the file doesn't mix tabs with spaces.

I hate when the code looks perfectly aligned in my editor but is completely messed up in git because git renders tabs at a different width.

I personally prefer spaces on my own projects because in the early days a tab used to be fixed at 8 spaces and I thought that was a bit too much.

It's adjustable now but I don't think it was then or at least I wasn't aware it was.

[+] w_t_payne|9 years ago|reply
I've always used spaces instead of tabs.

My rationale:

The number of readers of any document will normally be greater than the number of authors. We should give a slight preference for readability and ease of comprehension over writeability, and should certainly not require that our readers adjust their editor settings for each document that they peruse.

In addition, it is useful to use vertical alignment to group related content that spans multiple lines. Although this is at least partly an aesthetic choice, and not without drawbacks, I believe that the benefits outweigh the costs.

I believe that it looks neater, improves readability and also makes some errors "pop" out in a way that they don't in unaligned content. I find the ability to visually group related terms and expressions particularly useful.

The downside is that you have to "tidy up" after using refactoring tools or after doing a search-and-replace operation.

However, I believe that this downside is mitigated due to the fact that the very tidying up that is required is a good way of eyeballing the changes and checking to make sure that they are OK.

http://williamtpayne.blogspot.co.uk/2012/04/spaces-over-tabs...

[+] teddyh|9 years ago|reply
Neither!

http://nickgravgaard.com/elastic-tabstops/

A better way to indent and align code

When I saw this, it was obvious to me that this was the true solution to the problem which both space and tab proponents try to solve.

[+] nwah1|9 years ago|reply
Yes, this seems like the best approach. The key for this to gain support is for Github's diff system to support it.
[+] glandium|9 years ago|reply
Essentially, this is suggesting to implement in text editors what word processors have done for almost forever, with some additional automation.

It's sad that this hasn't seen any traction in the past 10(!) years. It seems so much better than the current mess.

[+] smrq|9 years ago|reply
I love this concept--I'd be using it daily, but last time I tried it the Sublime Text implementation was unusably slow.

Admittedly, I am pretty picky on what's unusable. Also, this page lists it as an "incorrect implementation".

[+] robgering|9 years ago|reply
I've been learning Go in my spare time. One of the things that I've found really refreshing is how the enforced conventions largely eliminate contentious discussions like tabs vs spaces.

For those unfamiliar with Go, the gofmt tool[1] converts indentation to tabs, standardizes brace positioning, etc. Most editors have a plugin that will automatically run the tool on save. An option for programmers using other languages is EditorConfig[2].

[1] https://blog.golang.org/go-fmt-your-code

[2] http://editorconfig.org/

[+] lethargic_meat|9 years ago|reply
Some IDEs will convert tabs to spaces when saving and the other way around when editing, so, while your approach is interesting, it's not really proof of a majority choice.

Also, my beef is especially with people who don't use tabs in files like fstab, like beasts.

[+] claudius|9 years ago|reply
For standard levels of indentation, I get that tabs may be useful, but in /etc/fstab, I absolutely hate them – how many do you put if the individual elements vary in size? Always just one, resulting in e.g.

    this_is_a_long_line<tab>defaults<tab>…
    short<tab>defaults<tab>…
with the following elements not aligning? Or however many are necessary to get things to align, which sort of destroys the semantic meaning?
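To make the dilemma concrete, here is a rough Python sketch that uses `str.expandtabs` to compute where the second field lands. The helper name `field_column` is chosen for illustration only, and the tab widths 8 and 4 are assumptions about how a viewer might be configured.

```python
def field_column(line, field, tab_width):
    """Column at which `field` starts once tabs are expanded to tab stops."""
    return line.expandtabs(tab_width).index(field)


long_row = "this_is_a_long_line\tdefaults"
short_one_tab = "short\tdefaults"        # always exactly one tab
short_three_tabs = "short\t\t\tdefaults"  # padded with tabs until it lines up at width 8

for width in (8, 4):
    print(
        width,
        field_column(long_row, "defaults", width),
        field_column(short_one_tab, "defaults", width),
        field_column(short_three_tabs, "defaults", width),
    )
```

With 8-column tabs the padded row lines up with the long one (both at column 24), but with 4-column tabs they land at columns 20 and 16, so the "however many are necessary" approach only works for one tab width.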
[+] scrollaway|9 years ago|reply
I'd bet most of the code ever written gets styled however the IDE du jour decides to by default.

If some IDEs decided to switch to tab indent by default, people would suddenly love tabs.

[+] mherrmann|9 years ago|reply
I've never really understood why you would use several characters (=spaces) for something that is semantically one indent. Can someone enlighten me? What are the advantages of spaces over tabs?
[+] pyre|9 years ago|reply
Not all editors are smart about tabs and spaces, and when the two start to mix things get really funky if everyone doesn't have the same "1 tab = x spaces" value set on their editor.

For example:

      function_call(really_really_really_really_really_argument1,
                    really_really_really_really_really_argument2)
Say that this is supposed to be at one indent level. The beginning of both of those lines should contain only one tab, and the spacing that positions the second argument from the indent should be only spaces:

    <tab>function_call(really_really_really_really_really_argument1,
    <tab>SSSSSSSSSSSSSSreally_really_really_really_really_argument2)
If the text looked like above (each 'S' is a space), then one's editor could easily change whether or not a <tab> was displayed as 8 spaces or as 4 spaces, and the display wouldn't be affected. Many editors, however, consider all spacing at the start of the line to be indentation. If you were using one of those editors with tabs set to (e.g.) 4 spaces, the code might look something like this:

    <tab>function_call(really_really_really_really_really_argument1,
    <tab><tab><tab><tab>SSreally_really_really_really_really_argument2)
Now the code only looks the way it was meant to look if you have your <tab> = X spaces value set to 4. If you set it to something else, the code looks like a mess.
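The two variants can be compared mechanically. This is a small Python sketch, not taken from any editor: the argument names are shortened, `argument_columns` is a hypothetical helper, and `str.expandtabs` stands in for how an editor renders tabs at a given width.

```python
CALL = "function_call("  # 14 characters, so the alignment needs 14 columns

# Variant A: one tab for indentation, spaces for alignment.
tab_then_spaces = [
    "\t" + CALL + "argument1,",
    "\t" + " " * len(CALL) + "argument2)",
]

# Variant B: an editor with 4-column tabs turned the alignment into tabs as well:
# 14 columns become 3 tabs (12 columns) plus 2 spaces.
all_tabs = [
    "\t" + CALL + "argument1,",
    "\t" + "\t" * 3 + "  " + "argument2)",
]


def argument_columns(lines, tab_width):
    """Column where 'argument' starts on each line once tabs are expanded."""
    return [line.expandtabs(tab_width).index("argument") for line in lines]


for width in (4, 8):
    print(width, argument_columns(tab_then_spaces, width), argument_columns(all_tabs, width))
```

Variant A keeps both arguments in the same column at either width (18 and 18, then 22 and 22). Variant B matches only at width 4; at width 8 the second argument moves to column 34 while the first sits at 22.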
[+] michaelt|9 years ago|reply
Things that are supposed to be lined up stay lined up even if they aren't at the start of a line - even if you:

* Use different tools - IDE, command line, version control tool, code review tool, e-mail client.

* Have code that follows different conventions - because you participate in several projects, or because you've imported third party code that follows a different convention, or because the code is written in different languages.

* Have co-workers with different opinions about what tab width should be set to.

Personally I don't think there's that much value in vertically aligning things, except in lookup tables. Usually it's a sign you've adopted line-length rules that don't reflect modern monitor sizes, or that your methods have too many parameters. But some people seem to think it's valuable.

[+] 67726e|9 years ago|reply
People always talk about consistent display of spaces whereas tabs might be shown as 4 spaces or even 8. Of course that means you can usually customize how tabs are rendered unless you code in Notepad or something crazy.
[+] CJefferson|9 years ago|reply
The big advantage, as other people discuss, is that different editors display tabs as 2, 4 or 8 spaces, and I've never worked on a codebase which used tabs where viewing the code in different tab sizes didn't break the layout somehow.

If you use spaces, you can format code, knowing it will look right on other people's machines. Perhaps a sufficiently organised bunch of programmers could edit a large code base such that it viewed fine with 2,4 or 8 character tabs, but is it worth the pain?

[+] mbrock|9 years ago|reply
Semantically, the TAB character means something like "insert a pseudo-random number of spaces here" with that number often defaulting to some ridiculous value like 8.

I keep my programs less than 80 characters wide for accessibility purposes but if I use tabs then the width is unpredictable.

Also, spaces are universal and trivial whereas the TAB button in many editors has strange behavior.

[+] philipov|9 years ago|reply

    def doSomething( parameter1
                   , parameter2
                   , parameter3
                   ):
        pass

How would you make all the punctuation line up using tabs?
[+] GoToRO|9 years ago|reply
With spaces you can position the cursor anywhere you want, with tabs you can not position the cursor in the middle of a tab.
[+] dvh|9 years ago|reply
I hereby declare end of discussion (at least for me), spaces because most other programmers use spaces. Unless you are joining the project that consistently uses tabs, then use tabs. Otherwise spaces.
[+] jasonkostempski|9 years ago|reply
If greater than 58% of all projects you work on mostly use tabs, then use tabs 91% of the time, in 97% of your projects.
[+] Tharkun|9 years ago|reply
Now, can we look at the same repositories and look at the number of bugs and correlate it with the use of spaces?

But in all seriousness, how often something is used isn't an indication of how good it is. Spaces are like cigarettes. Just don't start.

[+] guessmyname|9 years ago|reply
I couldn't care less about this, and I generally feel out of place when other programmers ask me about my preferences on this topic. I have written code in different programming languages. Taking just Go and PHP, where tabs and spaces are the standard respectively (PHP mostly because of PSR), I simply don't pay attention to the character(s) used for indentation; the tooling already does that for me, gofmt for Go and PHPCS for PHP, and there are probably similar tools for other languages [1]. I never understood why people complain about this more than other (more important?) things in the code, like the position of braces, which is also a flame war between developers but makes more sense than caring about tabs vs. spaces.

[1] I say "probably" because even though I have written Vala, C++, Ruby, Python, and JavaScript, I have always relied on the IDE to automatically select the most common indentation in the project, so I never notice whether I am using tabs or spaces, since hitting the Tab key while using spaces will simply translate it to the correct indentation.

[+] mcos|9 years ago|reply
The biggest problem with the spaces vs tabs debate is that editor presentation is still tightly coupled to file persistence. Imagine an abstraction layer that lets developers see whatever they wish, yet saves files in a standard format; it might negate some of the issues people have.
[+] mbrock|9 years ago|reply
That sounds like a really complex solution to a kind of non-problem.
[+] txutxu|9 years ago|reply
Some Perl projects run by teams use perltidy as a pre-commit hook.

It's not as cool as talking about newer languages, but it handles everything you may need for code indentation and alignment, with tabs or with spaces.

Today somebody was asking on Planet Debian about Haskell vertical code alignment... well, again, perltidy has an option to enable/disable that.

I developed for many years without using it... recently I tried it first-hand, and now I cannot live without it :) It even has options for vertical alignment of indented comments.

I enjoy seeing it in action after an hour or two of coding $anything. It always finds more inconsistencies than I expected. And I really _try_ to be consistent.

[+] TurboHaskal|9 years ago|reply
Perltidy is truly amazing. I haven't seen anything that comes close in terms of power and configurability.
[+] skoczymroczny|9 years ago|reply
Parsing files might not be the best way to measure programmer preferences, because in big projects programmer's preferences will be squelched by the coding standard and/or tab/space cargo culting.
[+] rsaarelm|9 years ago|reply
I wish he had also counted the files that use only spaces or only tabs compared to a mixture of both in the indentation.

My reason for being against tabs is that unless you have something like gofmt, somebody will inevitably screw up and put in indentation that mixes tabs and spaces.

The second thing to check would be tabs that aren't in the initial indentation whitespace of the line. The other inevitable screw-up is using tabs to do some kind of vertical layout that shows up right with exactly one tab stop size.
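Both checks are easy to sketch per file. The Python below is an illustration only, not the method used in the linked analysis: `classify_indentation` and `lines_with_inner_tabs` are hypothetical names, and "indentation" is assumed to mean the leading run of spaces and tabs on each line.

```python
def classify_indentation(text):
    """Classify a file's leading whitespace as 'spaces', 'tabs', 'mixed', or 'none'."""
    saw_space = saw_tab = False
    for line in text.splitlines():
        # The indent is whatever lstrip would remove from the front of the line.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        saw_space = saw_space or " " in indent
        saw_tab = saw_tab or "\t" in indent
    if saw_space and saw_tab:
        return "mixed"
    if saw_tab:
        return "tabs"
    if saw_space:
        return "spaces"
    return "none"


def lines_with_inner_tabs(text):
    """1-based numbers of lines with a tab after the first non-whitespace character."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if "\t" in line.lstrip(" \t")
    ]
```

One caveat: a file that deliberately uses tabs for indentation and spaces for alignment would also be reported as "mixed" by this sketch, since it cannot tell the two uses apart.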

[+] pbiggar|9 years ago|reply
For god's sake man, how many spaces?
[+] gnode|9 years ago|reply
I find it interesting that C++ using the .cc extension has about 7% tabs, whereas C++ using the .cpp extension has about 36% tabs.

I wonder what else is different about these two groups.

[+] brandmeyer|9 years ago|reply
Well, one correlation is that Google's coding style for C++ uses spaces and the .cc extension. It stands to reason that some fraction of Google alumni would continue to use both practices in their own work. I don't know if it's enough to push a 36% tab usage to 7%, but it might account for a solid fraction thereof.
[+] koyote|9 years ago|reply
Probably IDE defaults (i.e. there might be an IDE that defaults to .cpp extension AND tabs).
[+] scraft|9 years ago|reply
I've worked for 10+ years in the games industry, ranging from a 10-man 'indie' studio to 400-person studios (large for the games sector). I am struggling to think of any project I have worked on that hasn't used tabs (including a huge Python project). This may be a completely irrelevant data point, but I wondered if there is a chance the games industry has a natural preference for tabs?
[+] d33|9 years ago|reply
Am I the only one that finds DATA interesting here? Has anyone actually posted the terabyte over BitTorrent or something so I could play with it while avoiding the "Don’t analyze the main [bigquery-public-data:github_repos.contents] table — at 1.5 TB, it will instantly consume your monthly free terabyte."?
[+] milansuk|9 years ago|reply
I would like to know if there are people who have 'switched' (coded with spaces and then used tabs, or the reverse)?
[+] louthy|9 years ago|reply
I have, I mentioned it in my comment above [1]. It's not too long, so I'll paste it again:

"I came to this conclusion too after at least 20 years of using tabs; but mostly it was because of moving to white-space significant languages like Haskell, F#, etc. It was just too painful getting everything lined up with tabs and I needed the fidelity of spaces.

Previous to that with C/C++/C# the alignment usually took care of itself through the closing brace and IDE auto-formatting. So tabs was the natural unit of currency there."

[1] https://news.ycombinator.com/item?id=12398432

[+] jpfed|9 years ago|reply
I have switched from tabs to spaces. I have always thought that tabs were the more straightforward representation- if we're in the spirit of separating presentation from semantics, we might as well use the character that means "indent this thing". But the majority of people think otherwise, and that's probably never going to change.

In unrelated news, I like the Dymaxion projection and think that tau makes more sense than pi, but every last one of my precious ships has sailed. tear

[+] RutZap|9 years ago|reply
I just recently switched from tabs to spaces. I always used spaces for alignment and tabs for indentation. It was working very well within my company as all my colleagues were using 2sp tabs and I was on 4 space tabs. But now, we've decided to upgrade our codebase and make it PSR compliant (PHP).. which meant switching from tabs to spaces.

To be honest.. it's no biggie; after a few weeks you get used to it.

[+] lakkal|9 years ago|reply
Back in the very early 90s, I went through my company's codebase and replaced the mixed indentation with tabs. This was purely to save space (small company, limited disk space). I probably preferred tabs at the time. Some time in the late 90s, though, I started to prefer the predictability and granularity of spaces.
[+] collyw|9 years ago|reply
I used to use tabs until I opened someone else's files that had a mix and were a mess of indentation.