Ask HN: What are the “best” codebases that you've encountered?
392 points| 0mbre | 6 years ago
While I am asking myself this question, the only one that pops into my mind is Laravel: https://github.com/laravel/laravel (PHP)
One could think that a codebase as popular as React (https://github.com/facebook/react) would be a perfect example of "clean code", but at a glance, I personally don't find it very expressive.
This may all be very subjective, but I would love to see examples of codebases that members of this community have enjoyed working with.
[+] [-] tschwimmer|6 years ago|reply
Anyone have an example of a consumer application that has a good codebase? Chromium, GitLab, OpenOffice, etc? I feel like such applications inherently have more spaghetti because the human problems they're aiming to solve are less concretely scoped. Even something as simple as "Take the data from this form and send it to the project manager" ends up being insanely complex and nitpicky. In what format should the data be sent? How do we know who the project manager is? Via what format should the data be sent? How should we notify the project manager? When should we send the report? Some of these decisions are inherently inelegant, so I feel like you get inelegant code.
[+] [-] slondr|6 years ago|reply
[+] [-] westoncb|6 years ago|reply
Also curious about some good, not too hugely sized, game code (preferably something not written in C/C++, maybe like an indie game from the past decade or so). Anyone know something?
[+] [-] aidos|6 years ago|reply
But yeah, Postgres also gets my vote. I guess there’s a bit of a bias there because devs are likely to read the code of the tools they use, either to track down bugs or just to understand how it works.
[+] [-] sharkjacobs|6 years ago|reply
https://github.com/brentsimmons/NetNewsWire
[+] [-] christophilus|6 years ago|reply
[+] [-] bamboozled|6 years ago|reply
[+] [-] nstart|6 years ago|reply
Just to drive the point home, I was developing in Python and with no knowledge of Ruby, I was able to go through code using just github and I got what I wanted every single time.
[+] [-] wp381640|6 years ago|reply
[+] [-] laumars|6 years ago|reply
[+] [-] blueprint|6 years ago|reply
[+] [-] lewaldman|6 years ago|reply
[+] [-] hugofirth|6 years ago|reply
[1]: https://github.com/postgres/postgres
[+] [-] jakewins|6 years ago|reply
sqlite is much less complex, but similarly approachable.
In more recent examples, I think you see a lot of this same reader-centric pragmatic ethos in many Go projects. The Kubernetes codebase comes to mind as a very large tome that remains approachable. And the Go stdlib, of course.
Java generally falls on the opposite side, but there are counterexamples. A lot of Martin Thompson's code eschews Java "best practices" in favor of good code. Seeing competent people in the Java space "break the rules" helps, though of course Java is forever hampered by having internalized illegible patterns as best practices in the first place.
It's a shame because at least the OpenJDK implementation of the standard library in Java is generally quite good, especially around the concurrency parts. Clean, easy to follow, reasonable comments. But of course that's Java written by C developers, mostly.
[+] [-] rurban|6 years ago|reply
A special mingw tool to create importlibs is/was broken on 64bit. I think it was called dlltool. Normally you'll just need to add a flag to the linker to create that.
So no, not PostgreSQL.
[+] [-] ekidd|6 years ago|reply
xsv: https://github.com/BurntSushi/xsv
ripgrep: https://github.com/BurntSushi/ripgrep
His code typically has extensive tests, helpful comments, and logical structure. It was fun trying to imitate his style when writing a PR for xsv.
The Quake 2 engine was also pretty interesting: It was almost totally undocumented, and it had plenty of weird things going on. But I could count on the weird things being there for a reason, if only I thought about it long enough.
[+] [-] nindalf|6 years ago|reply
[+] [-] mehrdadn|6 years ago|reply
[+] [-] bitwize|6 years ago|reply
Also -- the source code to Doom. Read it, marvel at its clarity and efficiency -- and then laugh when you realize that the recent console ports were completely rewritten in fucking Unity. And the Switch version chugs, despite the original running well on 486-class hardware.
[+] [-] Wowfunhappy|6 years ago|reply
I wonder why they didn't just write an emulator, then. Especially on the Switch if there are performance issues.
[+] [-] cameronbrown|6 years ago|reply
[+] [-] hermitdev|6 years ago|reply
I found the source very approachable. Source was well laid out and fairly clear. Some of it was subjectively a bit ugly to just look at, but when you read it, it was very clear.
Couldn't use glibc as a reference because this is in a closed-source commercial product and, well, GPL.
[+] [-] ashafer|6 years ago|reply
[+] [-] jihadjihad|6 years ago|reply
For Python, I really like how SQLAlchemy is written and designed.
For Rust, ripgrep stands out as a sterling example of how to write a powerful low-level utility like that.
[+] [-] rsweeney21|6 years ago|reply
Windows is quite an engineering achievement. We didn't prioritize readability or "clean code". All the variables used hungarian notation, so you had horrible names like lpszFileName (lpsz = long pointer to a zero terminated string) or hwndSaveButton (window handle). You also had super long if(SUCCEEDED(hr)) chains that looked like your code was spilling down a staircase. Oh yeah, and pidls (pronounced "piddles" and short for "pointer to an id list") used for file operations.
What made the code base beautiful was the extreme lengths we went to to be fast and keep 3rd party software working. WndProcs seem clunky, but they are elegant in their own way and blazingly fast. All throughout the code base you would find stuff like "If application = Corel Draw, don't actually free the memory for this window handle because Corel uses it after sending a WM_DESTROY message."
The fact that thousands of people worked on the code base was mind boggling.
[+] [-] inlined|6 years ago|reply
1. I think I counted 5 string implementations in active use and code at the boundary had to convert between them all.
2. The SUCCEEDED macro is a mask against HRESULT but who the hell actually uses non-zero HRESULTS to communicate domain-specific success codes? And don’t forget that POSIX APIs return 0-for-non-error ints and COM APIs can use S_OK (0, to be a non-error) and S_FALSE (1), so you have to flip them for real bools. Or have if (bResult == S_OK)
3. Nobody wanted to touch old codebases. I fixed an assert in Trident layout code because a whole library used upper-left, lower-right input (and params called ul, lr) but one function (contrary to docs) used upper-left, width, & height. When I fixed the library and 2/3 of the call sites, I was called arrogant and told to revert the changes in the library and change the last 1/3 to also have the inverse bug in its call site.
4. Another Trident API (written by an intern) had a tree where fastInsert() could only be called after slowLookup(), but nothing in the API enforces this.
5. Every COM object decides whether it’s faster or thread-safe by whether the refcount uses atomic ops or just --/++
6. Saw parallel arrays in files where a struct held an object which might have suffered the slicing problem on insert. Another struct field held an index into the array of sliced parts. Users rehydrated. This wouldn’t happen with an object pointer, but indirection was unacceptable because the author didn’t trust the small allocation heap’s locality.
7. My codebase included a whole C++ runtime because my core-OS team didn’t trust msvcrt.dll because the shell team wrote it.
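Point 2 above can be made concrete with a small sketch. It is written in Python rather than C so that it runs anywhere, and it assumes the standard winerror.h values S_OK = 0, S_FALSE = 1 and E_FAIL = 0x80004005. The helper names `succeeded` and `to_bool` are made up for illustration. It models SUCCEEDED as "severity bit clear", which is why S_FALSE counts as a success even though it means "false":

```python
# Standard HRESULT values (winerror.h), written as unsigned 32-bit numbers.
S_OK = 0x00000000
S_FALSE = 0x00000001
E_FAIL = 0x80004005


def succeeded(hr: int) -> bool:
    """Model of the SUCCEEDED macro: success means the top (severity) bit is clear."""
    return (hr & 0x80000000) == 0


def to_bool(hr: int) -> bool:
    """Hypothetical helper: collapse a boolean-style HRESULT into a real bool."""
    if not succeeded(hr):
        raise OSError(f"call failed with HRESULT 0x{hr:08X}")
    # S_OK (0) means "true" and S_FALSE (1) means "false": the reverse of a C bool.
    return hr == S_OK


results = {
    name: succeeded(value)
    for name, value in [("S_OK", S_OK), ("S_FALSE", S_FALSE), ("E_FAIL", E_FAIL)]
}
```

Here `results` marks both S_OK and S_FALSE as successes, so a bare success check cannot tell "yes" from "no"; only the explicit comparison in `to_bool` can.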
[+] [-] breck|6 years ago|reply
[+] [-] criddell|6 years ago|reply
[+] [-] tempguy9999|6 years ago|reply
> ...like lpszFileName (lpsz = long pointer to a zero terminated string)
I remember those.
AIUI Hungarian gives you some kind of typing. The typing is done by humans using the names. The humans have to get it right; they are the typecheckers.
The first thing I'd do is offload the typechecking onto an automatic framework - the idea of letting people do a computer's job is madness. It would not have been too hard to do (relatively very cheap for a large codebase like an OS), I think, and would have allowed the Hungarian prefixes to be dropped because they'd become redundant, and strengthened and sped up typechecking. So where is the flaw in my thinking?
(aside: one of my first contract jobs was working in pascal (delphi actually). The company I worked for had coding standards cos you need standards, don't you. It was to prefix every integer with i_, every float with f_, every int array with ai_, et cetera. As pascal was strongly typed this was totally pointless).
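To illustrate what offloading the prefixes onto a typechecker could look like, here is a minimal sketch in Python using `typing.NewType`. The names `WindowHandle`, `FileName` and `window_title` are hypothetical and have nothing to do with the actual Windows code base. The distinction that a prefix like hwnd or lpsz carries in the name is carried by the type instead, so a static checker such as mypy does the checking rather than a human reader:

```python
from typing import NewType

# Distinct static types that cost nothing at runtime:
# a WindowHandle is still an int and a FileName is still a str when the program runs.
WindowHandle = NewType("WindowHandle", int)
FileName = NewType("FileName", str)


def window_title(handle: WindowHandle) -> str:
    """Hypothetical function that only makes sense for window handles."""
    return f"window #{handle}"


save_button = WindowHandle(42)    # was: hwndSaveButton
file_name = FileName("report.txt")  # was: lpszFileName

title = window_title(save_button)

# A static checker rejects the mix-up below before the program ever runs,
# which is the job the Hungarian prefix asked humans to do by eye:
# window_title(file_name)
```

Note that this only helps when a checker is actually run; Python itself does not enforce the annotations at runtime.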
[+] [-] cryptica|6 years ago|reply
[+] [-] sea6ear|6 years ago|reply
I liked that it was actually possible to read it and understand what was going on.
In a similar vein, P. J. Plauger's version of The Standard C Library is nice because even if it might not be especially optimized(?), you can actually read the code and understand the concepts that the standard library is based on.
Software Tools by Kernighan and Plauger would also be great except that you have to translate from the RatFor dialect of Fortran or Pascal to use the code examples.
Even so, I used its implementation of Ed to create a partial clone in PowerShell that let me do remote file editing on Windows via PowerShell when that was the only access available.
So even over 4 decades and various operating systems removed, there are still concepts in there that are useful.
Jonesforth is also a great and mind blowing code base although I'm not sure where the canonical repository is currently.
[+] [-] quadcore|6 years ago|reply
I think a common misconception amongst mid-experienced programmers is that they confuse look with quality. Reading cleanly written code gives you a feeling of control and also the feeling that someone must have thought about that program. It's reassuring. You have in front of you code that inspires trust.
When in fact, that code can be complete garbage.
The look of the code doesn't matter; what matters is the program, in the abstract meaning of the term. You don't judge code by reading it, but by running it in your head. Granted, you have to understand it in order to do that. Once you understand the code, you run it in your head, and that's when quality enters the scene, because running code in your head is what you do all day when you code. Some say that you spend most of your time reading code. That's simply not true: the effort is definitely not in reading but in running the code in your head. Basically, what I'm describing is a 2 by 2 matrix where there is one column for looks bad, one for looks good, one row for runs badly in the head, and one for runs smoothly in the head. Granted, the best may be when the code both looks right and runs right, but don't be mistaken: the really important and difficult part is whether or not it runs well in the head.
A poor quality program may look good but not run well in the head. It's too complex or too confusing (in terms of logic, not in terms of presentation) or convoluted or simply wrong in terms of what it's supposed to do. On the other hand, good quality code is code that surprises you by the way it runs. It's beautiful in terms of simplicity, it delivers a lot, and it's small so that it fits well in the coder's head. And it may look like garbage, which is not so important.
You may wonder how to quickly gauge the quality of a code base. Run part of it in your head. Contemplate the machinery. Try not to think too much about the language and how it's constructed in this language; try instead to contemplate it in an abstract manner. Be critical, and critique your own critiques.
[+] [-] omarhaneef|6 years ago|reply
In github, rather than see what has changed, it would be interesting if there was a comment that told you what the folder contained.
edit: Relevant here because the best codebase for me is one where I can understand the folder structure, but that is a sort of 0th order effect that should be equalized with some tool.
[+] [-] cryptica|6 years ago|reply
It takes a lot of investment from a developer before they can appreciate the beauty of the code... To make matters more confusing, a lot of developers tend to become extremely attached to even horrible code if they spend enough time working with it; it must be some kind of Stockholm syndrome.
I think the problem is partly caused by a lack of diversity in experience; if a developer hasn't worked on enough different kinds of companies and projects, their understanding of coding is limited to a very narrow spectrum. They cannot judge if code is good or bad because they don't have clear values or philosophy to draw from to make such judgements. If you can't even separate what is important from what is not important, then you are not qualified to judge code quality.
If you think that the quality of a project is determined mostly by the use of static vs dynamic types, the kind of programming paradigm (e.g. FP vs OOP), the amount of unit test coverage and code linting, then you are not qualified to judge code quality.
I think that the best metric for code/project quality is simply how much time and effort it takes for a newcomer to be able to start making quality contributions to the project. This metric also tends to correlate with robustness/reliability of the code and also test quality (e.g. the tests make sense and they help newcomers to quickly adapt to the project).
As developers, we are familiar with very few projects. If a developer says that they like React or VueJS or Angular, etc... they usually have such a limited view of the whole ecosystem that their opinion is essentially worthless; and that's why no one ever seems to agree about anything. We are all constantly dumbing down everything to the lowest common denominator and regurgitating hype. Hype defies all reason.
It's the same with developers; most developers (especially junior and mid-level) are incapable of telling who is actually a good developer until they've worked with them for about 6 months to a year.
If you are not a good developer, you will not be able to accurately judge/rank someone who is better than you at coding until several months or years of working with them. Sometimes it can take several years after you've left the company to fully realize just how good they were.
[+] [-] oaxacaoaxaca|6 years ago|reply
[+] [-] bijection|6 years ago|reply
Tellingly, Marijn Haverbeke, Codemirror's creator, is also the author of the excellent 'Eloquent Javascript' [1].
[0] https://github.com/codemirror/codemirror
[1] http://eloquentjavascript.net/
[+] [-] luminati|6 years ago|reply
[1] Interesting Codebases: https://news.ycombinator.com/item?id=15371597
[2] Show HN: Awesome-code-reading - A curated list of high-quality codebases to read https://news.ycombinator.com/item?id=18293159
[+] [-] hartator|6 years ago|reply
[+] [-] kostarelo|6 years ago|reply
Small summary of the features I liked:
- Simple documentation
- Intuitive structure
- Lots of JS best practices, but still simple
- Event-driven architecture
- A simple API gateway that will just fire events to workers
- Properly divided workers (kind of microservices but with lots of shared code)
- Monorepo
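As a rough illustration of the "API gateway that will just fire events to workers" idea in the list above, here is a minimal in-memory sketch in Python. It is not Spectrum's code (which is JavaScript); the `EventBus` class, the `message.created` event name and the two workers are all hypothetical, and a real system would use a persistent job queue instead of a deque:

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Tuple

Worker = Callable[[dict], None]


class EventBus:
    """In-memory stand-in for a real job queue (hypothetical)."""

    def __init__(self) -> None:
        self._pending: Deque[Tuple[str, dict]] = deque()
        self._workers: Dict[str, List[Worker]] = defaultdict(list)

    def subscribe(self, event: str, worker: Worker) -> None:
        self._workers[event].append(worker)

    def fire(self, event: str, payload: dict) -> None:
        # Fire and forget: the caller never waits for the workers.
        self._pending.append((event, payload))

    def drain(self) -> int:
        """Deliver every queued event to its workers; return the number of events processed."""
        processed = 0
        while self._pending:
            event, payload = self._pending.popleft()
            for worker in self._workers[event]:
                worker(payload)
            processed += 1
        return processed


bus = EventBus()
notifications: List[str] = []
search_index: List[str] = []

# Two independent workers react to the same event and share no state.
bus.subscribe("message.created", lambda p: notifications.append(f"new message from {p['author']}"))
bus.subscribe("message.created", lambda p: search_index.append(p["text"].lower()))


def post_message(author: str, text: str) -> dict:
    """Gateway handler: validate, fire an event, and return immediately."""
    if not text.strip():
        return {"status": 400}
    bus.fire("message.created", {"author": author, "text": text})
    return {"status": 202}


response = post_message("alice", "Hello World")
processed = bus.drain()
```

The gateway stays simple because it knows nothing about notifications or search; adding a new side effect means subscribing another worker rather than editing the handler.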
It was recently bought by GitHub(1) and was discussed here(2).
The author has written on his blog about some decisions he got wrong. Super interesting post(3).
0. https://github.com/withspectrum/spectrum
1. https://spectrum.chat/spectrum/general/spectrum-is-joining-g...
2. https://news.ycombinator.com/item?id=18570598
3. https://mxstbr.com/thoughts/tech-choice-regrets-at-spectrum/
[+] [-] GoMonad|6 years ago|reply
[+] [-] Congeec|6 years ago|reply
https://github.com/tornadoweb/tornado/blob/master/tornado/io...
[+] [-] vecplane|6 years ago|reply
[+] [-] SkyMarshal|6 years ago|reply
https://norvig.com/sudoku.html
[+] [-] jamierumbelow|6 years ago|reply