Mine was an automated file transfer system that had to be 100% reliable on an insanely unreliable network (~95% uptime). Took about 9 months of bug squashing after development was done. So many edge cases. I would probably never mention this in a job interview because I doubt most people would understand why it was so hard.
Reminds me of a project from two decades ago which also had a somewhat tricky part.
We needed to do a nightly transfer of data. The amount varied, but was typically in the range of one to two TB. We had a 1 Gbit link between the data centres housing the two systems, but it wasn't an exclusive link: backups and other stuff would be running during the night as well, so if we hogged all the bandwidth we'd have to deal with unhappy people. The hard deadline for the transfer was the start of the work day.
Now, the data did compress easily, but it only became available for compression at the beginning of our sync window. We definitely needed to compress some of the data to sync everything in time and keep other users of the line happy. But if we spent too much time compressing we might not have enough time left to send the data. Plus we weren't alone on the systems: other people would be unhappy about their nightly jobs failing if we hogged all the available CPU time.
So we needed to find the right balance of data compression and bandwidth utilisation, taking into account all those factors, to make things work in the amount of time we had available.
Thanks to AMD nowadays we'd just throw more CPUs at the problem, but back then the 8 CPU server we were using was already quite expensive.
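To make the trade-off above concrete, here is a rough sketch of the arithmetic, not the system that was actually built. It assumes an idealised pipeline where compression and sending overlap perfectly, and every number and name in it (`plan_transfer`, the link share, the compressor throughput, the compression ratio) is a made-up assumption for illustration.

```python
def plan_transfer(data_bytes, link_bytes_per_s, compress_bytes_per_s, ratio):
    """Return (fraction_to_compress, finish_seconds) for an idealised pipeline.

    Assumptions (all hypothetical):
      - link_bytes_per_s is the share of the link we may use,
      - compress_bytes_per_s is compressor input throughput on our CPU share,
      - ratio is compressed size / original size (0 < ratio <= 1),
      - compressing and sending overlap perfectly.
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")

    # Finish time is max(compress time, send time). Compress time grows with
    # the compressed fraction f while send time shrinks, so the best f is
    # where the two are equal: f/C = (1 - f*(1 - ratio))/B.
    fraction = compress_bytes_per_s / (
        link_bytes_per_s + compress_bytes_per_s * (1 - ratio)
    )
    fraction = min(1.0, fraction)  # cannot compress more than everything

    compress_s = fraction * data_bytes / compress_bytes_per_s
    wire_bytes = data_bytes * (1 - fraction * (1 - ratio))
    send_s = wire_bytes / link_bytes_per_s
    return fraction, max(compress_s, send_s)


# Hypothetical night: 2 TB, half of a 1 Gbit link (62.5 MB/s),
# 100 MB/s of compressor throughput, data shrinking to 30% of its size.
fraction, seconds = plan_transfer(2e12, 62.5e6, 100e6, 0.3)
```

With these invented numbers the model suggests compressing roughly three quarters of the data and finishing in a bit over four hours, versus almost nine hours with no compression at all. The real problem was harder because neither the link share nor the CPU share was constant through the night.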
I once wrote an inliner. When you have not done it, it seems simple. When you are doing it, it is like trying to restrain a large rabid dog with a slippery leash.
Now, I am not a programmer by trade, but I have a hard time thinking anyone would find it nice to write an inliner. At least not if you want the inliner to always make things faster.
About the worst job on any enterprise software project is the PDF output. They always end up doing it for emails or something else, and it's a never-ending list of bugs. Text formatting is a never-ending list of problems since it's got a lot of vague inputs and a relatively strict output. Far too many little details go wrong.
With PDF, my best approach was to go very low level. I've used the PDFKit and PDFBox libraries, and both provide a way to output vector operations. That lets you implement extremely performant code. The resulting PDF is tiny and looks gorgeous (because it's vector). And you can implement anything. The code will be verbose, but it's worth it.
I even think that it's viable to output PDF without any libraries. I've investigated that format a bit and it doesn't seem too complicated, at least for relatively dumb output.
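For a sense of what "relatively dumb output" without a library can look like, here is a minimal sketch that writes a one-page PDF by hand. It is an illustration, not anyone's production code: the function name `build_pdf`, the page size, the font choice and the coordinates are all assumptions, and it only handles text that fits in Latin-1.

```python
def build_pdf(text: str) -> bytes:
    """Build a one-page PDF with a line of text and a stroked rule."""
    # Escape the characters that are special inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

    # Page content: show text with font F1 at 24pt, then draw a line
    # using the path operators m (move to), l (line to), S (stroke).
    content = (
        "BT /F1 24 Tf 72 720 Td ({}) Tj ET\n"
        "72 700 m 300 700 l S"
    ).format(escaped).encode("latin-1")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, for the xref
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: each entry must be exactly 20 bytes long.
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    out += b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % (
        len(objects) + 1,
        xref_pos,
    )
    return bytes(out)
```

Writing the returned bytes to a file opened in binary mode should give something a PDF viewer will open. The fiddly part is bookkeeping (byte offsets, stream lengths), which is manageable for simple pages and gets painful once fonts, compression and text layout enter the picture.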
We've spent twenty years working on HTML to PDF conversion and I expect we could easily spend another twenty years, so feel free to give Prince a try if you would rather avoid the headache :)
> That handful of code took me almost a year to write.
Formatting can be tough. See Knuth's extensive bug list for TeX from 1987 at https://yurichev.com/mirrors/knuth1989.pdf to see the kind of tarpit one can get trapped in.
In my brief experiments with Flutter, I must admit I didn't enjoy the experience of using the autoformatting. Not knocking the author of the tool at all, I can definitely see how absurdly hard it is to create something that does what it is trying to do. And I'm not against autoformatting in general either. I think gofmt works much better, and that's in large part because it tries to do less.
With Dart, I felt that very often, every time I saved a file after editing (which activated the formatter), the code would jump around a lot, sometimes even for very small edits. I actually found myself saving less often, as I found the sudden reorganizing kind of jarring and disorienting.
Most of this, I felt, came from the wish to make the formatter keep lines inside a given width. While that's a goal I appreciate, I've come to think it's one of those things that's better done manually. The same goes for the use of whitespace in general, other than trivial stuff like consistent indentation. There's actual, important meaning that can be conveyed in how you arrange things, which I think is more important than having it always be exactly mathematically consistent.
It's one of the reasons I still prefer ESLint over Prettier in JS land, also, even for stylistic rules. The fact that Prettier always rewrites the entire file from scratch with the parsed AST often ends up mangling some deliberate way that I'd arrange the code.
One of the lessons I took from the formatters (Python, Go, Rust) is that enforcing the same style ends all of the drama - indeed, all of the thinking - about how to format code. I like that.
I run my formatters manually, so I can’t comment on the jumps in code. That does seem jarring.
> Note that “best” is a property of the entire statement being formatted. A line break changes the indentation of the remainder of the statement, which in turn affects which other line breaks are needed.
I really don't like formatters where changing one small part of a large expression results in the entire expression being formatted very differently. It's simply not version control friendly, especially if the language encourages large statements like the pro example. I would rather accept a little bit of code ugliness in this case. Sure, this then means that the way the code is formatted is path dependent (it depends on the history of the code), but I think it's a reasonable compromise.
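A toy illustration of that effect, assuming the simplest possible width rule (keep a call on one line if it fits, otherwise put every argument on its own line). This is not how any real formatter works internally; `format_call` and the example call are hypothetical.

```python
def format_call(name, args, width, indent="    "):
    """Format a call on one line if it fits in `width`, else one arg per line."""
    one_line = f"{name}({', '.join(args)})"
    if len(one_line) <= width:
        return one_line
    lines = [f"{name}("]
    lines += [f"{indent}{arg}," for arg in args]
    lines.append(")")
    return "\n".join(lines)


WIDTH = 40  # assumed line limit

# 36 characters: fits, so it stays on a single line.
before = format_call("send", ["host", "port", "payload", "retries=3"], WIDTH)

# Renaming one argument pushes the call to 41 characters, so the whole
# statement is re-laid-out and every line of it shows up in the diff.
after = format_call("send", ["host", "port", "payload", "max_retries=10"], WIDTH)
```

Here a one-token edit turns a one-line statement into six lines. A path-dependent formatter would instead leave the existing layout alone unless a line actually had to change.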
For some time I tried to write a formatting tool for my programming language. After getting the first results I gave up. I found that writing a formatter is a surprisingly hard task. Operating at the token level can't give good enough results, so a proper parser is necessary. Reusing the existing parser isn't possible, since it ignores whitespace and comments. The need to preserve user-defined formatting in ambiguous cases creates even more problems.
So I decided there was no reason to try solving all these problems, since it requires too much of a time investment. But without solving all of them, a semi-working formatter isn't good enough to be useful and not annoying.
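The parser-versus-tokens problem is easy to see in Python, used here purely as a stand-in since the commenter's own language isn't shown. The standard `ast` module throws comments and original spacing away, while `tokenize` keeps comments but gives no tree structure. The helper names below are made up for the sketch.

```python
import ast
import io
import tokenize


def roundtrip_via_ast(source: str) -> str:
    """Parse to an AST and print it back (needs Python 3.9+ for ast.unparse)."""
    return ast.unparse(ast.parse(source))


def comments_via_tokens(source: str) -> list:
    """Collect comment tokens, which the tokenizer preserves."""
    reader = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(reader)
        if tok.type == tokenize.COMMENT
    ]


source = "x = (1 +   2)  # keep me\n"

normalised = roundtrip_via_ast(source)   # comment is gone after the round trip
comments = comments_via_tokens(source)   # the tokenizer still sees the comment
```

A usable formatter needs both views at once: tree structure to decide where breaks may go, plus comments and blank lines attached to the right nodes. That usually means a dedicated parser or a concrete syntax tree rather than the compiler's existing AST.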
It's funny when you have programmed long enough to see the same language/app names recycled. Two examples that come to mind: Dart, which was an RTL validation scripting language in the 80's/90's used by CPU designers, and Elm, which was a mail program on Unix/AIX/SunOS in the 80's/90's. Even weirder, when googling for the "old" Dart, it took RTL to mean right-to-left, not register transfer level.
Most of the hardness of this program comes from a user-adjustable line length limit. The author on multiple occasions used a "blog-friendly 40-char line limit" which is of course insanely hard. The program can be made much much simpler if the line length is unlimited. And if that's not an option, the program can be made somewhat simpler by only allowing line limits that are reasonably large, such as 100 characters.
I once wrote a formatter for Power Query that's still in use today. It's a much simpler language and I took a simpler approach. It was a really fun problem to solve.
One can talk about the technical side of writing a code formatter, but what about the ethical side? Automatically formatted code looks kind of okay but never great. Uniformity for the sake of uniformity. It is not very humanistic.
There are 3 major problems with automated code formatters:
* Handling of multiple newlines to break sections, or none to group related functions (e.g. a getter with a setter). Sometimes it's even best to move functions around for better grouping.
* They don't factor out an expression into a separate variable.
* They destroy `git blame`. This one is avoided if the tooling has always enforced formatting.
Interesting comment. I always saw the formatting aspect as the sort of drudgery the computer could do for me. (As a bonus, it will always do it completely consistently.)
yeah, I've come to terms with the fact that I mostly do programming-as-an-art, and that includes how my code is structured, so I'm on exactly the same page.
In pragmatic business environments it's not worth the fuss but I never feel great about anything I make in those kinds of environments anyways, and I always appreciate being able to shine when there's no enforced code formatting.
"Ethics" is overdramatizing it. The goal of a code formatter is not greatness, but adequacy, in a context where the code is a means to an end. They're particularly used in contexts where you may be sharing the project with people who don't care about formatting at all. Forcing me to work in or clean up the messes of my lazy co-workers is also, I would suggest, not very humanistic.
I agree, actually. Humans are visual creatures and they take cues from the visual design of program code: alignment, grouping, density. All these things are used to signify meaning in product design, including digital design. But we are not allowed to "design" source code.
semantic code style varies plenty between people; i can often tell who wrote which code, among the members of my research group, even though we format our code. i'd prefer to be recognized by my preference for short lambdas (vs. defs) and partial functions, than my preference to put if statements on one line.
georgeburdell|3 months ago
finaard|3 months ago
bjoli|3 months ago
throwaway2037|3 months ago
Did you ever blog about this program? It sounds very interesting, and there is no job interview on HN!
greazy|3 months ago
PaulKeeble|4 months ago
vbezhenar|4 months ago
mikeday|4 months ago
kbbgl87|4 months ago
huflungdung|4 months ago
[deleted]
sema4hacker|4 months ago
rafabulsing|4 months ago
fn-mote|3 months ago
kccqzy|3 months ago
Panzerschrek|3 months ago
skopje|3 months ago
kccqzy|3 months ago
b4ckup|4 months ago
juangacovas|3 months ago
mmaunder|4 months ago
[deleted]
jibal|4 months ago
cjfd|4 months ago
chaps|4 months ago
o11c|4 months ago
tom_|4 months ago
mjsir911|4 months ago
nkrisc|4 months ago
andrewflnr|4 months ago
Feel free to not use one in your art projects.
jbreckmckye|4 months ago
mackeye|4 months ago
dredmorbius|4 months ago
Both are branches of philosophy, but rather distinct from one another.
bartekpacia|4 months ago
imho uniformity of what the code looks like > some single person's opinion
it's so satisfying to me when I just run "gofmt" and know the thing is formatted well.
wiseowise|4 months ago
jibal|4 months ago
The uniformity is not merely for the sake of uniformity.