top | item 36801501


TheCapn | 2 years ago

Huh. Finally a name for it.

I do a lot of support work for Control Systems. It isn't unheard of to find a chunk of PLC code that treats some sort of physical equipment in a unique way that unintentionally creates problems. I like to parrot a line I heard elsewhere: "Every time Software is used to fix a [Electrical/Mechanical] problem, a Gremlin is born".

But often enough when I find a root cause of a bug, or some sort of programmed limitation, the client wants it removed. I always refuse until I can find out why that code exists. Nobody puts code in there for no reason, so I need to know why we have a timer, or an override, in the first place. Often the answer is that the problem it was solving no longer exists, and that's excellent. But in all the cases where that code was put there to prevent something from happening and the client has had a bunch of staff turnover, the original purpose is lost. Without documentation telling me why it was done that way, I'm very reluctant to immediately undo someone else's work.

I suppose the other aspect is knowing that I trust my coworkers. They don't (typically) do something for no good reason. If it is in there, it is there for a purpose and I must trust my coworkers to have done their due diligence in the first place. If that trust breaks down then everything becomes more difficult to make decisions on.


bamfly|2 years ago

This is why I comment a "why" for any line of code that's not incredibly obvious. And 100% of the time when it's due to interaction with something outside the codebase, whether that's an OS, filesystem, database, HTTP endpoint, hardware, whatever, if it's not some straightforward call to some API or library.

Sleep due to rate limiting from another service? COMMENT. Who's requiring it, the limits if I know exactly what they are at the time (and, if not, a note that I don't and that this is just an educated guess that seems to work), what the system behavior might look like if it's exceeded (if I know). Using a database for something trivial the filesystem could plausibly do, but in fact cannot for very good reasons (say, your only ergonomic way to access the FS the way that you need to, in that environment, results in resource exhaustion via runaway syscalls under load)? Comment. Workaround for a bug in some widely-used library that Ubuntu inexplicably refuses to fix in their LTS release? Comment. That kind of thing.

I have written so very many "yes, I know this sucks, but here's why..." comments.
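A minimal sketch of what such a rate-limit comment might look like in Python. Everything specific here is an assumption made up for illustration: the upstream service, the 0.5 second figure, and the names `fetch_all`, `fetch_page` and `MIN_SECONDS_BETWEEN_CALLS`.

```python
import time

# WHY: the (hypothetical) upstream service rate-limits callers.
# The exact limit is not documented; one call every 0.5 s is an educated
# guess that seems to work. If it is exceeded, expect the service to start
# rejecting requests (e.g. HTTP 429), though that is also an assumption.
MIN_SECONDS_BETWEEN_CALLS = 0.5


def fetch_all(page_ids, fetch_page, sleep=time.sleep):
    """Fetch pages one at a time, pausing between calls to stay under the limit."""
    results = []
    for index, page_id in enumerate(page_ids):
        if index > 0:
            # Deliberate pause, not leftover debugging: see the WHY comment above.
            sleep(MIN_SECONDS_BETWEEN_CALLS)
        results.append(fetch_page(page_id))
    return results
```

Passing `sleep` in as a parameter is only there so the pause can be checked in a test without actually waiting.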

I also do it when I write code that I know won't do well at larger scale, but can't be bothered to make it more scalable just then, and it doesn't need to be under current expectations (which, 99% of the time, ends up being fine indefinitely). But that may be more about protecting my ego. :-) "Yes, I know this is reading the whole file into memory, but since this is just a batch-job program with an infrequent and predictable invocation and this file is expected to be smallish... whatever. If you're running out of memory, maybe start debugging here. If you're turning this into something invoked on-demand, maybe rewrite this." At least they know I knew it'd break, LOL.

MrBuddyCasino|2 years ago

You're doing the Lord's work. I often get pushback on doing this with some variation of "comments bad, code should be self-documenting". This is unwise, because there are "what code does" and "why code does" comments, but this turns out to be too nuanced to battle the meme.

gav|2 years ago

I like to make sure the "why" is documented, but it's hard to get people to care about that.

I remember a former client tracking me down to ask about a bug that they had struggled to fix for months. There was a comment that I'd left 10 years earlier saying that while the logic was confusing, there was a good reason it was like that. Another developer had come along and commented out the line of code, leaving a comment saying that it was confusing!

overnight5349|2 years ago

Hah, I did one of these just last week. There's some sort of silicon bug or incorrect documentation that causes this lithium battery charger to read the charge current at half of what it should be. This could cause the battery to literally explode, so I left a big comment with lots of warnings explaining why I'm sending "incorrect" config values to the charge controller.

It's absolutely imperative that the next guy knows what the fuck I'm doing by tampering with safety limits.

jrs235|2 years ago

I like to add the date the why comment was added and the date the comment was last reviewed/verified as still true/necessary (which will rarely differ because they are seldom reviewed/re-verified).

COMMENT WRITTEN: 2023-03-21

COMMENT LAST REVIEWED/VERIFIED AS STILL TRUE: 2023-05-04

WHY THIS CODE: This sucks but ...
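If the review date follows a fixed format like the one above, a small script could flag stamps that have not been re-verified in a while. This is only a sketch: the function name `stale_comments` and the one-year threshold are assumptions, not part of the convention described above.

```python
import re
from datetime import date

# Matches the review stamp format shown above and captures the ISO date.
REVIEWED = re.compile(
    r"COMMENT LAST REVIEWED/VERIFIED AS STILL TRUE:\s*(\d{4}-\d{2}-\d{2})"
)


def stale_comments(source, today, max_age_days=365):
    """Return (line_number, review_date) for stamps older than max_age_days."""
    stale = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = REVIEWED.search(line)
        if match:
            reviewed = date.fromisoformat(match.group(1))
            if (today - reviewed).days > max_age_days:
                stale.append((number, reviewed))
    return stale
```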

csours|2 years ago

There was some inscrutable code that had sleep 10 [seconds] in the middle.

It took several hours to figure it out, but the sleep was there in case a file had not finished downloading.
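One way to make that kind of intent visible is to replace the bare sleep with a named, commented wait. The sketch below is not what the original code did: it swaps the fixed sleep for polling the file size until it stops changing, and it assumes the downloader writes the file in place. The name `wait_until_stable` and all the default values are hypothetical.

```python
import os
import time


def wait_until_stable(path, checks=3, interval=1.0, timeout=60.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Wait until the file at `path` exists and its size has stopped changing.

    WHY: the downloader (assumed to write the file in place) may still be
    writing when we get here, so reading immediately could see a partial file.
    Returns True once the size is unchanged for `checks` consecutive polls,
    or False if `timeout` seconds pass first.
    """
    deadline = clock() + timeout
    last_size, unchanged = None, 0
    while clock() < deadline:
        size = os.path.getsize(path) if os.path.exists(path) else None
        if size is not None and size == last_size:
            unchanged += 1
            if unchanged >= checks:
                return True
        else:
            # File missing or still growing: start counting again.
            unchanged = 0
        last_size = size
        sleep(interval)
    return False
```

A stable size is only a heuristic for "finished downloading", so a comment explaining the reason is still needed either way.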

jasfi|2 years ago

I like to write frequent comments, so that I can just scan the comments (only) of a function to know how and why it does what it does.

mike_hock|2 years ago

The last part is also useful. Tells the next person where to look when they do run into scaling issues, and also tells them that there wasn't some other reason to do it.

Karellen|2 years ago

> But often enough when I find a root cause of a bug, or some sort of programmed limitation, the client wants removed. I always refuse until I can find out why that code exists. Nobody puts code in there for no reason, so I need to know why we have a timer, or an override in the first place.

Isn't that just the regular Chesterton's Fence argument though?

The one the article is specifically written to point out is not enough by itself, because you need to know what else has been built with the assumption that that code is there?

TheCapn|2 years ago

All my comment is adding a software anecdote to the story. It really is just regular Chesterton's Fence, a term I've never heard until now but dealt with for the last several years.

You're not wrong, but in the context of a PLC controlling a motor or gate it is far more segregated than the code you're probably thinking of. Having a timer override on a single gate's position limit sensor would have no effect on a separate sensor/gate/motor.

If the gate's function block had specific code built into it that affected all gates then what you're talking about would be more applicable.

letitbeirie|2 years ago

The most haunting comment line I've ever seen was buried deep in an Allen Bradley PLC:

> I don't know why this rung is needed but delete it and see what happens for yourself

Did not fuck around; did not find out.

mauvehaus|2 years ago

Context for those who haven't worked in the field: A PLC is a programmable logic controller. They are typically programmed with ladder logic which grew out of discrete relay based control systems.

Generally they're controlling industrial equipment of some sort, and making changes without a thorough understanding of what's happening now and how your change will affect the equipment and process is frowned upon.

https://en.wikipedia.org/wiki/Ladder_logic

sumtechguy|2 years ago

But now it's like the number of licks to the center of a Tootsie Pop: the world may never know!

weaksauce|2 years ago

do you remember what the instruction was?

ke88y|2 years ago

> I do a lot of support work for Control Systems. It isn't unheard to find a chunk of PLC code that treats some sort of physical equipment in a unique way that unintentionally creates problems. I like to parrot a line I heard elsewhere: "Every time Software to fix a [Electrical/Mechanical] problem, a Gremlin is born".

At least some of this is cultural. EEs and MEs have historically viewed software less seriously than electrical and mechanical systems. As a result, engineering cultures dominated by EEs/MEs tend to produce shit code. Inexcusably incompetent software engineering remains common among ostensibly Professional Engineers.

TheCapn|2 years ago

You're not wrong. It shows in the state of PLC/HMI Development tools. Even simple things like Revision Control are decades behind in some cases.

I've basically found my niche in the industry as a Software Engineer though I can't say I see myself staying in the industry much longer. The number of times I've gotten my hands on code published by my EE coworkers only to rewrite it to work 10x faster at half the size with fewer bugs? Yikes. HMI/PLC work is almost like working in C++ at times: there are so many potential pitfalls for people that don't really understand the entire system, but the mentality of EE/ME types in the industry is to treat the software as a second-class citizen.

Even the clients treat their IT/OT systems that way. A production mill has super strict service intervals with very defined procedures to make sure there is NO downtime to production. But get the very same management team to build a redundant SCADA server? Or even have them schedule regular reboots? God no.

liminalsunset|2 years ago

I think one of the reasons I've seen this happen is that EE and ME programs in university typically teach very little CS, just "enough to be dangerous", and the few coding projects you are required to do are often taught in a way that downplays the importance of the software. Software is often seen as simply a translation or manifestation of a classical mathematical model or control system (or even directly generated by Matlab/Simulink).

Software, being less familiar, is not viewed as a fundamental architectural component because there often isn't sufficient understanding of the structure or nuance involved in building it. In my experience, software or firmware engineers tend to be isolated from the people who designed the physical systems, and a lot of meaning is lost between the two teams because the software side does not understand the limitations and principles of the hardware and the hardware team does not understand the capabilities and limitations of the software.

hef19898|2 years ago

On the other hand there was Juicero...

causi|2 years ago

I'll never do PLC work again. Forget undocumented code, most of the time there are no schematics for the hardware you're working on because it was custom-built thirty years ago.

TheCapn|2 years ago

My company is generally good about that. We have lots of overlapping documentation that answers questions like that in different ways. From Electrical schemas to QA docs, picture archives of panels and wiring, ticketing systems, spreadsheets over I/O, etc. etc.

I hate PLC work for other reasons. I'm starting to look at going back to a more traditional software role. I'm a bit tired of the road work and find the compensation for the amount asked of you to be drastically underwhelming. This meme is very much relevant:

https://i.redd.it/rawo5uki1v9b1.jpg

smallpipe|2 years ago

> I suppose the other aspect is knowing that I trust my coworkers. They don't (typically) do something for no good reason

This so much. Depending on the git blame, I'll either remove it blindly or actually think about it way more.

glonq|2 years ago

> "Every time Software is used to fix a [Electrical/Mechanical] problem, a Gremlin is born"

Early in my career, I was confused by seemingly-crazy questions in the Hacker Test (https://www-users.york.ac.uk/~ss44/joke/hacker.htm) like...

> 0133 Ever fix a hardware problem in software?

> 0134 ... Vice versa?

But after spending years developing embedded systems, I don't even blink at such questions. Yes, of course I have committed such necessary evils!

hinkley|2 years ago

> Nobody puts code in there for no reason, so I need to know why we have a timer, or an override in the first place.

I would like to think that if I sent out an email about git hygiene that you would support me against the people who don’t understand why I get grumpy at them for commits that are fifty times as long as the commit message, and mix four concerns two of which aren’t mentioned at all.

Git history is useless until you need it, and then it’s priceless.

I can’t always tell what I meant by a block of code I wrote two years ago, let alone what you meant by one you wrote five years ago.

scubbo|2 years ago

> commits that are fifty times as long as the commit message

One of my proudest commits had a 1:30 commit:message length ratio. The change may have only been ~3 lines, but boy was there a lot of knowledge represented there!

kvmet|2 years ago

I work with control systems and have a similar mantra: "You can't overcome physics with software." It's super common to have someone ask if a mechanical/human/electrical/process issue can be fixed with software because people believe that programming time is free. Sometimes it's not even that it's impossible to do in software, but adding unnecessary complexity almost always backfires and you'll wind up fixing it the right way anyway in the end.

woleium|2 years ago

Chesterton's Fence is a principle that says change should not be made until the reasoning behind the current state of affairs is understood. It says the rash move, upon coming across a fence, would be to tear it down without understanding why it was put up.

throw9away6|2 years ago

Sometimes you have to remove the chunk and see what breaks, because the person that put it in is long gone and the documentation is missing.

cortesoft|2 years ago

This sounds more like the original Chesterton’s fence than what the article is describing. The article is about understanding something’s actual current purpose, rather than just the intended purpose.

What the article is describing reminds me of the XKCD comic "Workflow": https://xkcd.com/1172/

A system exists external to the creator's original purpose, and can take on purposes that were never intended but naturally evolve. It isn’t enough to say “well that is not in the spec”, because that doesn’t change reality.

dacox|2 years ago

The unspoken thing here is that PLC code often (usually?) isn't exactly written in text, or in a format readable by anything other than the PLC programming software.

After a year long foray into the world of PLC, I felt like I was programming in the dark ages.

I'm assuming it's a bit better at very big plants/operations, but still.

alex-robbins|2 years ago

> "Every time Software is used to fix a [Electrical/Mechanical] problem, a Gremlin is born"

I'm definitely going to use this, and I think there's a more general statement: "Every time software is used to fix a problem in a lower layer (which may also be software), a gremlin is born."

xen2xen1|2 years ago

cough 737 Max cough

peteradio|2 years ago

> "Every time Software to fix a [Electrical/Mechanical] problem, a Gremlin is born".

I think I get the gist, but that sentence is missing some words.

TheCapn|2 years ago

oooboy is it. That's what I get for not proofreading.

Izkata|2 years ago

> "Every time Software is used to fix a [Electrical/Mechanical] problem, a Gremlin is born"

Gizmo caca.

(...I just watched both Gremlins movies last weekend...)

HeyLaughingBoy|2 years ago

> "Every time Software is used to fix a [Electrical/Mechanical] problem, a Gremlin is born".

Just did one of those this morning. Hmmm.