top | item 40528380

(no title)

ebolyen | 1 year ago

I don't know, that sounds like a complex kind of ingest which could be arbitrarily subtle and diverge over time for legal and bureaucratic reasons.

I would kind of appreciate having two formats, since what are the odds they would change together? While there may never be a 3rd format, a DRY importer would imply that the source generating the data is also DRY.

discuss

order

Stratoscope|1 year ago

Good point. This may be a case where domain knowledge is helpful.

One of the reasons they brought me in on this project is that besides knowing how to wrangle data, I'm also an experienced pilot. So I had a good intuitive sense of the meaning and purpose of the data.

The part of the data that was identical is the description of the airspace boundaries. Pilots will recognize this as the famous "upside down wedding cake". But it's not just simple circles like a wedding cake. There are all kinds of cutouts and special cases.

Stuff like "From point A, draw an arc to point B with its center at point C. Then track the centerline of the San Seriffe River using the following list of points. Finally, from point D draw a straight line back to point A."

The FAA would be very reluctant to change this, for at least two reasons:

1. Who will provide us the budget to make these changes?

2. Who will take the heat when we break every client of this data?

ebolyen|1 year ago

I see, so it's a procedural language that is well understood by those who fly (not just some semi-structured data or ontology). This is a great example of the advantage of domain experience. Thanks for sharing!

hamasho|1 year ago

I think if you know the domain well, it's not "premature" at all.

TeMPOraL|1 year ago

In such case I think I'd go for an internal-DRYing + copy-on-write approach. That is, two identical classes or entry points, one for each format; internally, they'd share all the common code. Over time, if something changes in one format but not the other, that piece of code gets duplicated and then changed, so the other format retains the original code, which it now owns.

chii|1 year ago

I believe this very method is very common in games - you have similar logic for entities, but some have divergences that could occur in unknown ways after playtesting or future development.

Tho if done haphazardly by someone inexperienced, you might end up with subtle divergences that might look like they're meant to be copies, and debugging them in the future by another developer (without the history or knowledge) can get hard.

Then someone would wonder why there are these two very similar pieces of code, and mistakenly try to DRY it in the hopes of improving it, causing subtle mistakes to get introduced...

chipdart|1 year ago

> In such case I think I'd go for an internal-DRYing + copy-on-write approach.

I agree. The primary risk of presented by DRY is tight coupling code which only bears similarities at a surface level. Starting off by explicitly keeping the externa bits separate sounds like a good way to avoid the worst tradeoff.

Nevertheless I still prefer the Write Everything Twice (WET) principle, which means mostly the same thing, but following a clear guideline: postpone all de-duplication efforts until it's either obvious there's shared code (semantics and implementation) in >2 occurrences, and always start by treating separate cases as independent cases.

jeremyjh|1 year ago

This is really good advice and a great way to think about it.

ebolyen|1 year ago

I like that approach.

bcrosby95|1 year ago

I don't know. I've seen this approach for projects before go bad - people didn't want to DRY because they might diverge. Except they never did. Our 3rd+ scenarios we abstracted.

But what basically ended up happening was we had 2 codebases: 1 for that non-DRY version, and then 1 for everything else. The non-DRY version limped along and no one ever wanted to work on it. The ways it did things were never updated. It was rarely improved. It was kinda left to rot.

chipdart|1 year ago

> But what basically ended up happening was we had 2 codebases: 1 for that non-DRY version, and then 1 for everything else. The non-DRY version limped along and no one ever wanted to work on it. The ways it did things were never updated. It was rarely improved. It was kinda left to rot.

It sounds to me that you're trying to pin the blame of failing to maintain software on not following DRY, which makes no sense to me.

Advocating against mindlessly following DRY is not the same as advocating for not maintaining your software. Also, DRY does not magically earn you extra maintenance credits. In fact, it sounds to me that the bit of the code you called DRY ended up being easier to maintain because it wasn't forced to pile on abstractions needed to support the non-DRY code. If it was easy, you'd already have done it and you wouldn't be complaining about the special-purpose code you kept separated.

jononor|1 year ago

Why wasn't the original implementation swapped for the new one? The unwillingness/inability to do that seems to be most likely the core of the issues here?

kiitos|1 year ago

I have seen far more problems caused by premature or over-use of DRY than the opposite.

Undoing an abstraction is always far more difficult than the pains caused by not having one in the first place.

mannykannot|1 year ago

I vaguely recall Fred Brooks, in The Mythical Man-Month, using a somewhat similar situation, but involving the various US states' income tax rules and their relationship to the federal tax, as an example in order to make some sort of point (that point being 'know your data', IIRC.)

In a situation where there is a base model with specific modifications - which is, I feel, how airspace regulation mostly works - then I suspect that a DRY approach would make it easier to inspect and test, so long as it stays that way.