
The code worked differently when the moon was full

447 points | shanselman | 4 years ago | hanselman.com | reply

162 comments

[+] vidanay|4 years ago|reply
We once had a customer who would call us in a panic a couple of times a year saying our inspection equipment was experiencing unusually high false rejects and they were generating a very high scrap rate. By the time we got a technician on site the next day, everything was working flawlessly and the customer couldn't reproduce the problem either. This went on for almost three years with various levels of escalation to the current management. Finally, one day a technician was on site for another project when the customer came up to him and said "It's happening right now! Come fix it!" The technician rushed over to the equipment and discovered that the sun was shining at exactly the right angle to cause a lens flare in one of our cameras. This happened twice a year as the sun moved along its trajectory. A strategically placed piece of opaque plastic fixed it permanently.
[+] mark-r|4 years ago|reply
I knew a guy who was a computer tech early in his career. One of his rounds was on a military base, and they had just moved their computer room up one floor. They started having problems with their tape drive, it would just randomly pop up an error while they were using it. He tried mightily to diagnose the problem but couldn't figure it out. Finally he took a break and walked to the nearest window and looked out. He saw a radar antenna making a sweep - and realized the error came when the antenna was pointed in their direction.
[+] steve_taylor|4 years ago|reply
Reminds me of the sun outage that affects Indian stock exchanges (BSE and NSE) at certain times of the year. VSATs used by traders experience loss of connectivity to geostationary satellites while they transit the sun.

https://en.wikipedia.org/wiki/Sun_outage

[+] DavidPeiffer|4 years ago|reply
I had a similar issue in my industrial automation class. We were sorting cylinders by diameter and height as they went down the conveyor belt. PLC controlling motors, sensors, etc.

My group got everything set up, built our program, and everything worked fine. Waited a few minutes for the TA to verify, but it failed. We changed a few things, it worked, but failed when he came over.

Another group looked over our code, no issues noticed.

Finally I realized I was standing when we were testing things. I sat down waiting for the TA to verify. My shadow blocked the sun from the photo eye. Wasted half the lab on an issue that was entirely dependent on our position in the room, but found the root cause.

[+] NovemberWhiskey|4 years ago|reply
You get the same problem with geostationary satellites, e.g. for satellite TV.

There's a period of about 10 days every spring and fall where, for up to 30 minutes every day, the sun transits 'behind' a satellite within the beamwidth of the dish and totally overwhelms the signal at the LNB.

[+] xsmasher|4 years ago|reply
This is like a real-world equivalent of the "cleaning lady unplugged the machine" urban legend.
[+] tabtab|4 years ago|reply
Our garage door opener has this problem. At very specific times of the year the sun confuses the infrared blockage sensor. The cause occurred to me when I lined up my eye to see what the sensor was seeing when it was failing, and I noticed the morning sun right next to the other end of the sensor stream. I moved a trash can to shade the sensor and it worked fine.
[+] hypertele-Xii|4 years ago|reply
So much time and money wasted by not recording the camera. Could've simply reviewed the footage from the right timestamp and immediately discovered what was wrong. All you had to do was take a still picture every time the system makes a rejection.
[+] actually_a_dog|4 years ago|reply
> The technician rushed over to the equipment and discovered that the sun was shining at exactly the right angle to cause a lens flare in one of our cameras.

Somewhat infamously, "a rare alignment of sunlight on high-altitude clouds above North Dakota and the Molniya orbits of the satellites" the Soviets used for their nuclear attack early warning system triggered a false alarm, which, had it been treated as a real situation, could have led to nuclear war in the early 80s.

https://en.wikipedia.org/wiki/1983_Soviet_nuclear_false_alar...

[+] karlkloss|4 years ago|reply
This is something that I've also heard about hot axle box detectors for trains. Their solution: plant a tree.
[+] vincnetas|4 years ago|reply
Preemptively storing camera images of rejected samples would have saved everyone more time, as it would have been enough to review the images of failed samples and notice a flare. Of course, only if that was an option.
[+] LegitShady|4 years ago|reply
like stonehenge, but with industrial sensors
[+] HelixEndeavor|4 years ago|reply
Indiana Jones discovering the location of the Ark.
[+] mhandley|4 years ago|reply
The phase of the moon really can affect performance. A friend of mine worked on wireless links in Scotland and was struggling with loss at certain times of day, but not exactly the same time every day. When they graphed loss against time, the pattern was really periodic over many days. The periodicity turned out to be 12 hours 25 minutes, which they eventually realized is exactly the time between low tides. The problem was at low tide the reflected path off the water interfered with the line-of-sight path causing signal fading, whereas at high tide it interfered much less. In particular, see figure 2 of their paper for the correlation between tide height and SNR: https://homepages.inf.ed.ac.uk/mmarina/papers/EDI-INF-RR-136... As tide height really does depend on the phase of the moon, presumably their loss did too, if they measured for long enough.
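
As a rough check on the 12 hours 25 minutes figure: the moon returns to the same place in the sky about once per lunar day, which is a bit longer than 24 hours, and there are two tidal cycles per lunar day. A minimal Python sketch of that arithmetic, assuming a mean synodic month of 29.530589 days (the constant and function names here are illustrative, not taken from the paper):

```python
# Assumed constant: mean synodic month (new moon to new moon), in solar days.
SYNODIC_MONTH_DAYS = 29.530589
SOLAR_DAY_HOURS = 24.0


def lunar_day_hours():
    """Mean time between successive passes of the moon overhead.

    Relative to the sun, the moon falls behind by one full lap per synodic
    month, so it completes (1 - 1/29.53) apparent laps per solar day.
    """
    return SOLAR_DAY_HOURS / (1.0 - 1.0 / SYNODIC_MONTH_DAYS)


def semidiurnal_tide_period():
    """Half a lunar day, returned as (whole hours, remaining minutes)."""
    hours = lunar_day_hours() / 2.0
    whole_hours = int(hours)
    minutes = (hours - whole_hours) * 60.0
    return whole_hours, minutes


if __name__ == "__main__":
    h, m = semidiurnal_tide_period()
    print(f"about {h} h {m:.1f} min between successive low tides")
```

This is only the mean period; real tide times at a given site shift with local geography and the sun's contribution.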
[+] m4rtink|4 years ago|reply
I heard a story about an astronomer losing the chance to be the first to report a comet one cold winter night - just as he wanted to send the email to report it, the Internet connection was dead! He ran from the observatory to the nearest place with Internet connectivity, but by the time he sent the email from there, there was already a report from another astronomer elsewhere, sent a few minutes earlier.

Reason for the mysterious network outage? Thermal contraction! The observatory was connected to the Internet via an optical link to a highrise building in the city that contracted ever so slightly due to the very low temperature, moving the laser beam of the optical link out of alignment, shutting down the connection.

[+] ColinWright|4 years ago|reply
For those who are interested, this is why you usually have two dishes/aerials, vertically displaced, so that when one has destructive interference between the direct and reflected signals, the other has constructive interference. I learned something about this when writing data compression and encryption software for radar surveillance systems, where there were multiple radars over a moderate coverage area, all sending data via microwave links over water back to the Command and Control Centre.
[+] eitland|4 years ago|reply
> When they graphed loss against time, the pattern was really periodic over many days. The periodicity turned out to be 12 hours 25 minutes, which they eventually realized is exactly the time between low tides.

I've set up a couple of monitoring systems at a couple of different companies and one thing I've heard some people saying is that they don't care about "fancy graphs", they just want a dashboard of what is red and what is green.

This might be a manager vs engineer perspective, because for me the graphs are the main point: it allows me to spot

- patterns (each night, each weekend, some weekends, more-or-less-randomly-except)

- and also trends: at this speed we are going to reach 80% utilization before November.
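
The trend case above can be sketched in a few lines of Python: fit a straight line to daily utilization samples and extrapolate to the threshold. This is a minimal illustration, assuming roughly linear growth; the sample data, start date, and function names are made up for the example.

```python
import math
from datetime import date, timedelta


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def date_reaching(threshold, start, daily_samples):
    """Date on which the fitted trend first reaches `threshold`.

    `daily_samples[i]` is the utilization (percent) measured i days after
    `start`. Returns None when the trend is flat or falling.
    """
    slope, intercept = fit_line(list(range(len(daily_samples))), daily_samples)
    if slope <= 0:
        return None
    days = (threshold - intercept) / slope
    return start + timedelta(days=math.ceil(days))


if __name__ == "__main__":
    # Hypothetical data: 50% on 1 July, growing 0.25 points per day.
    samples = [50 + 0.25 * day for day in range(30)]
    print(date_reaching(80, date(2021, 7, 1), samples))
```

A red/green dashboard would stay green for months on this data; the fitted line is what says "before November".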

[+] abkfenris|4 years ago|reply
I remember reading that paper when I was trying to figure out why we were having issues with a wireless link down in the Patagonia fjords.

Unfortunately we didn't have the hardware or enough control over the link (it took negotiating access with armed forces to work on either end) to try to implement any of their ideas.

[+] pedrocr|4 years ago|reply
Cool result. That figure 2 is begging for a scatter plot of SNR and tide level to see how well correlated they are.
[+] fragmede|4 years ago|reply
The moon is a nighttime light source, and a pretty good one at that, every 30 days or so. Even after the invention of the light bulb it continues to light up the night. Thus it's not astrology to suggest that the phase of the Moon could affect things on Earth seeing as how it's what causes tides. (It is astrology to suggest the Moon is causing an effect based on magic though).
[+] walrus01|4 years ago|reply
point to point microwave link sounds like the bottom part of the fresnel zone was scraping the water - bad engineering design from the outset, tides or no tides. Not a good idea to do unless you have absolutely no economical way of getting one or both ends of the link higher.
[+] acdha|4 years ago|reply
I heard a great story a while back for a digitization project where historic content was being provided by many libraries around the world, including one in Russia.

The quality of the scanned books was excellent, except for a weird distortion every so often where part of the page would be shifted partway through as if someone had shifted half the page in Photoshop. This was only noticed in books over a certain size so people were checking to see if there was some kind of mechanical problem with the scanner (these were robots with automatic page turners so it was plausible that there could be something which was only an issue past a certain position), trying to figure out if there was some way that the software had some kind of memory leak or other issue which would explain the long and inconsistent intervals.

Eventually they were on a long-distance phone call to Moscow and not turning up anything when there was a loud rumble in the background. “What was that?” led to the realization that the library's scan center was close to a subway tunnel. The vibration of a passing train was enough to cause a glitch but only if you happened to be scanning at the exact time it went by: the reason longer books were noticed was simply because having more pages meant that at any point in time a long book was more likely to be sitting in the scanner and the technicians running the scanner were apparently tuning out the trains as background noise. This was reportedly the first project they'd done with one of the scan robots which can process an entire book unattended so it was plausible that smaller past projects simply hadn't been scanning frequently enough to hit this problem or that some previous technician had noticed and immediately redone the page.

[+] asdff|4 years ago|reply
This is why you see sensitive imaging equipment in labs on air tables. Sometimes the entire room is an air table.
[+] ethbr0|4 years ago|reply
I deal with this all the time at work. People are capable of tuning out frighteningly obvious things, if they happen with enough regularity for long enough.
[+] btilly|4 years ago|reply
I thought that this was going to be different story.

There was a program I heard about back in the 90s which would literally crash depending on the phase of the moon!

The story is that it wanted to print a date. The programmer happened to have an astronomy library available that gave a string containing the date. So the programmer called that, and then parsed out the date.

Unfortunately the astronomy library wrote its result as a string through a pointer. The result included the phase of the Moon. The buffer behind that pointer was not declared to be long enough, and therefore the program would crash if the name of the phase of the moon was too long!
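
To make the mechanism concrete, here is a small Python simulation of a fixed-size C-style buffer. The story does not give the buffer size, the string format, or the library, so all of those are assumptions chosen for illustration; only the idea (the string length varies with the phase name) comes from the comment above.

```python
# All values below are hypothetical; the original story gives no specifics.
BUFFER_SIZE = 32  # assumed size of the caller's char buffer, in bytes

PHASES = [
    "New Moon", "Waxing Crescent", "First Quarter", "Waxing Gibbous",
    "Full Moon", "Waning Gibbous", "Last Quarter", "Waning Crescent",
]


def format_report(date_text, phase):
    """Assumed output format of the astronomy routine: date plus phase."""
    return f"{date_text} ({phase})"


def overflows(text, buffer_size=BUFFER_SIZE):
    """True if `text` plus a terminating NUL byte does not fit the buffer."""
    return len(text.encode("ascii")) + 1 > buffer_size


if __name__ == "__main__":
    for phase in PHASES:
        report = format_report("1995-03-17 12:00", phase)
        status = "OVERFLOW" if overflows(report) else "ok"
        print(f"{len(report):2d} bytes  {status:8s}  {report}")
```

With these assumed numbers the short names ("New Moon", "Full Moon", "Last Quarter") fit, while the longer ones write past the end of the buffer, which in C is undefined behavior and can show up as an intermittent crash.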

[+] vanviegen|4 years ago|reply
This reminds me of something I lived through as a nerdy teenager working a summer job as first line IT support at the headquarters of a multinational, in the mid-90s...

One day, I started receiving calls (through my pager!) from rather many people about intermittent networking problems. The state of the art 10 Mbit wired UTP network would have frequent bursts of 90% packet loss.

What was weird: only people on the fifth floor would have this issue..!? Our first thought was that they were on a single hub/switch that might have broken. But no, they were connected to the same uplinks as the computers on the problem-free surrounding floors. Furthermore, laptop users (who were of course also wired at the time) were reporting no problems whatsoever.

We were pretty much out of ideas by that point, but did an experiment just to test our assumptions: we took a PC and hooked it up with a long network cable and a power extension cable on the fourth floor and started pinging it. Flawless. Then we started walking up the stairs, and, yes indeed, somewhere around halfway up the stairs packets started to drop. (But not at all times, sometimes it would be fine, like all PCs on the fifth.)

If you want to guess at the cause, this is your chance. :-)

We brought in a company specialized in EM interference. It turns out that a GSM antenna placed on the roof of the four-story building opposite ours about half a year earlier had just been turned on. Its height aligned with our fifth floor. Whenever someone was using this mast to make a call (which certainly wasn't all of the time back then), it would cause interference on a specific model of network card that we were using in all of our PCs. It had a relatively large metal component that was apparently a pretty good 900 MHz antenna.

When confronted, the mobile operator quickly adjusted the antenna to not be directed at us. I believe all network cards were replaced soon after. Fun times!

[+] woliveirajr|4 years ago|reply
> Not strictly the cycle of the moon but close.

Meh. Just the old 49.7 days cycle that it takes to overflow 32 bits when measuring milliseconds.
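
For anyone who wants to check the 49.7 figure, the arithmetic is short. The Python sketch below also shows one common way to compute elapsed time that survives a single wraparound; the function names are illustrative and not from the article's code.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000   # 86,400,000 ms
WRAP = 2 ** 32                     # an unsigned 32-bit counter wraps here


def rollover_days():
    """Days until a 32-bit millisecond counter wraps back to zero."""
    return WRAP / MS_PER_DAY


def elapsed_ms(start, now):
    """Elapsed ticks, still correct if the counter wrapped once in between.

    Python integers do not overflow, so the modulo reproduces what
    unsigned 32-bit subtraction does in C.
    """
    return (now - start) % WRAP


if __name__ == "__main__":
    print(f"{rollover_days():.2f} days")
    start, now = WRAP - 100, 50          # counter wrapped between readings
    print(now - start)                   # naive difference: large negative
    print(elapsed_ms(start, now))        # wrap-safe difference
```

The wrap-safe form only works for intervals shorter than one full cycle, so anything that must measure more than about 49.7 days needs a wider counter.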

I was hoping for a "it works when I buy vanilla icecream and doesn't when I buy other flavour".

[+] specialist|4 years ago|reply
Yes and:

> Just the old 49.7 days cycle...

I've encountered datetime bugs and learned to take preventative measures.

I generally add a virtual clock shim to my projects, eg wrapping System.currentTimeMillis() or equiv.

Then I write unit tests for anticipated edge cases. Like midnight, end/start of year, etc. To ensure reporting, rollups, logging, grooming, etc. are working correctly.

It also allows me to simulate elapsed time, so I can verify out-of-order event processing and so forth.
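
A minimal sketch of such a clock shim in Python, assuming the code under test takes the clock as a parameter. The class and function names (`SystemClock`, `FakeClock`, `report_day`) are hypothetical, not from any particular library.

```python
from datetime import datetime, timedelta, timezone


class SystemClock:
    """Production clock: thin wrapper over the real time source."""

    def now(self):
        return datetime.now(timezone.utc)


class FakeClock:
    """Test clock: time only moves when the test says so."""

    def __init__(self, start):
        self._now = start

    def now(self):
        return self._now

    def advance(self, delta):
        self._now += delta


def report_day(clock):
    """Hypothetical rollup key: the UTC day a report generated now belongs to."""
    return clock.now().date().isoformat()


if __name__ == "__main__":
    # Edge case: one second before the year rolls over.
    clock = FakeClock(datetime(2021, 12, 31, 23, 59, 59, tzinfo=timezone.utc))
    print(report_day(clock))
    clock.advance(timedelta(seconds=1))
    print(report_day(clock))
    print(report_day(SystemClock()))
```

Because nothing calls the system time directly, a test can park the clock just before midnight or a year boundary and step across it deterministically.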

[+] ciaron|4 years ago|reply
I used to work in aerospace. One of my projects involved running avionics bench tests at a customer facility, basically the avionic subsystem of the aircraft in a big room on shelves. We were using a laptop for data logging and started getting dropouts in the data every 5 minutes. This was worrying because a) this hadn't happened at our site on similar equipment and b) this was a final customer-facing check before doing a real test flight.

We spent about a week trying to debug the system and the software and at a certain point while I was just sitting and thinking about what to do next, Flying Toasters popped up in the data logging PC (the lid was normally closed because of the space on the bench).

The Windows screensaver was hogging so much CPU that the datalogger couldn't keep up.

[+] drdeadringer|4 years ago|reply
Reminds me of a story where a company's internet would regularly drop at the same time every day -- let's say 3pm.

Nobody could figure it out so they called in an expert.

After lots of attempts and figuring, one day the person in question happens to look out the window at the time in question ... and sees a service truck park exactly in line-of-sight between the business and their internet-signal pickup broadcast point.

Ah ha!

[+] macintux|4 years ago|reply
> Bugs based on a time calculations can often show themselves later when view through a longer lens and scope of time...sometimes WAY longer than you'd expect.

When I worked for BBN in '97-'98, someone from outside the company as I recall came to talk to a room of engineers about the wide variety of calendar-related behaviors in various UNIX systems that were expected to cause problems for Y2K.

It was a very, very long list, often subtle issues, and I recall the concern in the room about the number of old systems in use by the DoD and others.

Anyway, no real point to this other than date handling is one of the hardest things to get right in computing, ranking right behind testing for the correct behavior.

[+] bentcorner|4 years ago|reply
> What an interesting and insidious bug! Bugs based on a time calculations can often show themselves later when view through a longer lens and scope of time...sometimes WAY longer than you'd expect.

My personal anecdote. I like playing online games, and as you know latency is the killer. I enjoyed playing in the evenings after work, and inexplicably I started noticing my latency spike from around 50ms to > 1s. Extremely frustrating.

I had no idea what caused this so I set up a simple ping command and had it save it to a graph.

Well, the next day I noticed the pings were steady throughout the whole day, then in the evenings I'd get these chunks of bad time. It turns out when my wife would watch Netflix in the other room (and it was only Netflix), it'd cause something to go awry with the router and latency would spike for me. (The really weird thing was that it was a combination of a Roku, Netflix, and a wired switch - change any of those and the problem went away).

Later during the pandemic, I also diagnosed drop-outs on my network due to kids in my neighborhood being online during school hours. Like clockwork I'd get a bad network from around 10am, and it'd be fine again by around 3 or 4. On school holidays and weekends my network was fine.

[+] robocat|4 years ago|reply
In the analogue days, before pixels existed, a customer had trouble with their phone line not working when the moon was full.

The problem was that they lived on the coast, and a subsurface junction box would get wet during king tides, causing the telephone line to fail.

[+] milliondollar|4 years ago|reply
F'n A. Reading the comments on this thread makes me love humanity. So much ingenuity, raw engineering horsepower, creativity. Goddamn, you are great people and you should be proud of yourselves. Reading this makes me believe we will survive as a species.
[+] avs733|4 years ago|reply
you'll appreciate this:

I worked with a factory that spent several years tracking down a quality problem. The eventual cause was wind direction: it happened whenever they were downwind of the local cattle stockyard on a hot day.

[+] _kst_|4 years ago|reply
At a previous job, we used a bug tracking tool called "Remedy". On September 8, 2001, it started reporting dates incorrectly; its idea of the current date jumped back to 1973 and started advancing at 1/10 the normal rate.

It used Unix timestamps (seconds since 1970) and assumed they could only be 9 decimal digits. When the time reached 10 digits, the last digit was quietly dropped.

(It was fixed within a few days.)
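
The symptoms follow directly from dropping the tenth digit, as this short Python sketch shows. The truncation function is a guess at the behavior described above, not Remedy's actual code.

```python
from datetime import datetime, timezone


def truncate_to_nine_digits(timestamp):
    """Assumed bug: keep only the first nine decimal digits of the timestamp."""
    return int(str(timestamp)[:9])


def as_utc(timestamp):
    """Interpret a Unix timestamp (seconds since 1970) as a UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


if __name__ == "__main__":
    rollover = 1_000_000_000                      # first 10-digit timestamp
    print(as_utc(rollover))                       # the real moment, in UTC
    print(as_utc(truncate_to_nine_digits(rollover)))   # what the tool saw
    # Ten real seconds later, the truncated clock has advanced by one.
    later = rollover + 10
    print(truncate_to_nine_digits(later) - truncate_to_nine_digits(rollover))
```

The first 10-digit timestamp falls on 9 September 2001 in UTC, which was still the evening of 8 September in US time zones, and the truncated value lands in March 1973. Since the dropped digit is the seconds' ones place, the displayed clock ticks once per ten real seconds.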

[+] chongli|4 years ago|reply
I'm immediately reminded of NetHack [1] (which you can play online here [2])! Real-life phase of the moon has a small but important effect on gameplay in NetHack. The public game server on alt.org even tracks the phase of the moon to provide a handy reference.

A little bit disappointing to discover that the code from the article does not actually depend on the phase of the moon. I'm really interested to see the other stories here where it actually is the case that the phase of the moon is affecting people's code.

[1] https://www.nethack.org

[2] https://alt.org/nethack/

[+] tclancy|4 years ago|reply
Slightly related, I once wrote code for the old Kill Screen site that subtly altered the graphics of a review of a game themed around lunar stuff based on the current phase of the moon.

God bless gusts of random math people leave about.

[+] thadk|4 years ago|reply
I've resolved several "celestial body" problems with routers and modems in East/West Africa over the years by pointing USB fans at them. Between the sun and the workday load generating heat in lower-end routers or insufficiently air-conditioned units, a fan can work surprisingly well to improve the network at almost no cost.
[+] lmilcin|4 years ago|reply
I have seen interference from one part of a set-top box cause noise on the flash input lines and sometimes issue a command to clear the flash on the device.

Months of debugging, dozen people involved, tens of thousands of devices bricked, tens of millions lost.

All due to a single line of code that configured the flash to not require special magic before each command. This feature, made to improve resistance to interference, also hindered performance. Somebody thought it a good idea to disable it to get some points for improved performance.