One second per second is harder than it sounds

[+] ChuckMcM|11 years ago|reply

For a long time synchronized clocks was the networking code's equivalent joke to Regular Expressions ("Now you've got two problems ...") That said, having implemented a 100ns accurate clock using the PPS on a $30 GPS module (Adafruit) and a Beaglebone Black (multiple) these days there are some interesting alternatives. It would be interesting if the Open Compute folks would specify a 1PPS GPIO input as part of the spec, you could easily deliver rack level 1PPS accurate signals throughout the data center pretty cost effectively (a few wires, a transmitter, a TDR tool. Not that it will ever be a 'solved' problem but it certainly makes some transactional protocols safer.

[+] rdtsc|11 years ago|reply

I've used PTP with special timestamping hardware (network cards) and could synchronize machines on a LAN pretty well. Now that doesn't guarantee global sync just sync between machines.

GPS NTP server can be obtained for as low as $600 or so. So those are becoming not as exotic now.

[+] sp332|11 years ago|reply

"A man with one watch always knows what time it is. A man with two watches is never quite sure..."

[+] kiram9|11 years ago|reply

There is a standard to keep time on LANs now: IEEE-1588 which is built into a lot of new network cards. It enables the NIC to time stamp incoming packets which can be used to achieve sub microsecond timing.

Normally it can be implemented with a single GPS receiver on the network that then feeds all servers/devices. It is starting to become very common in broadcast applications as a reference. Furthermore most manufactures already offer some application notes/libraries to make use of it. And linux offers PTPd which implements the standard.

http://ptpd.sourceforge.net/

http://en.wikipedia.org/wiki/Precision_Time_Protocol

http://www.nist.gov/el/isd/ieee/ieee1588.cfm

[+] bradknowles|11 years ago|reply

PTP is a good choice for synchronizing clocks on a per-segment basis, but it doesn't work across a WAN and there are lots and lots of devices out there that don't support it.

NTP is good for doing time sync across the WAN, but to get the most out of it for certain applications you need to hook it up to reference clocks.

So, NTP can use PTP as a reference clock, and Bob's your uncle.

[+] dm2|11 years ago|reply

28 people killed because a clock was off by .34 seconds (system hadn't been restarted in 100 hours)

http://fas.org/spp/starwars/gao/im92026.htm

[+] manojlds|11 years ago|reply

Isn't it actually "28 lives couldn't be saved" because of the clock issue. Also, the article states that the defense was never tried against SCUD and could have failed nevertheless.

[+] chubot|11 years ago|reply

I think the question is: when does it matter? What applications rely on an accurate clock? I guess it would matter if you are running some algorithm on logs that uses time as an input. Interested to hear any other examples.

DJB has a clockspeed package that addresses this problem. Anyone used it?

http://thedjbway.b0llix.net/clockspeed.html

It seems interesting but I haven't (yet) run into the need for super accurate clocks (or perhaps I am not running enough machines).

[+] sophacles|11 years ago|reply

Power grids require highly accurate, distributed clocks.

Well they don't strictly require them, but having that capability increases efficiency, robustness and safety.

They need to be accurate because electricity is fast - so a lot of protection schemes need to have a decision made across very long distances (which breaker, at which point on a 100 mile long transmission line should be fliped to isolate faults or stop "bad things" propagation) with minimal impact to the system as a whole.

They also need to be accurate because a good measure of frequency and phasor at geographically disparate points allows system tuning. A tiny change - e.g. speed up one of the 100s of generators could result in a much smoother, overall more-efficient transmission system. Similarly, adjustments to capacitor banks and transformers can provide slight phase adjustments for optimal power flow. But these resources are hundreds of miles apart - e.g the western half of the US and Canada (and parts of Mexico) (except for texas) is one big system.

Being a large, physical system, with feedback, and also being electricity, you get some very interesting effects to try and reason about. There can be very slow (relatively) oscillations in the system, but at the same time you have many, many high frequency noise signals mixed in to the elecrical signal. To find trends and patterns for the slow oscillations you need a very high sample rate to determine what noise to ignore. And these samples need to be time aligned, because even a perfect wave will have different phases at different points along it's propagation.

Anyway - that's an example I'm familiar with. I hope it made sense, seeing as I may not have had quite enough coffee yet.

[+] keeperofdakeys|11 years ago|reply

As soon as you have any kind of highly-distributed system, accurate clocks are usually very important. When events can start on one machine, and end on another, it's important to have some kind of consistency (knowing that one event occurred before another). Probably the most famous paper on this is Lamport's "Time, clocks, and the ordering of events in a distributed system" - http://dl.acm.org/citation.cfm?id=359563.

But to give a practical example. On something like facebook, if you receive two comments on a post within a millisecond, how do you determine the order that they appear in the list? Each request may hit a different datacentre in the world, yet somehow the machines determine the ordering, and every request sees that ordering from then on. You can't just have a single server, because that couldn't handle the load for the entire world. These kinds of things require very strict temporal ordering, which is why time is such an important thing. This leads to the CAP theorem, and distributed systems in general.

[+] michaelt|11 years ago|reply

Clock years out of sync: All SSL certificates appear invalid.

Clock minutes out of sync: "Here's a temporary security token, I don't think it's expired" "That credential has expired" "This never came up in testing, I think I'll crash"

Clock seconds out of sync: Log files on different servers can't be merged reliably. Different requests went to different load-balanced servers and now you can't tell what order the logged events happened in.

[+] MichaelApproved|11 years ago|reply

Communicating with other servers is a good example.

I need to have an accurate clock to within a few seconds when I sign Amazon S3 files. I have code that gives access to restricted files on S3. To give access, the customer makes a request from my server, I authenticate, sign a new URL and redirect them to the signed S3 URL.

For security reasons, the URL gets a signature that expires after a few seconds. The expiration date is based on X seconds from my servers clock. If my servers time differs from S3 server time then, according to S3, the link could be expired before it's even created.

In theory, a redirect takes a second or less to complete so you'd think an expiration that's 5 seconds in the future would be enough. The reality is, over time, the server clock can differ as much as 1 minute or so from S3's clock. So, I just set the expiration to be 120 seconds into the future and call it a day.

[+] ajanuary|11 years ago|reply

For a bunch of distributed systems nodes need to be roughly matching real time, but also within some tollerance of each other; it's simplest to make everything as accurate to real time as possible.

I work in broadcast automation. It's a distributed system and the clocks need to be within a certain tolerance of each other. Historically that's done by having a dedicated piece of timekeeping hardware that dictates the station clock, but recently we've been experimenting with using NTP.

We have one node (A) where you author time-sensitive events, and another node (B) that polls for pages of those events, loops through them and takes some action when they should occur. Without both nodes agreeing somewhat on the time you'll end up missing events.

[+] superpatosainz|11 years ago|reply

IRC: everything is timestamped to the subsecond. Any drifting causes channel takeovers and "random" nick kills, even links fail. Also, since a message from a server is broadcasted across the entire network, one misconfigured server can cause chaos in your hubs or completely unrelated leafs.

Or at least that's how I remember my experience managing an unstable (as in ddosed, every irc service in active development, a few in-house bots, etc) IRC network and having read the IRC and IRCv3 specs and unreal's and TS6 server message specs. If I'm wrong I'm sure someone here could correct me.

[+] dsymonds|11 years ago|reply

Several security protocols (e.g. Kerberos) depend on systems having a somewhat reliable clock. They'll issue some sort of authentication token on system A that a user is meant to present to system B; the token has some expiry time, and the security of the system depends on systems A and B having comparable clocks.

[+] tmarthal|11 years ago|reply

I know that offhand gaming and financial services rely on very accurate clocks at the application level. And you can't necessarily rely on the clients to tell you what their time is, as it can be spoofed.

[+] rayiner|11 years ago|reply

Wireless networking protocols may very precise time synchronization at the PHY/MAC layer. Generally, wireless technologies like WiMAX or LTE divide time into frames (e.g. LTE is 10 ms). Then depending on whether you're frequency duplexing (FDD) or TDD (time duplexing), the frames are divided into subframes with defined meaning. If different radios on the network don't agree as to frame start/end times, interference occurs.

Here's an interesting paper describing the software implementation of certain features that are usually handled by the hardware (well, by software running on the baseband processor): http://people.freebsd.org/~sam/FreeBSD_TDMA-20090921.pdf.

[+] jrockway|11 years ago|reply

I've used clockspeed. tlsdate + clockspeed turns out to be pretty reliable, and a lot less code than ntpd.

[+] jfoutz|11 years ago|reply

spanner? https://en.wikipedia.org/wiki/Spanner_(database)

[+] spydum|11 years ago|reply

Since the advent of virtualization (most notoriously VMware ESX) time keeping has gotten to be much more difficult. I know back in my unix admin days, I spent more than a few days on problematic Linux kernels on VMs under ESX (made far worse when hosts were over provisioned and VMs were starved for CPU readiness). I know later enhancements like tick-less kernels may have helped, but I do not envy anyone who has to wrestle these beasts.

[+] ghshephard|11 years ago|reply

I spent about 2+ weeks in early 2003/2004 trying to get VMware Server to properly keep the time in Sync. I had memorized the VMware whitepaper on the topic, and had followed every possible bit of advice they offered (including not using NTPD) - nothing worked. Eventually just cronned ntpdate to run every minute and that resolved the issue on all our systems.

This, by the way, horrified all our NTPD theoreticians, who made it clear that running ntpdate was going to cause catastrophic things to occur in our Operations environment - but, given that our logs were all getting progressively more and more useless as we were unable to correlate times for events between servers - the worst case scenario (in my mind) had already occurred.

I don't recall any particularly negative side effects as a result of our ntpdate sledgehammer.

I presume things have gotten better in the last 10 years with VMware.

[+] MichaelGG|11 years ago|reply

Windows time sync only cares about being accurate to 5 !minutes, for Kerberos. HyperV and makes Windows not even able to be that accurate. Oddly enough, Linux guests have no trouble with subsecond accuracy.

I'm told the w32time codebase is not something MS employees like to look at.

[+] isomorphic|11 years ago|reply

I run various production systems in both VMware ESX and VirtualBox, and I have time issues in all VMs, regardless of hypervisor vendor. The VM tools are installed and operating "properly." The host machine clocks are synced correctly with ntp. Yet the VMs will sometimes get tens of minutes out of alignment in short order.

At this point I'm thinking I need to run ntp on the physical hardware with the least jitter, then just run ntpdate in a cron job on all the VMs. This would work better than ntp clients in the VMs or the VM tools.

[+] joevandyk|11 years ago|reply

Every so often, my linux vm running on VMware fusion gets really out of date. I tried the "sync time" feature in VMware, didn't fix it.

Eventually I had ntpdate running every minute via cron. Pretty annoying.

[+] lamontcg|11 years ago|reply

Run a cronjob every hour which:

a. ensures ntpd is running and starts it if its not.

b. checks `nptq -rv` and looks for the self-reported state of ntp and bounces if it is not sync'd.

c. 'manually' checks ntpdate for the drift against a known stratum 1 and bounces the daemon if the drift is out of bounds (something like 50ms is a pretty good cutoff since ntpd normally does much better than that).

You can run this out of chef/puppet/whatever but it needs to run with a frequency of about an hour (faster than that and ntpd often won't settle well, and slower than that and you can be needlessly waiting too long to fix problems). Running it as a cronjob uncouples it from how often your config management runs.

This will catch all kinds of issues -- crashing ntpds, ntdps that randomly lose sync, kernel problems that keep them from keeping sync, ntpds that lie about their sync status, etc.

I set it up to e-mail me as a form of monitoring (with heavy-duty procmail filters).

Had 30,000 servers with this script running against it. I think 6 of them had shitty clocks that were getting reset every single hour, but were still keeping time correctly to under a second. Every couple days there would be a server that'd flip out and send a few e-mails until it quieted down and synch'd. At one point we had a shipment of several hundred new servers that all had issues with kernel drivers and the cronjob was bouncing ntpd constantly on them until a kernel upgrade fixed the problem.

I also used to bounce ntpd once a night, but had to take that out because it would cause non-monotonic slew in the clock which timers around service calls would turn up as negative seconds that due to unsigned int conversion would turn into 4 billion second p100 times.

You do want to be careful about things like network outages, if you can't ping your upstream stratum1/2 its probably better to leave things as it is. And again, you really want to limit the frequency that you bounce it at, and you want to be aware of servers where it gets into a state where you're bouncing ntpd constantly.

[+] easytiger|11 years ago|reply

This is a terrible idea when dealing with an application other than "users checking the time on their system clock".

Performing hard resets of the clock continually is a catastrophic infrastructure failure. A high performance 24/7 application that relies on sub millisecond let alone sub microsecond accuracy across multiple physical systems will be utterly destroyed by this approach.

[+] kijin|11 years ago|reply

In the age of mobile phones that can get the time directly from GPS satellites with sub-100ns accuracy, the isolated hardware clock of a typical server strikes me as an arcane method of keeping time.

Is there an expansion card or USB device that can get the time from GPS and feed it to something like ntpd (and adjust for leap seconds from data downloded from other networks)? Or would that be impractical because of all the walls and metal cages that surround the typical server? What about an antenna on the roof of the datacenter that supplies GPS time to every server in the building?

[+] rachelbythebay|11 years ago|reply

Datacenter GPS-backed time sources exist. You can get little boxes which sit in a rack and hook to an antenna and feed nice time to the rest of the place. Those other machines pick it up with ntpd.

The problem is when your system clock is so far out of whack (and/or unpredictable) that even ntpd won't help you. Get enough supposedly-identical machines running and this will probably bite you too.

[+] cnvogel|11 years ago|reply

While most smartphones, theoretically, could get their time from GPS... do any use this method? As far as I know, my Android handsets get their time via NTP.

Ironically, often a misconfigured NTP server screws up GPS fixes (it somehow seems to be necessary for assisted GPS, which relies on the cooperation of the mobile network or Internet to faster acquire a GPS fix).

http://stackoverflow.com/questions/8308412/gps-how-ntp-time-...

[+] x1798DE|11 years ago|reply

>In the age of mobile phones that can get the time directly from GPS satellites with sub-100ns accuracy, the isolated hardware clock of a typical server strikes me as an arcane method of keeping time.

Genuinely curious, is it actually true that you can get sub-100ns time accuracy on a mobile phone? Is this because you have a direct connection with the satellite, which is calculating where you are? I would have assumed that the uncertainty on the latency of any such connection would be higher than 100 ns.

[+] pjc50|11 years ago|reply

Yeah, GPS doesn't really work indoors. You can certainly get such devices, or even do it with consumer GPS, but then you have to account for the latency of USB.

[+] rdtsc|11 years ago|reply

You can buy GPS receivers and install antennas. Those are pretty good but expensive.

For S2, it can be configured to never jump. Instead you can just the time manually (run the equivalent of 'ntpdate <server>') at boot perhaps. Then don't start your software until ntp has been synchronized. Then maintain a watchdog that will alert you if you drop to a too low of a stratum.

Don't have an answer for S4 and S5. But I have heard that chrony is a good replacement for ntp (same network protocol but different discipline, faster conversion).

http://chrony.tuxfamily.org/

For a whole new other thing there is PTP (https://en.wikipedia.org/wiki/Precision_Time_Protocol). This is good if you want to synchronize machines to each other to withing a couple of hundred micros (maybe more with special network cards and switches).

[+] kastnerkyle|11 years ago|reply

The question is - how often do you reboot? For long running high uptime applications "ntpdate at boot" doesn't cut it... and the other solutions are nasty. In my experience, ntpd will ignore clocks almost at a whim even if they are stratum 0! Some stratum 6 clock decides to report a different time, and suddenly all your other sources are ignored until an ntpd restart. If your clock drifted too much... even ntpd restart with options to "force the time" can fail if the clock is too far out of whack. The you are back to forcing ntpdate - hope none of your processes are time dependent when system time jumps by 3 or 4 seconds.

Half of the GPS PCI devices require you to hack ntpd directly to get anything working, and the Ethernet sources have a strange tendency to go belly up after a year or 2 of running on non-120V sources. Not to mention that if you run the antenna cable too long, you might not notice that your GPS amplitudes are a little low until the next cloudy day. Or if someone adds a new hunk of metal on your roof, or blocks line of sight to some portion of the sky...

Long story short, time is a nightmare and I am glad other people think so too.

[+] bradknowles|11 years ago|reply

With appropriate configuration, you can get ntpd to settle down and consider itself to be fully synchronized in a matter of seconds after startup.

You just have to configure an adequate number of upstream time servers, add judicious use of iburst, and you should be good to go.

On my laptop, I usually get good timesync within ten to fifteen seconds of startup.

[+] socialist_coder|11 years ago|reply

I have a problem in my AWS Windows cluster with the clocks as well. We have a mobile game with a tournaments feature, and tournaments can be as short as 5 minutes. So, it's important that all the clients get the same "end time".

Normally, at least 10% of the VMs are between 5 and 20 seconds off from the average.

By default, the VMs are syncing their time to an NTP server, but it just doesn't seem to be enough. I haven't changed any settings as of yet.

What are some of the things I could do on Windows box to sync up the clocks in my cluster? Scheduled shell scripts to force more clock syncs? Or maybe something on the application level?

[+] bradknowles|11 years ago|reply

First off, Windows isn't the greatest client for trying to get good timesync. The way the OS usually handles interrupts isn't sufficiently precise and accurate. That said, it's not uncommon for real-hardware Windows clients to get decent timesync down into the double-digit millisecond range -- usually about ~40-50ms or so.

Second, you're trying to run them in a virtualization environment, and that plays holy hell with whatever good timesync you might otherwise have been able to get. And AWS is probably about as bad as you could reasonably get in this regard -- VMWare has had a long time to get to where they are with regards to timesync for client OSes, and that's not particularly good.

So, your best bet may be to run NTP clients on the Windows machines, but to set them up to run at an artificially high rate of updates. Or maybe even run ntpdate periodically against good real-hardware time servers that are available to you.

You're starting off with both hands tied behind your back, and that's a hard place to come back from.

[+] easytiger|11 years ago|reply

Err no mention of PTP at all?

There are hardly any oscillators which can be plugged into a host which have any decent holdover. One should never expect them to run in reasonable time. Standard TCXO prob losses 100 ms a day. Rubidium loses 0.01 ms a day. Time providers include the perceived accuracy of the clock in the protocol so that the client can make a decision about which time source to use.

Relying on the oscillator is never done in time critical applications except in emergency states where GPS is out of service.

Of course a lot of people run relative clock domains, which is just a pretty bad idea for most applications.

[+] ars|11 years ago|reply

I had an S5 motherboard (ASRock P55 Delux), after pulling out my hair I finally gave up on it and bought a new motherboard.

And I made a resolution to never ever buy the V1 of anything. (It was a brand new Intel chipset, and a brand new motherboard design to go with it.)

If everyone did like me and never bought V1 then there would never be a V2, I don't have a solution for that, but I refuse to be the guinea pig.

[+] makomk|11 years ago|reply

I had a desktop PC a few years back that exhibited S5. Can't remember the exact details, but I think it turned out that Linux was using a time source which wasn't stable on that particular hardware and switching to a different time source fixed it. There's been a lot of issues like that in the past, most of which are thankfully no more.

[+] feld|11 years ago|reply

FreeBSD timecounter in detail:

http://phk.freebsd.dk/pubs/timecounter.pdf

[+] jchonphoenix|11 years ago|reply

Um... In large scale systems... vector clocks? Or am I not reading this correctly (I skimmed)

[+] MrBuddyCasino|11 years ago|reply

Not sure why downvoted, vector clocks are an interesting alternative. Its just that they won't fix your log timestamps or kerberos clock skew errors.

97 comments