runtime: use CLOCK_BOOTTIME, not CLOCK_MONOTONIC, when possible #24595

ianlancetaylor · 2018-03-29T14:58:33Z

On GNU/Linux since 2.6.39 clock_gettime supports CLOCK_BOOTTIME, which is equivalent to CLOCK_MONOTONIC except that it also accounts for time when the computer is asleep. We should prefer CLOCK_BOOTTIME to CLOCK_MONOTONIC when it is available. That will permit time.Time.Sub to return the correct value across periods when the computer has gone to sleep.

See #23178.

rsc · 2018-03-30T02:47:30Z

Define "correct". I am not convinced.

rsc · 2018-03-30T02:54:49Z

Sorry. That was too short.

I think it's plausible that some people will want to exclude the time when the computer is asleep. I will also note that - I believe - the monotonic time always matches runtime.nanotime, and I'm not sure the runtime timers necessarily want to include the time when the computer is asleep.

Or imagine calling time.NewTicker(1*time.Hour) to do some kind of cleanup once per hour. If the computer was off for 50 minutes, do you want the tick after 10 minutes of program execution or not? I'm not sure you do.

I can also imagine programs depending on one or the other behavior. If all current systems agree that monotonic time does not include time the computer is asleep, then I'm not sure we should introduce a Linux-specific variation. (I'm assuming - maybe incorrectly - that other systems typically don't support this.)

My point is only that I don't think it's a slam dunk we should definitely do this. I'd like to know more about what other systems do. For example, do Python and Java's monotonic time measurements include time spent asleep? How many operating systems provide access to time spent asleep?

mpx · 2018-04-01T04:14:49Z

There is an interesting comparison to be made with suspending a single Go process (SIGSTOP/delay/SIGCONT). Over a shorter time frame an overloaded system might look similar as well - real time passes without any opportunities to process events due to cpu starvation.

Processes can't control whether they experience some form of cpu starvation, either via system suspend, process suspend, or overload. To be reliable, they probably need to handle events occurring without much processing time in between. E.g., a process doing an hourly cleanup needs to handle the cleanup occurring "too soon"; otherwise it should track against the amount processed, not real time.

It is surprising that t1.Sub(t0) doesn't return duration in realtime, but an ambiguous "time spent while system (not necessarily process) was running". Other languages without Go's monotonic aware time.Now will return the expected realtime duration in a similar situation.

mpx · 2018-04-16T14:26:52Z

CLOCK_MONOTONIC will behave the same as CLOCK_BOOTTIME as of Linux 4.17 (advancing time while suspended). Details here.

This may be a good argument to use CLOCK_BOOTTIME for consistency between kernels.

ash2k · 2018-04-16T23:28:29Z

It is surprising that t1.Sub(t0) doesn't return duration in realtime, but an ambiguous "time spent while system (not necessarily process) was running". Other languages without Go's monotonic aware time.Now will return the expected realtime duration in a similar situation.

If I understand correctly, this is also a change in behaviour compared to the previous implementation when time.Now was not monotonic aware.

zx2c4 · 2018-05-04T01:41:56Z

Oh, I just opened a duplicate of this. Yes, please make this change. Without it, implementing certain network protocols reliably for Android devices is really miserable.

gopherbot · 2018-05-04T01:52:18Z

Change https://golang.org/cl/111356 mentions this issue: runtime/linux: use CLOCK_BOOTTIME in nanotime()

bradfitz · 2018-05-04T01:57:06Z

https://github.com/golang/go/wiki/MinimumRequirements#linux says we currently require Kernel version 2.6.23.

Would this require bumping our minimum kernel version?

Or would we use this conditionally at runtime?

zx2c4 · 2018-05-04T02:09:26Z

BOOTTIME requires 2.6.39 indeed. I just checked old kernels, and fortunately clock_gettime returns -EINVAL if you pass it too new of a value, which means you could easily try BOOTTIME and fall back to MONOTONIC, if you're on a kernel that doesn't support it. Or you could just raise the minimum kernel requirement. Either way is fine with me. (It's worth noting that the oldest kernel supported by kernel.org is 3.2.)
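To make the try-then-fall-back idea concrete, here is a rough user-space sketch. It is not the runtime's nanotime implementation and not the code in the CL. It is Linux-only, calls clock_gettime(2) through the raw syscall package, and uses the clock ID values from the Linux headers (CLOCK_MONOTONIC = 1, CLOCK_BOOTTIME = 7). The names clockGettime and bootOrMonotonicNanos are made up for this example.

```go
package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

// Clock IDs as defined in the Linux uapi headers.
const (
	clockMonotonic = 1
	clockBoottime  = 7
)

// clockGettime wraps the raw clock_gettime(2) system call and returns
// the reading in nanoseconds.
func clockGettime(clockID uintptr) (int64, error) {
	var ts syscall.Timespec
	_, _, errno := syscall.Syscall(syscall.SYS_CLOCK_GETTIME, clockID, uintptr(unsafe.Pointer(&ts)), 0)
	if errno != 0 {
		return 0, errno
	}
	return int64(ts.Sec)*1e9 + int64(ts.Nsec), nil
}

// bootOrMonotonicNanos prefers CLOCK_BOOTTIME and falls back to
// CLOCK_MONOTONIC only when the kernel rejects the clock ID with EINVAL.
func bootOrMonotonicNanos() (ns int64, clock string, err error) {
	ns, err = clockGettime(clockBoottime)
	if err == nil {
		return ns, "CLOCK_BOOTTIME", nil
	}
	if err != syscall.EINVAL {
		return 0, "", err
	}
	ns, err = clockGettime(clockMonotonic)
	return ns, "CLOCK_MONOTONIC", err
}

func main() {
	ns, clock, err := bootOrMonotonicNanos()
	if err != nil {
		fmt.Println("clock_gettime failed:", err)
		return
	}
	fmt.Printf("%s: %d ns\n", clock, ns)
}
```

A real implementation would presumably probe once and remember the result rather than retrying the unsupported clock on every call.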

slrz · 2018-05-07T20:44:14Z

Note that the change to make CLOCK_MONOTONIC behave like BOOTTIME in Linux 4.17 got reverted quickly due to causing compatibility issues. It won't happen, at least not for a very long time, if ever.

https://www.spinics.net/lists/linux-tip-commits/msg43709.html

zx2c4 · 2018-05-07T20:46:12Z

Bummer, though still not a reason for us to inherit Linux's legacy problems.

slrz · 2018-05-07T20:59:32Z

Agreed. It's just that the same kind of issue (like the systemd watchdog timeouts triggering on resume) may very well lurk in Go programs, too. It's probably worth a try, reverting the change before release if things blow up.

zx2c4 · 2018-05-09T15:36:17Z

Welp, this is unfortunate, but I'm doing this now downstream: https://git.zx2c4.com/wireguard-android/commit/?id=7f8799a4d44058d2fe0981841b8b6d825f97aee7

(The good news is that it actually works pretty well.)

eliasnaur · 2018-05-20T06:06:45Z

FWIW, macOS added mach_continuous_time in macOS 10.12. It is like mach_absolute_time but advances during sleep.

FiloSottile · 2018-05-22T16:59:41Z

Tagging this NeedsDecision as the CL is blocked on @rsc approving.

While I see the risk of having the same timeout issues the Linux kernel had, I think our case is different because it’s exposed through an interface that normally refers to real time. The best argument I’ve seen for this change is that without it time.Since can return very surprising results.

zx2c4 · 2018-05-22T17:03:56Z

The best argument I’ve seen

Without this change, stateful network protocols are also extremely difficult to implement, particularly cryptographic ones, where cleaning up old keys is important.

rsc · 2018-06-25T20:44:36Z

OK, feel free to try this early in Go 1.12.

zx2c4 · 2018-06-25T21:54:55Z

Okay. I'll set a reminder for myself to do this when the 1.12 tree opens in August.

zx2c4 · 2018-09-04T17:48:23Z

OK, feel free to try this early in Go 1.12.

This has now been submitted at https://go-review.googlesource.com/c/go/+/111356 for merging.

andybons · 2019-10-15T17:34:43Z

It's too late in the cycle for this to land in 1.14 (trusting the early-in-cycle classification). @zx2c4 if you still plan to work on this you are more than welcome to do so, but it can't land until the 1.15 tree opens in Feb 2020. Feel free to move it to the Go1.15 milestone if that's the case.

zx2c4 · 2019-10-15T17:39:36Z

Indeed it's too late, and this actually requires some other changes too: we have to use futex and semaphores with sleep that corresponds with BOOTTIME.

andybons · 2019-10-15T18:49:04Z

Thanks for the update, @zx2c4 :)

networkimprov · 2019-10-20T09:04:10Z

Maybe this calls for a new stdlib API, since changing the way time.Timer works isn't workable for all callers.

How about type time.WallTimer?

ianlancetaylor · 2019-10-20T14:06:56Z

I believe that most systems are moving toward incrementing monotonic time while the system is asleep. As @mpx notes above, the Linux kernel is making CLOCK_MONOTONIC the same as CLOCK_BOOTTIME.

I'm not convinced that we need two different kinds of timers.

zx2c4 · 2019-10-20T15:02:36Z

I'm not convinced that we need two different kinds of timers.

I'm not convinced either. One is fine. And we should simply make them all function as CLOCK_BOOTTIME. Windows is already there. The other platforms will require a bit of poking and prodding. I'll see if I can find some time to have another try at it soon, since I'm shipping a patched Go runtime anyway for this.

networkimprov · 2019-10-20T19:23:53Z

The Linux kernel change was apparently abandoned; #24595 (comment)

I don't understand why you'd eliminate the absolute timer. I use timers in select for channel receive ops where I'm waiting on an ack. If the timer fires, I retry whatever wants the ack. If the system sleeps, I wouldn't have got the ack, so I wouldn't want the timer to fire on resume.

If Go timers don't work the same on Windows as other platforms, that's a likely source of porting problems, and it's not documented, yikes.

gabzim · 2019-10-21T04:56:00Z

If the system sleeps, I wouldn't have got the ack, so I wouldn't want the timer to fire on resume.

This is just one use case; there are plenty of others that need different behavior. Personally, I think it's not intuitive that we use a clock that's not advanced when the computer sleeps. IMHO, the computer sleeping during execution is a detail that should be transparent to you; otherwise you can't trust your timers.

Given this code:

package main

import (
	"time"

	"github.com/sirupsen/logrus"
)

func main() {
	now := time.Now()
	tenMinsFromNow := 10 * time.Minute
	timer := time.NewTimer(tenMinsFromNow)

	logrus.WithField("now", now.Format(time.RFC1123)).WithField("shouldFireAt", now.Add(tenMinsFromNow).Format(time.RFC1123)).WithField("expectedDelay", tenMinsFromNow).Infof("starting alarm...")
	// Blocks until the timer fires; firedAt is the time value the runtime sends on timer.C.
	firedAt := <-timer.C

	logrus.WithField("now", time.Now().Format(time.RFC1123)).WithField("actuallyFiredAt", firedAt.Format(time.RFC1123)).WithField("actualDelay", firedAt.Sub(now)).Infof("ending alarm...")
}

If the computer goes to sleep while this is blocked on <-timer.C, then when it comes back the timer will fire much later than the given time (depending on how long it slept), and actualDelay will still read 10 mins (because that's the monotonic clock difference). In my use case, I'm triggering notifications & alarms to users. I don't care if the computer sleeps in between; I need those notifications to go out at the right time or when the computer wakes. The fact that it slept between the time the alarm was set and the time it fired is irrelevant to me.

The end result is that timers are no longer predictable. If I set it to fire in 10 mins, will it? Who knows: maybe, or maybe it'll fire in 15 mins, maybe 20. Using CLOCK_MONOTONIC means time stops (when the computer sleeps). I think it's safer to say that the more intuitive use case is one in which time doesn't stop.

networkimprov · 2019-10-21T05:14:13Z

@gabzim you missed my suggestion for type WallTimer above, which does what you want.

mpx · 2019-10-22T00:40:23Z

Programs necessarily receive control some time after the delay specified, ideally not too long afterwards. However, there are many reasons why a program may not get control "quickly":

CPU starvation / scheduling, memory paging, ...
SIGSTOP / SIGCONT
Suspend / resume

The best we can hope for is that the OS/runtime doesn't unnecessarily delay providing control to the program. Programs need to be written to cope with this uncertainty and can't assume some amount of processing has occurred in between timers firing; otherwise they are likely buggy.

In this case, I don't think we need a new timer, we just need the runtime to provide control as close to wallclock duration as is practical. At least, that allows the program to delay further if it is more appropriate.

networkimprov · 2019-10-22T02:23:41Z

@as has suggested the API func NewTimerAt(t Time) *Timer here #35012 (comment)

A variation of that is func NewTimerWall(d Duration) *Timer which is easier to apply to existing code.
EDIT: alternative name: func NewTimerBoot(d Duration) *Timer

Either would provide the functionality requested by this issue, and both are preferable to my previous suggestion of type WallTimer.

networkimprov · 2019-11-09T20:24:38Z

The Windows runtime was changed in 1.13.3 to the boot-time model, without reference to this discussion. Microsoft has asked us to revert that. See #35482

@rsc @ianlancetaylor there is clearly a need for a boot-time timer; would the Go team look favorably on a proposal for func NewTimerBoot(d Duration) *Timer or similar?

mpx · 2019-11-12T04:36:21Z

The arguments behind selecting BOOTTIME or MONOTONIC are fairly nuanced (with the only difference being across suspend). I'm not convinced there is that much value adding a second timer, and even if it's added, that developers will evaluate or correctly determine the best choice for their application.

time.Time.Sub and time.Since currently return invalid results across suspend. time.Time needs to use BOOTTIME to fix this. The current situation is surprising and likely to cause hard to debug effects. This is unique to Go, so wouldn't have been a factor on other platforms when deciding the balance between BOOTTIME and MONOTONIC.

If time.Time uses BOOTTIME, it makes sense that timers would too.

This isn't perfect in all situations; there are advantages and disadvantages in both directions. With MONOTONIC the timer may expire long after the application intended, with no recourse. With BOOTTIME the timer may expire before the application has done "enough work" (but it can at least adapt and set another timer).

With BOOTTIME buggy use of timers may fail; with MONOTONIC correct use of timers may fail (e.g., #25248, #35012, #29308).

networkimprov · 2019-11-12T05:51:05Z

Quoting @jstarks from #35447 (comment):

The Windows kernel team changed timer behavior in Windows 8 to stop advancing relative timeouts on wake. Otherwise when you open your laptop lid, every timer in the system goes off all at once and you get a bunch of unpredictable errors. Software is generally written to assume that local processes will make forward progress over reasonable time periods, and if they don't then something is wrong. When the machine is asleep, this assumption is violated. By making relative timers behave like threads, so that they both run together or they both don't, the illusion is maintained. You can claim these programs are buggy, but they obviously exist. Watchdog timers are well-known constructs.

This was a conscious design decision in Windows, and so it's disappointing to see the Go runtime second guess this several years later in a bug fix. [see #34130]

As far as behavior on Linux, there is clearly no consensus in issue #24595, which discusses this same problem. And indeed you can see that the CLOCK_MONOTONIC/CLOCK_BOOTTIME convergence was reverted in the kernel exactly because of the reason we stopped advancing time in Windows: random code has random failures due to timeouts. See https://lkml.org/lkml/2018/4/26/929 for a brief justification.

networkimprov · 2019-12-12T19:45:22Z

Re NewTimerAt() and related ideas, see #35482 (comment)

ianlancetaylor · 2019-12-13T02:26:51Z

I'm withdrawing this proposal. The lengthy discussion across #31528 and #35482 and related issues has convinced me that this is ill advised.

gopherbot · 2019-12-13T20:10:45Z

Change https://golang.org/cl/211307 mentions this issue: runtime: use monontonic time consistently on Windows

ianlancetaylor added the NeedsFix The path to resolution is known, but the work has not been done. label Mar 29, 2018

ianlancetaylor added this to the Go1.11 milestone Mar 29, 2018

cespare mentioned this issue May 4, 2018

runtime: use CLOCK_BOOTTIME for nanotime() #25248

Closed

FiloSottile added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed NeedsFix The path to resolution is known, but the work has not been done. labels May 22, 2018

rsc modified the milestones: Go1.11, Go1.12 Jun 25, 2018

ianlancetaylor mentioned this issue Dec 17, 2018

net/http: investigate Transport's use of cached connections upon resume from sleep #29308

Closed

smasher164 added this to the Go1.14 milestone Oct 11, 2019

andybons modified the milestones: Go1.14, Unplanned Oct 15, 2019

ianlancetaylor mentioned this issue Oct 19, 2019

time: NewTimer firing later if computer sleeps, how to use wall clock? #35012

Open

jstarks mentioned this issue Nov 9, 2019

runtime: "fatal error: PowerRegisterSuspendResumeNotification failure" when running in Windows docker containers #35447

Closed

networkimprov mentioned this issue Nov 9, 2019

runtime: Timer buckets may get "stuck" for long periods of time after Windows 8/10 systems wake from sleep #31528

Closed

networkimprov mentioned this issue Nov 9, 2019

runtime: revert Windows change to boot-time timers #35482

Closed

ianlancetaylor closed this as completed Dec 13, 2019

networkimprov mentioned this issue Dec 14, 2019

time: add ExternalNow, etc for external time and timers #36141

Open

anjmao mentioned this issue Feb 20, 2020

Fix time.Sub after device resumes from sleep mysteriumnetwork/node#1749

Closed

WGH- mentioned this issue May 10, 2020

Clock is incorrent after suspend soumya92/barista#150

Closed

golang locked and limited conversation to collaborators Dec 12, 2020

gopherbot added the FrozenDueToAge label Dec 12, 2020
