Stop relying on time #4

paddycarver · 2012-10-22T07:11:08Z

Using real time in a distributed application is apparently a Very Bad Idea (tm).

Use Lamport timestamps or vector clocks for detecting race conditions.

Find some equivalent of CLOCK_MONOTONIC for Node.lastHeardFrom.

Fix the Cluster.sendToIP timers to not use real time.

Source: http://www.reddit.com/r/programming/comments/11sgc7/pastry_a_distributed_hash_table_in_go/c6pagbn

paddycarver · 2012-12-16T11:44:14Z

Looks like the best bet for resolving conflicts during routing table updates is http://labix.org/govclock. It needs to be brought up to date to work with Go 1, and I need to find a way to send that patch to Gustavo to get it merged. The update seems simple enough (I have a quick and dirty version working now), but there are some unexplained discrepancies, mainly an unknown Bug(string, interface{}) method in the tests, that I'd like to look into before I'm confident it works as expected. I also need to decide what to version in the vector clock. I'm tempted to say the state table changes for the node sending the state tables, but that could use a bit more thought.
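To make the conflict-detection idea concrete, here is a rough sketch of how two vector clocks can be compared. It deliberately does not use the govclock API. `VClock`, `Order`, and `Compare` are hypothetical names, and keying the clock by node ID is an assumption made only for this example.

```go
package main

import "fmt"

// VClock maps a node ID to the number of state table updates seen from that
// node. This is an illustrative type, not govclock's.
type VClock map[string]uint64

// Order describes how two vector clocks relate to each other.
type Order int

const (
	Equal Order = iota
	Before
	After
	Concurrent
)

// Compare reports whether a happened before b, after b, is identical to b,
// or is concurrent with b (neither dominates, so the updates conflict).
func Compare(a, b VClock) Order {
	aLess, bLess := false, false
	for id, av := range a {
		bv := b[id] // a missing entry counts as zero
		if av < bv {
			aLess = true
		} else if av > bv {
			bLess = true
		}
	}
	for id, bv := range b {
		if _, ok := a[id]; !ok && bv > 0 {
			aLess = true
		}
	}
	switch {
	case aLess && bLess:
		return Concurrent
	case aLess:
		return Before
	case bLess:
		return After
	}
	return Equal
}

func main() {
	a := VClock{"n1": 2, "n2": 1}
	b := VClock{"n1": 1, "n2": 2}
	fmt.Println(Compare(a, b) == Concurrent) // prints "true"
}
```

A `Concurrent` result is the case that needs explicit resolution: neither update has seen the other, so no wall-clock comparison is required to detect the race.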

Because lastHeardFrom doesn't actually impact the performance of Wendy (it's provided as a helper method for applications built on Wendy, etc.), I'm not sure the pain to implement a monotonic clock is worth the dubious gains. The method mainly exists for debugging purposes: "gee, I haven't heard from Node X in two days, that's probably why I'm not getting messages from it--it fell off the face of the earth." Worth fixing eventually, perhaps, but not a pressing concern.
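As a point of reference for anyone revisiting this later: from Go 1.9 onward, `time.Now()` carries a monotonic reading, and `time.Since` uses it when it is present, so elapsed-time checks are not affected by wall-clock jumps. The sketch below assumes such a Go version. The `peer` type and its methods are hypothetical stand-ins, not Wendy's actual Node type.

```go
package main

import (
	"fmt"
	"time"
)

// peer is a hypothetical stand-in for a node record; Wendy's Node may differ.
type peer struct {
	lastHeardFrom time.Time
}

// touch records that a message was just received from this peer.
func (p *peer) touch() {
	p.lastHeardFrom = time.Now()
}

// silentFor returns how long it has been since the peer was last heard from.
// On Go 1.9+ this subtraction uses the monotonic reading stored by time.Now.
func (p *peer) silentFor() time.Duration {
	return time.Since(p.lastHeardFrom)
}

func main() {
	p := &peer{}
	p.touch()
	time.Sleep(10 * time.Millisecond)
	fmt.Println("silent for", p.silentFor())
}
```

The monotonic reading only lives in the in-process `time.Time` value. It is dropped when the value is serialized, so this approach covers "how long since I heard from Node X" within one process, not timestamps exchanged between nodes.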

Likewise, I'm not convinced the timers need to stop using real time. Using it is standard Go practice, and I think the cases in which it causes problems are fairly limited. The only one I can think of is a clock jump occurring while a message is in transit, and the worst case there is that either 1) you get an error (not good, but the algorithm should be built to be error-resilient) or 2) a request takes longer than its timeout should allow, which probably isn't the end of the world. Again, definitely a bug, but I'm not sure it's a high enough priority to focus on now.

The vector clocks, however, should be implemented.

paddycarver · 2013-05-16T08:28:21Z

This has been resolved as of the beta1 release. While timeout detection still depends on the local clock, that's the standard Go timeout practice, so I'm going to stick with it. That is the only place the system clock is used now.

paddycarver mentioned this issue Jan 18, 2013

"Detected race condition" #13

Closed

ghost assigned paddycarver Jan 18, 2013

paddycarver closed this as completed May 16, 2013
