Added a spec regarding the rules for eviction & replacement of pods #133
Conversation
At least some discussion has to happen and a few typos need to be corrected.
## Replacement

Replacement is the process of replacing a pod an another pod that takes over the responsibilities
"pod an another" -> "pod by another"
### Image ID Pods

The Image ID pods are starter to fetch the ArangoDB version of a specific
starter -> started
- Image ID pods can always be restarted on a different node.
  There is no need to replace an image ID pod.
- `node.kubernetes.io/unreachable:NoExecute` toleration time is set very low (5sec)
- `node.kubernetes.io/not-ready:NoExecute` toleration time is set very low (5sec)
Add (if true): There is no danger at all if two image ID pods happen to run at the same time.
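For reference, here is a minimal sketch of how NoExecute tolerations with a bounded timeout like the ones listed above could be built with the Kubernetes core/v1 Go API. The helper name is hypothetical and this is not the operator's actual code; it only illustrates the `tolerationSeconds` mechanism the spec relies on.

```go
package tolerations

import (
	corev1 "k8s.io/api/core/v1"
)

// noExecuteTolerations is a hypothetical helper that builds the two
// NoExecute tolerations discussed in the spec (unreachable / not-ready),
// each limited to the given number of seconds before eviction kicks in.
func noExecuteTolerations(seconds int64) []corev1.Toleration {
	keys := []string{
		"node.kubernetes.io/unreachable",
		"node.kubernetes.io/not-ready",
	}
	tolerations := make([]corev1.Toleration, 0, len(keys))
	for _, key := range keys {
		s := seconds // copy so each toleration gets its own pointer
		tolerations = append(tolerations, corev1.Toleration{
			Key:               key,
			Operator:          corev1.TolerationOpExists,
			Effect:            corev1.TaintEffectNoExecute,
			TolerationSeconds: &s,
		})
	}
	return tolerations
}

// An image ID pod would then get the very low timeout from the spec:
// pod.Spec.Tolerations = noExecuteTolerations(5)
```

With `Operator: Exists` the toleration matches the taint regardless of its value, which is the usual pattern for these node-lifecycle taints.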
- Coordinator pods can always be evicted from any node
- Coordinator pods can always be replaced with another coordinator pod with a different ID on a different node
- `node.kubernetes.io/unreachable:NoExecute` toleration time is set low (15sec)
- `node.kubernetes.io/not-ready:NoExecute` toleration time is set low (15sec)
Add? "There is no danger at all if two coordinator pods with different ID run concurrently.
Done (a bit different)
### DBServer Pods

DBServer pods run an ArangoDB dbserver as part of an ArangoDB cluster.
It has persistent state potentially tight to the node it runs on and it has a unique ID.
"tight" -> "tied"
### Single Server Pods

Single server pods run an ArangoDB server as part of an ArangoDB single server deployment.
It has persistent state potentially tight to the node.
"tight" -> "tied"
### Single Pods in Active Failover Deployment

Single pods run an ArangoDB single server as part of an ArangoDB active failover deployment.
It has persistent state potentially tight to the node it runs on and it has a unique ID.
"tight" -> "tied"
- It is a follower of an active-failover deployment (Q: can we trigger this failover to another server?)
- Single pods can always be replaced with another single pod with a different ID on a different node.
- `node.kubernetes.io/unreachable:NoExecute` toleration time is set high to "wait it out a while" (5min)
- `node.kubernetes.io/not-ready:NoExecute` toleration time is set high to "wait it out a while" (5min)
Need to check this, do not know by heart.
- SyncMaster pods can always be evicted from any node
- SyncMaster pods can always be replaced with another syncmaster pod on a different node
- `node.kubernetes.io/unreachable:NoExecute` toleration time is set low (15sec)
- `node.kubernetes.io/not-ready:NoExecute` toleration time is set low (15sec)
Is there any requirement about the same network endpoint or an internal k8s service being set up in case of a replacement?
no
- SyncWorker pods can always be evicted from any node
- SyncWorker pods can always be replaced with another syncworker pod on a different node
- `node.kubernetes.io/unreachable:NoExecute` toleration time is set a bit higher to try to avoid resynchronization (1min)
- `node.kubernetes.io/not-ready:NoExecute` toleration time is set a bit higher to try to avoid resynchronization (1min)
Same here about network endpoint.
no
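Taken together, the toleration timeouts quoted in the excerpts above could be summarized per member role roughly as follows. This is only an illustrative sketch: the role names are made up for this summary, and the DBServer value is not shown in the quoted diff.

```go
// Seconds both NoExecute tolerations (unreachable / not-ready) are kept
// before Kubernetes evicts the pod, per member role, as stated in the
// spec excerpts above. Role names here are illustrative only.
var noExecuteTolerationSeconds = map[string]int64{
	"imageid":     5,   // very low: image ID pods can simply be restarted elsewhere
	"coordinator": 15,  // low: coordinators are easy to replace with a new ID
	"single":      300, // high: "wait it out a while" for single / active-failover servers
	"syncmaster":  15,  // low: syncmasters can always be replaced
	"syncworker":  60,  // a bit higher, to try to avoid resynchronization
}
```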