
Add support for Consul 0.5.0 and Atlas auto-join #135

Merged
merged 1 commit into from
Jun 16, 2015

Conversation

shanesveller
Contributor

Commit adfa120 can probably be ignored from mainline. I took a glance at rebasing this off of the develop branch,
but it seems like it's in the middle of a sizeable rewrite.

Please keep or discard as much as you, the maintainers, find useful, in light of #126 etc.

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling adfa120 on shanesveller:consul-0.5.0 into db93cb6 on johnbellone:master.

atlas_cluster: <%= ENV.fetch('ATLAS_CLUSTER', 'example/cluster') %>
atlas_token: <%= ENV.fetch('ATLAS_TOKEN', 'NOT_REAL') %>
excludes:
- centos-7.0


@donaldguy

Ideally this should support auto-bootstrapping the cluster with bootstrap_expect, but presently the logic in _service.rb countermands that.

I have prototyped that here: https://github.com/donaldguy/consul-cookbook/commit/7cf273aaffc220af6dc42cd74ddb76baba168015

It worked for me: starting from a set of servers that were up and associated with Atlas but unclustered, I brought them all down, deleted the contents of the data dir (I probably could have just deleted /raft and/or /serf), and brought them up one at a time.
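For context, the core of the prototype is roughly this (method and attribute names here are illustrative, not the cookbook's actual internals):

```ruby
# Rough sketch of the idea in the linked commit: pass bootstrap_expect
# through to the generated Consul JSON whenever the node is a server,
# so Atlas auto-join can actually elect a leader.
def build_service_config(attrs)
  config = { 'server' => true }
  # Consul treats the presence of bootstrap_expect as a request to
  # auto-bootstrap once that many servers have joined.
  expect = attrs['bootstrap_expect']
  config['bootstrap_expect'] = Integer(expect) if expect
  config
end
```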

@johnbellone
Contributor

@shanesveller Please rebase!

@johnbellone johnbellone added this to the 0.9.0 release milestone Mar 14, 2015
@shanesveller
Contributor Author

@johnbellone I've rebased my branch, but it looks like CentOS Kitchen tests are currently failing - maybe on master too. Will investigate more tomorrow.

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling 39dd110 on shanesveller:consul-0.5.0 into 0a8b2f2 on johnbellone:master.

@donaldguy

Also, please make sure something with bootstrap_expect support lands. I also really don't think you want the retry_join or start_join server lists if you are working with atlas_join.

I have all that on the branch(es) here: https://github.com/donaldguy/consul-cookbook/commit/7cf273aaffc220af6dc42cd74ddb76baba168015

I haven't tried the full test suite, but at least on Ubuntu 14.04 it's working great in my production env.

@shanesveller
Contributor Author

@johnbellone Rebased one more time - CentOS tests fail on master too, and my company doesn't use it so I'm afraid I can't really invest any time digging further.

@donaldguy I'm not clear on what your concerns are; can you elaborate here in the comments? My company tested this code as-is with an EC2 auto-scaling group created from Packer-built AMIs and had no trouble scaling the cluster from 0 to 3 and up and down from there. The Atlas guide and Configure documentation don't seem to mention retry_join/start_join in conjunction with Atlas at all.

@coveralls

Coverage Status

Coverage remained the same at 100.0% when pulling 0ef6ae9 on shanesveller:consul-0.5.0 into fca59ae on johnbellone:master.

@donaldguy

So upon further review, this is a case of user error, but I suspect I won't be the only one.

The TL;DR is that I didn't use the service_mode value cluster because I wasn't clear I was supposed to: I didn't see the Getting Started section all the way at the bottom of the README (I saw, and didn't want to use, the chef-provisioning script under Usage, and I didn't think to look for more setup instructions after all the LWRP examples).

Maybe the fix is just putting it higher in the README or linking to it under Usage. It may also be worth making the cluster and server values synonymous, because I am unconvinced server in its current incarnation is ever useful.

(EDIT: I guess if you are still using the legacy manual bootstrap then you'd still want server without bootstrap_expect ... but Hashicorp no longer recommends this, and as it stands the recipe will silently drop a manually set bootstrap_expect when service_mode == 'server'; it should probably issue a warning at least. I opened a new issue to discuss this: #149)


While I noticed cluster as a value checked for service_mode in _service.rb, because there was nothing in the Consul docs about a cluster mode, I assumed it was just for the chef-provisioning recipe and/or something legacy. I understood the one-only semantics of bootstrap, so server struck me as the correct value for that attribute.

But what happens with this value and Atlas is confusing. When I first spun up, consul members showed my three servers and things looked good, but curl localhost:8500/v1/status/leader returned "" and the Atlas Consul UI would not update. (In an email exchange with Hashicorp folks, they said they'll probably add a "no leader" warning to explain that situation, because I was just seeing "Your infrastructure is healthy; 0 nodes".)

So the problems were/are three fold:

  1. I wasn't initially aware of the necessary bootstrap_expect parameter; this is on me.
  2. Using a service_mode of server populates neither the default nor a custom bootstrap_expect parameter through to the underlying config, so no auto-clustering ever occurs. I am not sure why this would ever be desirable.
  3. There is, however, still a start_join or retry_join block written into the config. Since I didn't use the chef-provisioning script or manually populate servers, this is blank. At minimum this seems aesthetically redundant when using the Atlas integration, and at worst a blank or outdated value here could prevent auto-clustering from working in the event of a leader failure. I'd need to do more testing to determine which, conclusively.

So @shanesveller, you can probably safely ignore my branch, but I will still suggest that, since Atlas seems to make the retry_join/start_join configs unnecessary, perhaps lines of the form

service_config[join_mode] = node['consul']['servers']

should be

service_config[join_mode] = node['consul']['servers'] unless node.consul.atlas_autojoin
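A minimal sketch of that guard, with hypothetical names (this is the shape of the change, not the cookbook's exact code):

```ruby
# Hypothetical sketch of the suggested guard: skip writing the
# retry_join/start_join key entirely when Atlas auto-join is enabled,
# so the config never carries an empty or stale server list.
def apply_join_config(service_config, join_mode, servers, atlas_autojoin)
  service_config[join_mode] = servers unless atlas_autojoin
  service_config
end
```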

@jjhuff

jjhuff commented Apr 16, 2015

Hey, right now, atlas_cluster is required, but consul doesn't seem to require it.

It seems that auto-join and the SCADA/infrastructure stuff are two separate things (both require atlas_token). It'd be nice to be able to enable auto_join without SCADA.
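If I'm reading the 0.5.0 agent options right, the config file keys involved are roughly these (values are placeholders), and auto-join is a separate toggle from the infrastructure/SCADA association even though both ride on the same token:

```json
{
  "atlas_infrastructure": "example/cluster",
  "atlas_token": "NOT_REAL",
  "atlas_join": true
}
```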

@johnbellone johnbellone removed this from the 0.9.0 release milestone May 1, 2015
@johnbellone
Contributor

#126 is about to land on master. Are these attributes exposed via configuration?

johnbellone added a commit that referenced this pull request Jun 16, 2015
Add support for Consul 0.5.0 and Atlas auto-join
@johnbellone johnbellone merged commit 10d582a into sous-chefs:master Jun 16, 2015
@lock

lock bot commented Apr 25, 2020

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Apr 25, 2020