Skip to content
This repository has been archived by the owner on May 16, 2023. It is now read-only.

[elasticsearch] add master pre stop hook using voting_config_exclusion #108

Closed
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 42 additions & 0 deletions elasticsearch/templates/statefulset.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -191,6 +191,48 @@ spec:
- name: node.{{ $role }}
value: "{{ $enabled }}"
{{- end }}
{{- if and (eq .Values.roles.master "true") (ge (int .Values.esMajorVersion) 7) }}
lifecycle:
preStop:
exec:
command:
- sh
- -c
- |
#!/usr/bin/env bash -e
# Exclude node from voting config to prevent https://github.com/elastic/helm-charts/issues/63
# reference: https://www.elastic.co/guide/en/elasticsearch/reference/7.0/modules-discovery-adding-removing-nodes.html#modules-discovery-removing-nodes
http () {
local path="${1}"
local method="${2:-GET}"
if [ -n "${ELASTIC_USERNAME}" ] && [ -n "${ELASTIC_PASSWORD}" ]; then
BASIC_AUTH="-u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}"
else
BASIC_AUTH=''
fi
curl -X${method} -s -k --fail ${BASIC_AUTH} {{ .Values.protocol }}://127.0.0.1:{{ .Values.httpPort }}${path}
}

NODE_NAME=$(awk 'BEGIN {print ENVIRON["node.name"]}')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was originally confused about why awk was needed here. But it looks like bash can't access environment variables with dots in the name.

echo "Exclude the node ${NODE_NAME} from voting config"

http "/_cluster/voting_config_exclusions/${NODE_NAME}" "POST"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be inside the while true loop. Otherwise the set -e is going to cause the preStop script to exit here if this curl request fails.


while true ; do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being an infinite loop seems to be ok here because the termination of pods waits for the graceful timeout for the preStop condition to finish: https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.

echo -e "Wait for new master node to be elected"
if [[ "$(http "/_cat/master")" != *"${NODE_NAME}"* ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if statement can be improved to be safer. Right now this is if statement will be matched if the curl command fails in anyway.

[[ "" != "elasticsearch-master-0" ]] ; echo $?
0

A safer option would be to only break if elasticsearch-master is returned and not ${node.name}. Otherwise this will produce a lot of false positives.

break
fi
sleep 1
done

# Node won't be actually removed from cluster,
# so node should be deleted from voting config exclusions
echo "Node ${NODE_NAME} is now being deleted from voting config exclusions"
http "/_cluster/voting_config_exclusions?wait_for_removal=false" "DELETE"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also be inside the while true loop. If this fails then then this node won't be able to rejoin the cluster after starting up again.


echo "Node ${NODE_NAME} is ready to shutdown"
{{- end }}
{{- if .Values.extraEnvs }}
{{ toYaml .Values.extraEnvs | indent 10 }}
{{- end }}
Expand Down