QOLDEV-863 Fix solr HA #454
Changes from all commits: 44ad9cb, 1530584, 3350b87, a7dc383
@@ -7,7 +7,7 @@ set -x
 BACKUP_NAME="$CORE_NAME-$(date +'%Y-%m-%dT%H:%M')"
 SNAPSHOT_NAME="snapshot.$BACKUP_NAME"
 LOCAL_SNAPSHOT="$LOCAL_DIR/$SNAPSHOT_NAME"
-SYNC_SNAPSHOT="$SYNC_DIR/$SNAPSHOT_NAME"
+SYNC_SNAPSHOT="$SYNC_DIR/${SNAPSHOT_NAME}.tgz"
 MINUTE=$(date +%M)

 function set_dns_primary () {
@@ -52,18 +52,18 @@ function export_snapshot () {
   if [ "$REPLICATION_STATUS" != "0" ]; then
     return $REPLICATION_STATUS
   fi
-  sudo -u solr sh -c "$LUCENE_CHECK $LOCAL_SNAPSHOT && rsync -a --delete $LOCAL_SNAPSHOT/ $SYNC_SNAPSHOT/" || return 1
+  sh -c "$LUCENE_CHECK $LOCAL_SNAPSHOT && sudo -u solr tar --force-local --exclude=write.lock -czf $SYNC_SNAPSHOT -C $LOCAL_SNAPSHOT ." || return 1
 }

 function import_snapshot () {
   # Give the master time to update the sync copy
   for i in $(eval echo "{1..40}"); do
-    if [ -f "$SYNC_SNAPSHOT/write.lock" ]; then
-      sudo -u solr rm -r $LOCAL_DIR/snapshot.$CORE_NAME-*
-      sudo -u solr rsync -a --delete "$SYNC_SNAPSHOT/" "$LOCAL_SNAPSHOT/" || exit 1
-      rm $LOCAL_SNAPSHOT/write.lock
-      curl "$HOST/$CORE_NAME/replication?command=restore&location=$LOCAL_DIR&name=$BACKUP_NAME"
-      return 1
+    if [ -f "$SYNC_SNAPSHOT" ]; then
+      sudo service solr stop
+      sudo -u solr mkdir $LOCAL_DIR/index
+      rm $LOCAL_DIR/index/* && sudo -u solr tar -xzf "$SYNC_SNAPSHOT" -C $LOCAL_DIR/index || exit 1
+      sudo systemctl start solr
+      return 0
     else
       sleep 5
     fi

Review comment on `sudo -u solr mkdir $LOCAL_DIR/index`: should we have a -p for safety?
Reply: Nah, line 80 already does that.
@@ -100,9 +100,7 @@ if (/usr/local/bin/pick-solr-master.sh); then

   # Hourly backup to S3
   if [ "$MINUTE" = "00" ]; then
-    cd "$LOCAL_DIR"
-    tar --force-local -czf "$SNAPSHOT_NAME.tgz" "$SNAPSHOT_NAME"
-    aws s3 mv "$SNAPSHOT_NAME.tgz" "s3://$BUCKET/solr_backup/$CORE_NAME/" --expires $(date -d '30 days' --iso-8601=seconds)
+    aws s3 cp "$SYNC_SNAPSHOT" "s3://$BUCKET/solr_backup/$CORE_NAME/" --expires $(date -d '30 days' --iso-8601=seconds)
   fi
 else
   # make traffic come to this instance only as a backup option
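Note: $LUCENE_CHECK is defined earlier in the script and is not shown in these hunks. A minimal sketch of what such a check typically looks like, using Lucene's CheckIndex tool; the jar path and the variable definition below are assumptions, not the script's actual values:

    # Assumed definition of LUCENE_CHECK; the lucene-core jar path varies by Solr version.
    LUCENE_CHECK="java -cp /opt/solr/server/solr-webapp/webapp/WEB-INF/lib/lucene-core-*.jar org.apache.lucene.index.CheckIndex"
    # CheckIndex exits non-zero on a corrupt index, so the snapshot is only
    # tarred and exported when the check passes.
    $LUCENE_CHECK "$LOCAL_SNAPSHOT"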
@@ -267,10 +267,15 @@
   action [:stop]
 end
 bash "Copy latest index from EFS" do
+  user account_name
   code <<-EOS
     rsync -a --delete #{efs_data_dir}/ #{real_data_dir}/
-    LATEST_INDEX=`ls -dtr #{efs_data_dir}/data/#{core_name}/data/snapshot.* |tail -1`
-    rsync $LATEST_INDEX/ #{real_data_dir}/data/#{core_name}/data/index/
+    CORE_DATA="#{real_data_dir}/data/#{core_name}/data"
+    LATEST_INDEX=`ls -dtr $CORE_DATA/snapshot.* |tail -1`
+    if (echo "$LATEST_INDEX" |grep "[.]tgz$" >/dev/null 2>&1); then
+      mkdir -p "$CORE_DATA/index"
+      rm -f $CORE_DATA/index/*; tar -xzf "$LATEST_INDEX" -C $CORE_DATA/index
+    fi
   EOS
   only_if { ::File.directory? efs_data_dir }
 end

Review comment on `CORE_DATA="#{real_data_dir}/data/#{core_name}/data"`: how many snapshots do we keep on the efs?
Reply: The sync script, on export, removes all snapshots except the current one. We can probably just drop EFS and use S3 without too much trouble. I didn't do it here because it wasn't needed, but it should be fairly straightforward. We don't use EFS for anything that demands high I/O performance; it's just putting timestamps in heartbeat files, and passing snapshots in the background.
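On the point above about dropping EFS in favour of S3: a minimal sketch of what the provisioning copy might look like if new instances pulled the newest snapshot tarball straight from the bucket used for the hourly backup. The paths and variable names here are illustrative assumptions, not the recipe's real attributes:

    # Hypothetical S3-based bootstrap: fetch the most recent snapshot tarball
    # from s3://$BUCKET/solr_backup/$CORE_NAME/ and unpack it into the index dir.
    CORE_DATA="/var/solr/data/$CORE_NAME/data"   # assumed core layout
    LATEST_KEY=$(aws s3 ls "s3://$BUCKET/solr_backup/$CORE_NAME/" | sort | tail -1 | awk '{print $4}')
    if [ -n "$LATEST_KEY" ]; then
      aws s3 cp "s3://$BUCKET/solr_backup/$CORE_NAME/$LATEST_KEY" /tmp/solr-snapshot.tgz
      mkdir -p "$CORE_DATA/index"
      rm -f "$CORE_DATA"/index/*
      tar -xzf /tmp/solr-snapshot.tgz -C "$CORE_DATA/index"
    fi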
Review comment: I've forgotten why we did not go with snapshot over backup. I know that backup is a full instead of a partial, but it is also more disk/resource intensive.
Are we still running this every 2 min, or did we slow it down to every 10?

Reply: I can't see any distinction in the docs between snapshot and backup; the commands are just 'backup' and 'restore'.
It runs every 5 minutes.
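For context, the ReplicationHandler calls behind this look roughly as follows, reusing the $HOST, $CORE_NAME, $BACKUP_NAME and $LOCAL_DIR variables from the sync script; the details call is an assumption about where REPLICATION_STATUS comes from:

    # Trigger a named backup; Solr writes it as snapshot.<name> under <location>.
    curl "$HOST/$CORE_NAME/replication?command=backup&name=$BACKUP_NAME&location=$LOCAL_DIR"

    # Report replication/backup status, which the script presumably parses
    # before deciding the snapshot is complete.
    curl "$HOST/$CORE_NAME/replication?command=details"

    # Restore a named backup from <location> (what the pre-tgz import_snapshot did).
    curl "$HOST/$CORE_NAME/replication?command=restore&location=$LOCAL_DIR&name=$BACKUP_NAME"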