QOLDEV-863 Fix solr HA #454
Conversation
- Export to EFS as an archive, not an exploded directory
- Import by stopping Solr and wholesale replacing the index, not via the replication restore endpoint
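The import side described above amounts to: stop Solr, blow away the index, and unpack the archive in its place. A minimal sketch of that step, with illustrative paths rather than the script's actual variables:

```shell
# Hedged sketch of the "wholesale replace" import step. The archive path
# and index directory are placeholders, not the repo's real values, and
# service stop/start and sudo -u solr are omitted for brevity.
import_snapshot() {
  local archive="$1" index_dir="$2"
  rm -rf "$index_dir"            # discard the old (possibly corrupt) index
  mkdir -p "$index_dir"          # recreate an empty index directory
  tar -xzf "$archive" -C "$index_dir"   # unpack the snapshot archive
}
```

In the real script this runs between `service solr stop` and `service solr start`, so Solr never sees a half-replaced index.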
return 1
if [ -f "$SYNC_SNAPSHOT" ]; then
sudo service solr stop
sudo -u solr mkdir $LOCAL_DIR/index
should we have a -p for safety?
Nah, line 80 already does that.
code <<-EOS
rsync -a --delete #{efs_data_dir}/ #{real_data_dir}/
LATEST_INDEX=`ls -dtr #{efs_data_dir}/data/#{core_name}/data/snapshot.* |tail -1`
rsync $LATEST_INDEX/ #{real_data_dir}/data/#{core_name}/data/index/
CORE_DATA="#{real_data_dir}/data/#{core_name}/data"
How many snapshots do we keep on the EFS?
Also, could we move away from keeping the full file on EFS and use an S3 pointer file instead, to reduce EFS costs?
The sync script, on export, removes all snapshots except the current one (solr-sync.sh line 90).
We can probably just drop EFS and use S3 without too much trouble. I didn't do it here because it wasn't needed, but it should be fairly straightforward. We don't use EFS for anything that demands high I/O performance; it's just putting timestamps in heartbeat files, and passing snapshots in the background.
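For context, the keep-only-the-latest cleanup described above can be sketched like this (directory layout and naming are illustrative; the actual logic lives in solr-sync.sh):

```shell
# Hedged sketch: list snapshot directories oldest-first by mtime and
# delete everything except the newest one. Assumes GNU coreutils
# (head -n -1) and a snapshot.* naming convention as in the diff above.
prune_snapshots() {
  local snap_dir="$1"
  ls -dtr "$snap_dir"/snapshot.* 2>/dev/null \
    | head -n -1 \
    | xargs -r rm -rf   # -r: do nothing if there is only one snapshot
}
```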
Seems like a positive step forward; just wondering why the Solr API for backup/restore did not work (was it due to the index being corrupted and not booting?).
https://solr.apache.org/guide/6_6/making-and-restoring-backups.html
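For reference, the replication-handler endpoints that page documents are plain HTTP commands. A sketch of how they are typically invoked (host, port, and core name are placeholders):

```shell
# Build a Solr replication-handler URL for a given core and command
# ('backup' or 'restore'). localhost:8983 and the core name are
# assumptions for illustration only.
replication_url() {
  local core="$1" command="$2"
  printf 'http://localhost:8983/solr/%s/replication?command=%s' \
    "$core" "$command"
}

# Typical usage (extra parameters per the Solr 6.6 docs):
#   curl "$(replication_url mycore backup)&location=/mnt/efs/backups&numberToKeep=1"
#   curl "$(replication_url mycore restore)&location=/mnt/efs/backups"
```

The PR's point is that the restore command assumes a running, healthy Solr, whereas the stop-and-replace approach works even when the live index is corrupt.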
@@ -52,18 +52,18 @@ function export_snapshot () {
if [ "$REPLICATION_STATUS" != "0" ]; then
return $REPLICATION_STATUS
fi
sudo -u solr sh -c "$LUCENE_CHECK $LOCAL_SNAPSHOT && rsync -a --delete $LOCAL_SNAPSHOT/ $SYNC_SNAPSHOT/" || return 1
sh -c "$LUCENE_CHECK $LOCAL_SNAPSHOT && sudo -u solr tar --force-local --exclude=write.lock -czf $SYNC_SNAPSHOT -C $LOCAL_SNAPSHOT ." || return 1
I've forgotten why we did not go with snapshot over backup. I know that backup is a full copy instead of a partial one, but it is also more disk/resource intensive.
Are we still running this every 2 minutes, or did we slow it down to every 10?
I can't see any distinction in the docs between snapshot and backup. The commands are just 'backup' and 'restore'.
It runs every 5 minutes.
TODO Use S3 to pass backups, rather than EFS.
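If that TODO is picked up, the EFS copy in the export/import scripts would become an `aws s3 cp` in each direction. A sketch under assumed names (the bucket, key layout, and helper below are all hypothetical):

```shell
# Hypothetical S3 key layout for passing snapshot archives instead of EFS.
# Bucket name and prefix are invented for illustration; the aws CLI is
# assumed to be available on the Solr hosts.
snapshot_s3_uri() {
  local bucket="$1" core="$2"
  printf 's3://%s/solr-snapshots/%s/snapshot.tar.gz' "$bucket" "$core"
}

# Export side:  aws s3 cp "$SYNC_SNAPSHOT" "$(snapshot_s3_uri my-bucket mycore)"
# Import side:  aws s3 cp "$(snapshot_s3_uri my-bucket mycore)" "$SYNC_SNAPSHOT"
```

As noted earlier in the thread, nothing on EFS demands high I/O performance, so swapping the transport should be low-risk.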