QOLDEV-863 Fix solr HA #454

ThrawnCA · 2024-08-07T02:21:08Z

Import backups into secondary server(s) by stopping the server and wholesale replacing the index, rather than trying to dynamically import.
Export backups as archives rather than exploded directories.

TODO Use S3 to pass backups, rather than EFS.

- Export to EFS as an archive, not an exploded directory
- Import by stopping Solr and wholesale replacing the index, not via replication restore endpoint
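As a rough sketch of the approach described above (not the actual contents of solr-sync.sh), the export and import steps might look like the following. The function names `export_archive` and `import_archive` are hypothetical, and the real script additionally stops Solr before the replacement and starts it again afterwards.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, paths are supplied by the caller.

# Pack a snapshot directory into one gzipped archive, skipping the Lucene lock file.
export_archive () {
  snapshot_dir="$1"
  archive="$2"
  tar --exclude=write.lock -czf "$archive" -C "$snapshot_dir" .
}

# Replace an index directory wholesale with the contents of an archive.
# Assumes Solr is already stopped, so nothing holds the index open.
import_archive () {
  archive="$1"
  index_dir="$2"
  [ -f "$archive" ] || return 1      # nothing to import
  [ -n "$index_dir" ] || return 1    # refuse to operate on an empty path
  rm -rf "$index_dir"
  mkdir -p "$index_dir"
  tar -xzf "$archive" -C "$index_dir"
}
```

Passing a single archive file means a reader on the secondary either sees a complete file or none at all, provided the writer moves it into place atomically. A directory that is partway through an rsync gives no such guarantee.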

duttonw · 2024-08-07T03:05:30Z

files/default/solr-sync.sh

-      return 1
+    if [ -f "$SYNC_SNAPSHOT" ]; then
+      sudo service solr stop
+      sudo -u solr mkdir $LOCAL_DIR/index


Should we have a -p for safety?

Nah, line 80 already does that.
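For readers unfamiliar with the flag in question, here is a minimal illustration of what `-p` changes. It runs against a throwaway temporary directory, not the paths used by solr-sync.sh.

```shell
#!/bin/sh
# Throwaway directory so the demonstration touches nothing real.
base="$(mktemp -d)"

mkdir "$base/index"                      # first creation succeeds

if mkdir "$base/index" 2>/dev/null; then
  plain_result="ok"
else
  plain_result="failed"                  # plain mkdir refuses an existing directory
fi

if mkdir -p "$base/index"; then
  p_result="ok"                          # -p treats an existing directory as success
else
  p_result="failed"
fi

echo "plain=$plain_result p=$p_result"
```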

duttonw · 2024-08-07T03:07:11Z

recipes/solr-deploy.rb

    code <<-EOS
        rsync -a --delete #{efs_data_dir}/ #{real_data_dir}/
-        LATEST_INDEX=`ls -dtr #{efs_data_dir}/data/#{core_name}/data/snapshot.* |tail -1`
-        rsync $LATEST_INDEX/ #{real_data_dir}/data/#{core_name}/data/index/
+        CORE_DATA="#{real_data_dir}/data/#{core_name}/data"


How many snapshots do we keep on EFS?
Could we also stop storing the full file on EFS and use an S3 pointer file instead, to reduce EFS costs?

The sync script, on export, removes all snapshots except the current one (solr-sync.sh line 90).
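The pruning behaviour mentioned here could be sketched roughly as follows. This is an illustration under assumed names (`prune_snapshots` and its arguments are hypothetical), not a copy of line 90 of solr-sync.sh.

```shell
#!/bin/sh
# Remove every snapshot.* entry in a directory except the one named as current.
# Sketch only: the function name and arguments are hypothetical.
prune_snapshots () {
  dir="$1"
  keep="$2"
  for entry in "$dir"/snapshot.*; do
    [ -e "$entry" ] || continue              # glob matched nothing
    [ "$entry" = "$dir/$keep" ] && continue  # leave the current snapshot alone
    rm -rf "$entry"
  done
}
```

With this shape, the number of snapshots held on shared storage stays at one after each successful export.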

We can probably just drop EFS and use S3 without too much trouble. I didn't do it here because it wasn't needed, but it should be fairly straightforward. We don't use EFS for anything that demands high I/O performance; it's just putting timestamps in heartbeat files, and passing snapshots in the background.
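A hedged sketch of what passing the archive through S3 might look like, using the AWS CLI's `aws s3 cp`. Nothing like this exists in the PR. The function names and the bucket prefix in the usage comment are hypothetical, and credentials and bucket permissions are assumed to be set up already.

```shell
#!/bin/sh
# Sketch only: function names and the example bucket prefix are hypothetical.

# Upload a snapshot archive under an S3 prefix, keeping its file name.
push_snapshot () {
  archive="$1"
  s3_prefix="$2"
  aws s3 cp "$archive" "$s3_prefix/$(basename "$archive")"
}

# Download a snapshot archive from S3 to a local path.
pull_snapshot () {
  s3_uri="$1"
  dest="$2"
  aws s3 cp "$s3_uri" "$dest"
}

# Example usage (not run here):
#   push_snapshot /tmp/snapshot.tgz s3://example-bucket/solr
#   pull_snapshot s3://example-bucket/solr/snapshot.tgz /tmp/snapshot.tgz
```

The heartbeat timestamps mentioned above would need a separate mechanism, which this sketch does not cover.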

duttonw

Seems like a positive step forward. Just wondering why the Solr API for backup/restore did not work. Was it due to the index being corrupted and Solr not booting?

https://solr.apache.org/guide/6_6/making-and-restoring-backups.html
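For context, the linked guide describes backup and restore as commands on each core's replication handler, invoked over HTTP. The sketch below only builds the request URLs. The core name `mycore`, the backup name `nightly`, and the default host and port are placeholder assumptions, and parameter values are not URL-encoded.

```shell
#!/bin/sh
# Base URL of the Solr instance; localhost:8983 is an assumed default.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Build a replication handler URL for a core and command,
# appending any extra key=value parameters as given (no URL encoding).
replication_url () {
  core="$1"
  cmd="$2"
  shift 2
  url="$SOLR_URL/$core/replication?command=$cmd"
  for param in "$@"; do
    url="$url&$param"
  done
  echo "$url"
}

# Example usage (not run here):
#   curl -s "$(replication_url mycore backup name=nightly)"
#   curl -s "$(replication_url mycore restore name=nightly)"
#   curl -s "$(replication_url mycore restorestatus)"
```

Restore through this endpoint runs inside a live Solr process. The PR avoids that by stopping the service and replacing the index on disk.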

duttonw · 2024-08-07T03:10:40Z

files/default/solr-sync.sh

@@ -52,18 +52,18 @@ function export_snapshot () {
  if [ "$REPLICATION_STATUS" != "0" ]; then
    return $REPLICATION_STATUS
  fi
-  sudo -u solr sh -c "$LUCENE_CHECK $LOCAL_SNAPSHOT && rsync -a --delete $LOCAL_SNAPSHOT/ $SYNC_SNAPSHOT/" || return 1
+  sh -c "$LUCENE_CHECK $LOCAL_SNAPSHOT && sudo -u solr tar --force-local --exclude=write.lock -czf $SYNC_SNAPSHOT -C $LOCAL_SNAPSHOT ." || return 1


I've forgotten why we did not go with snapshot over backup. I know that a backup is full rather than partial, but it is also more disk- and resource-intensive.

Are we still running this every 2 minutes, or did we slow it down to every 10?

I can't see any distinction in the docs between snapshot and backup. The commands are just 'backup' and 'restore'.

It runs every 5 minutes.

ThrawnCA added 4 commits August 6, 2024 15:19

[QOLDEV-863] adjust Solr sync approach for more robustness

44ad9cb

- Export to EFS as an archive, not an exploded directory
- Import by stopping Solr and wholesale replacing the index, not via replication restore endpoint

[QOLDEV-863] clean up long-obsolete health check files

1530584

[QOLDEV-863] update initial Solr config to grab archive instead of exploded dir

3350b87

[QOLDEV-863] use Systemd to start Solr during sync

a7dc383

ThrawnCA requested a review from a team August 7, 2024 02:21

zmacca approved these changes Aug 7, 2024

duttonw reviewed Aug 7, 2024

duttonw approved these changes Aug 7, 2024

duttonw reviewed Aug 7, 2024

ThrawnCA merged commit 2b06233 into develop Aug 7, 2024
