forked from nla/solrbackup

Python script for backing up a remote Solr 4 core or SolrCloud cluster


uDuCkV/solrbackup

 
 


solrbackup

Python script for backing up a remote Solr 4 core or SolrCloud cluster via the Solr 4 replication protocol.

Usage

Download all the cores in a Solr 4 instance:

solrbackup.py http://localhost:8080/solr /tmp/mybackup

Download every shard from a SolrCloud cluster (this might be massive):

solrbackup.py --cloud http://anynode:8080/solr /tmp/mybackup

Run it again on the same directory to incrementally update.

Getting a consistent snapshot

Solrbackup uses the Solr replication protocol to get a consistent snapshot of a particular core. However, there's no guarantee that data is consistent across multiple cores or shards. If you pass the --reserve option, solrbackup will try to snapshot each core at close to the same time, rather than waiting for the previous download to finish.
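For orientation, here is a rough sketch of what requests to a core's replication handler look like. It is not taken from solrbackup.py and may differ from how the script builds its requests. The core name core1 and the generation number 42 are placeholders. The indexversion and filelist commands belong to Solr's ReplicationHandler HTTP API.

```python
from urllib.parse import urlencode


def replication_url(core_url, command, **params):
    """Build a ReplicationHandler URL for one core, e.g. .../solr/core1."""
    query = {"command": command, "wt": "json"}
    query.update(params)
    return core_url.rstrip("/") + "/replication?" + urlencode(query)


# Hypothetical core URL; substitute one of your own cores.
CORE = "http://localhost:8080/solr/core1"

# Ask the core for its latest replicable index version and generation.
version_url = replication_url(CORE, "indexversion")

# List the files that make up one generation (42 is a placeholder value).
filelist_url = replication_url(CORE, "filelist", generation=42)
```

Fetching version_url for each core and comparing the reported generations is one way to see how far apart the cores are at a given moment.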

This is still only approximate, however. For a fully consistent backup you'll need to pause indexing (in your client applications), send a hard commit, start solrbackup with the --reserve option and then resume indexing (you don't have to wait for the backup to finish). It's a good idea to trigger backups of any other application state that you need the index to be consistent with (such as an SQL database) during the same window.
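The procedure above could be scripted along these lines. This is a minimal sketch, not part of solrbackup. The pause and resume callables stand in for whatever mechanism your client applications offer and are assumptions. So is the list of per-core URLs used for the hard commit, which is sent through Solr's standard update handler with commit=true.

```python
import subprocess
from urllib.request import urlopen


def commit_url(core_url):
    """URL that asks a core's update handler for a hard commit."""
    return core_url.rstrip("/") + "/update?commit=true"


def backup_command(solr_url, dest, script="solrbackup.py"):
    """Command line for a backup run with --reserve."""
    return [script, "--reserve", solr_url, dest]


def http_get(url):
    """Issue a GET request and discard the response body."""
    with urlopen(url) as response:
        response.read()


def consistent_backup(solr_url, core_urls, dest, pause, resume,
                      fetch=http_get, launch=subprocess.Popen):
    """Pause indexing, hard commit, start the backup, resume indexing."""
    pause()  # hypothetical hook: stop your indexing clients
    try:
        for core_url in core_urls:
            fetch(commit_url(core_url))
        # Start the backup without waiting for it to finish.
        proc = launch(backup_command(solr_url, dest))
    finally:
        resume()  # hypothetical hook: always restart indexing
    return proc
```

The returned process object can be waited on later. A dump of related state such as an SQL database would go between pause() and resume() as well.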

TODO

  • Option to download from cloud in parallel
  • Retry upon failure
  • Allow specifying a Zookeeper cluster rather than a URL for cloud downloads
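Until retries are built in, an external wrapper can approximate them, since running the script again on the same directory updates the backup incrementally. The sketch below is one possible approach. It assumes solrbackup.py exits with a non-zero status on failure, which has not been verified here. The attempt count and delay are arbitrary.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, delay=30.0,
                     run=subprocess.call, sleep=time.sleep):
    """Run cmd until it exits with status 0 or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if run(cmd) == 0:
            return True
        if attempt < attempts:
            sleep(delay * attempt)  # linear back-off between attempts
    return False


if __name__ == "__main__":
    # Assumption: a non-zero exit status means the backup failed.
    ok = run_with_retries(
        ["solrbackup.py", "http://localhost:8080/solr", "/tmp/mybackup"]
    )
    raise SystemExit(0 if ok else 1)
```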
