thoth-duplicacy

About

This container installs both duplicacy and duplicacy-util on top of debian buster-slim.

Motivation

I wanted to backup up my NAS and ARM cluster to B2 because:

CrashPlan for home is dead
I don't want to pay per-machine for backups, I'd rather just pay for the storage
B2 is super cheap compared to the other options I looked at

Usage

I've tried to make this image as easy to use as possible. It expects to find the directory you want to back up mounted to the container as /data, and that there is a .duplicacy directory with proper settings at the root level of the directory.

Building the image

Change CONTAINER_NAME in the included Rakefile and run rake build_container to build an architecture-specific container yourself. If you want to use the experimental buildx support in Docker to generate a multi-architecture build, run rake buildx instead.

I've had issues with getting my multi-architecture builds to work on ARM though, so I publish separate images for Intel and ARM7 on docker hub as unixorn/thoth-duplicacy:armv7l and unixorn/thoth-duplicacy:x86_64.

Using with docker-compose

If you just want to run it with docker, you can run it with docker-compose. There are a few environment variables that the included backup-cronjob script looks for so it knows how many threads to run and where to write the backups.

BACKUP_LOCATION=/path/to/somewhere docker-compose run thoth-duplicacy backup-cronjob

Using inside Kubernetes

I have an ARM cluster running k3s on some ODROID HC2s and Raspberry Pis. I run moosefs on the cluster as a distributed file system, and wanted to back it up in as simple a way as possible.

By running the backup as a kubernetes cron job, I don't have to maintain a crontab on any of the cluster nodes, I created a cronjob-backup-stuff.yml in my cluster's configuration git repo, and kubernetes will ensure that the backups run on the least-loaded node.

Backing up directories

Here's a sample job that backs up one of my samba volumes. Note that I've created a backups namespace in my cluster to keep things tidy - if you want to pollute your default namespace instead, delete the namespace entry in the metadata section.

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: backup-exampledir
  namespace: backups
spec:
  schedule: "35 */2 * * *"
  jobTemplate:
    spec:
      # Ensure only one copy of the backup is running, even if it runs
      # so long that it is still running when the next cron slot comes up
      concurrencyPolicy: Forbid
      template:
        spec:
          containers:
          - name: backup-exampledir
            # I'm running this on the odroids in my cluster, so I'm specifying
            # the ARM7 build
            image: unixorn/thoth-duplicacy:arm7l
            # Use the x86_64 tag if you're on Intel
            # image: unixorn/thoth-duplicacy:x86_64
            args:
            - /bin/sh
            - -c
            - /usr/local/bin/backup-cronjob

            volumeMounts:
              - name: data-volume
                mountPath: /data/

            env:
                # I want to restrict the number of threads used for uploads
                # so that duplicacy doesn't consume all my upload bandwidth.
                # I don't care if backups are slow.
              - name: DUPLICACY_BACKUP_THEAD_COUNT
                value: "3"
                # backup-cronjob needs to know what defined storage to back up
                # files to.
              - name: B2_STORAGE_NAME
                value: "b2"

          restartPolicy: OnFailure

          # Keep it running on a chunkserver so that at least part of the
          # I/O is to local disk instead of across the network. Remove if
          # you don't care what node backups happen on.
          nodeSelector:
            odroid: "true"

          volumes:
            - name: data-volume
              hostPath:
                # This will be remapped to /data which is where duplicacy
                # expects to find the data it is backing up, and the .duplicacy
                # directory with its settings.
                path: /dfs/volumes/exampledir
                # this field is optional
                type: Directory

Pruning snapshots

We don't want to keep snapshots forever. Here's a Kubernetes cronjob to clean them up.

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: purge-stale-duplicacy-snapshots
  namespace: backups
spec:
  schedule: "48 */3 * * *"
  jobTemplate:
    spec:
      concurrencyPolicy: Forbid
      template:
        spec:
          containers:
          - name: purge-stale-duplicacy-snapshots
            # I'm running this on the odroids in my cluster, so I'm specifying
            # the ARM7 build
            image: unixorn/thoth-duplicacy:arm7l
            # Use the x86_64 tag if you're on Intel
            # image: unixorn/thoth-duplicacy:x86_64

            # Make sure we run inside /data so that duplicacy can find
            # the configuration directory.
            workingDir: /data

            # Remember that the -keep arguments must be listed from longest
            # time frame to shortest, otherwise the disordered ones will be
            # ignored, which could mean deleting snapshots you want to keep.
            #
            # I'm specifying to keep no snapshots more than 365 days old, keep
            # a single snapshot every 30 days for snapshots older than 90 days,
            # a single snapshot a week for snapshots older than 30 days, and 
            # finally keep only a single snapshot per day for snapshots
            # older than 2 days.
            #
            # Also note that the duplicacy verb has to come before any of the
            # settings command line options.
            args:
            - duplicacy
            - prune
            - -storage
            - b2
            - -all
            - -keep 0:365
            - -keep 30:90
            - -keep 7:30
            - -keep 1:2
            - -exhaustive

            volumeMounts:
              - name: data-volume
                mountPath: /data/

            env:
              - name: DUPLICACY_BACKUP_THEAD_COUNT
                value: "3"
              - name: B2_STORAGE_NAME
                value: "b2"

          restartPolicy: OnFailure

          # Keep it running on one of the cluster chunkservers so that at
          # least part of the I/O is to local disk instead of across the
          # network. Remove if you don't care what node backups happen on.
          nodeSelector:
            odroid: "true"

          volumes:
            - name: data-volume
              hostPath:
                # This will be remapped to /data which is where duplicacy
                # expects to find the data it is backing up, and the .duplicacy
                # directory with its settings.
                path: /dfs/volumes/exampledir
                # this field is optional
                type: Directory

License

duplicacy and duplicacy-util have their own licenses, see the links above for details

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
bin		bin
scripts		scripts
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
Rakefile		Rakefile
Readme.md		Readme.md
docker-compose.yml		docker-compose.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

thoth-duplicacy

About

Motivation

Usage

Building the image

Using with docker-compose

Using inside Kubernetes

Backing up directories

Pruning snapshots

License

About

Releases

Packages

Languages

License

unixorn/thoth-duplicacy

Folders and files

Latest commit

History

Repository files navigation

thoth-duplicacy

About

Motivation

Usage

Building the image

Using with docker-compose

Using inside Kubernetes

Backing up directories

Pruning snapshots

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages