File backups with s3 sync

  • Configure the list of paths to include in the backup
  • Archive all files into a gzip-compressed tarball named with the current date and time
  • Save the archives in the /backups/ folder
  • Sync the /backups/ folder to S3
  • Use Cloud Toolbox image with required software installed
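
The steps above can be sketched as a few shell functions. This is only a minimal sketch, not the script shipped in any particular image: the function names, the `backup-` file prefix, and the bucket argument are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of the backup steps; names, prefix, and bucket are illustrative assumptions.
set -eu

# Build a timestamped archive name such as backup-20240131-050000.tar.gz
archive_name() {
  printf 'backup-%s.tar.gz\n' "$(date +%Y%m%d-%H%M%S)"
}

# create_archive BACKUP_DIR PATH...
# Packs the given paths into a gzip-compressed tarball inside BACKUP_DIR
# and prints the path of the archive that was written.
create_archive() {
  backup_dir=$1
  shift
  mkdir -p "$backup_dir"
  archive="$backup_dir/$(archive_name)"
  tar -czf "$archive" "$@"
  printf '%s\n' "$archive"
}

# sync_backups BACKUP_DIR BUCKET
# Uploads the backup folder with s3cmd (s3cmd must already be configured).
sync_backups() {
  s3cmd sync "$1/" "s3://$2/"
}

# Example:
#   create_archive /backups /var/www /etc/myapp
#   sync_backups /backups your-bucket-name
```

Keeping archiving and syncing in separate functions means the archive step can be checked locally before any credentials are involved.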

Requirements

  • tar
  • s3cmd
  • kubectl

Documentation

  • https://kubernetes.io/docs/reference/kubectl/generated/kubectl_debug/

Options

With ReadWriteMany (RWX) storage, things are straightforward:

  • Copy to local
  • CronJob with volume mount

Unfortunately, most Kubernetes clusters don't support ReadWriteMany (RWX) volumes. With ReadWriteOnce (RWO) storage, the options are:

  • Copy to local
  • Sidecar
  • CronJob & copy to local
  • CronJob & ephemeral container

RWO or RWX storage + Copy to local

This copies all the files and then

RWX storage + CronJob & volume mount

This is by far the easiest method: use a CronJob that mounts the volume that needs to be backed up.

Cons

  • Requires read write many storage (RWX)

Pros

  • Easy

apiVersion: batch/v1
kind: CronJob
metadata:
  name: volume-backup
spec:
  schedule: "0 5 * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: aws-cli
              image: amazon/aws-cli
              env:
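                # Placeholders; in practice prefer a Secret (secretKeyRef) over inline credentials
                - name: AWS_ACCESS_KEY_ID
                  value: "{your_aws_access_key_ID}"
                - name: AWS_SECRET_ACCESS_KEY
                  value: "{your_aws_secret_access_key}"
                - name: AWS_REGION
                  value: eu-central-1
              # The image entrypoint is `aws`; sync options must follow the `s3 sync` subcommand
              args:
                - s3
                - sync
                - /data
                - s3://your_bucket_name
                - --no-progress
                - --delete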
              volumeMounts:
                - name: backup
                  mountPath: /data
          volumes:
            - name: backup
              persistentVolumeClaim:
                claimName: "{your_pvc_name}"
          restartPolicy: OnFailure
      ttlSecondsAfterFinished: 172800
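
Rather than waiting for the schedule, the CronJob above can be exercised once by hand. The sketch below assumes the CronJob is named `volume-backup` as in the manifest; the `run_backup_now` helper, the `-manual-` job suffix, and the `KUBECTL` override are hypothetical conveniences, not part of the original setup.

```shell
#!/bin/sh
set -eu

# Assumption: KUBECTL may be overridden (e.g. KUBECTL=echo) to print the
# commands instead of running them against a cluster.
KUBECTL=${KUBECTL:-kubectl}

# run_backup_now CRONJOB
# Starts a one-off Job from the CronJob, waits for it to complete,
# and prints its logs.
run_backup_now() {
  cronjob=$1
  job="$cronjob-manual-$(date +%s)"
  $KUBECTL create job "$job" --from="cronjob/$cronjob"
  $KUBECTL wait --for=condition=complete "job/$job" --timeout=600s
  $KUBECTL logs "job/$job"
}

# Example: run_backup_now volume-backup
```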

RWO storage + sidecar

Cons

  • Doesn't use Kubernetes CronJobs

Pros

  • Easy to extend and customize

Long running s3 sync

    - name: backup-data
      image: amazon/aws-cli
      command: ["/bin/sh", "-c", "while true; do aws s3 sync /data s3://<your-bucket-name>/<backup-path> --region <your-region>; sleep 86400; done"]
      volumeMounts:
        - name: data
          mountPath: /data

RWO storage + sidecar & crontab

    - name: backup-data
      image: hosst/s3backup
      volumeMounts:
        - name: data
          mountPath: /data

RWO storage + CronJob & kubectl exec/cp

Cons

  • Requires Kubernetes API access
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-data
spec:
  schedule: "0 5 * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
#          serviceAccountName: backup-data
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest
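            # Assumption: the image must also provide tar and s3cmd (e.g. the Cloud Toolbox image)
            env:
              - name: KUBECTL_DEBUG_CUSTOM_PROFILE
                value: "true"
              - name: NAMESPACE
                value: <namespace>
            command:
            - /bin/sh
            - -c
            - |
              set -eu
              mkdir -p /backups
              # <pod-name> and <path> are placeholders for the pod and directory to back up
              kubectl cp --namespace $(NAMESPACE) "<pod-name>:<path>" /tmp/data
              # Archive the copied files under a timestamped name
              tar -czf "/backups/backup-$(date +%Y%m%d-%H%M%S).tar.gz" -C /tmp data
              # Upload the archives
              s3cmd sync /backups/ "s3://<your-bucket-name>/"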
          restartPolicy: OnFailure
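
The Kubernetes API access listed under Cons could be granted roughly as follows. This is a sketch under assumptions: the `backup-data` ServiceAccount name is taken from the commented-out `serviceAccountName` line, while the role names and the `grant_backup_access` helper are hypothetical. `kubectl cp` works through exec, so the account needs `get` on pods and `create` on `pods/exec`.

```shell
#!/bin/sh
set -eu

# Assumption: KUBECTL may be overridden (e.g. KUBECTL=echo) for a dry run.
KUBECTL=${KUBECTL:-kubectl}

# grant_backup_access NAMESPACE ACCOUNT
# Creates a ServiceAccount that may read pods and exec into them.
grant_backup_access() {
  namespace=$1
  account=$2
  $KUBECTL create serviceaccount "$account" --namespace "$namespace"
  # Two roles, so that "create" is granted on pods/exec only, not on pods.
  $KUBECTL create role "$account-pod-reader" --namespace "$namespace" \
    --verb=get --resource=pods
  $KUBECTL create role "$account-pod-exec" --namespace "$namespace" \
    --verb=create --resource=pods/exec
  for role in pod-reader pod-exec; do
    $KUBECTL create rolebinding "$account-$role" --namespace "$namespace" \
      --role="$account-$role" --serviceaccount="$namespace:$account"
  done
}

# Example: grant_backup_access <namespace> backup-data
```

With the account in place, the `serviceAccountName` line in the CronJob can be uncommented.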

RWO storage + CronJob & kubectl debug (ephemeral container)

Start a CronJob that runs kubectl debug against the pod that mounts the ReadWriteOnce (RWO) volume to back up.

  • https://kubernetes.io/docs/reference/kubectl/generated/kubectl_debug/
  • https://kubernetes.io/docs/concepts/workloads/pods/ephemeral-containers/

The CronJob image needs kubectl, and the debug image needs s3cmd and tar.

Cons

  • Requires Kubernetes API access
  • Ephemeral containers cannot be removed from a pod (yet) or reused
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-data
spec:
  schedule: "0 5 * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
#          serviceAccountName: backup-data
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest
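            env:
              # Enables the --custom flag on kubectl versions where it is still gated
              - name: KUBECTL_DEBUG_CUSTOM_PROFILE
                value: "true"
              - name: NAMESPACE
                value: <namespace>
            command:
            - /bin/sh
            - -c
            # <pod-name>, <debug-image>, <profile-file> and <backup-command> are placeholders
            - kubectl debug --namespace $(NAMESPACE) <pod-name> --image=<debug-image> --custom=<profile-file> -- <backup-command>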
          restartPolicy: OnFailure
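
For the ephemeral container to see the data, the volume has to be mounted through a custom debug profile, which is a partial container spec passed with `--custom`. The sketch below is one possible shape, under assumptions: the helper names, the `/data` mount path, the archive path, and the `KUBECTL` override are illustrative, and the volume name must match the one in the target pod's spec.

```shell
#!/bin/sh
set -eu

# Assumption: KUBECTL may be overridden (e.g. KUBECTL=echo) for a dry run.
KUBECTL=${KUBECTL:-kubectl}
# Assumption: where the volume is mounted inside the ephemeral container.
DATA_DIR=${DATA_DIR:-/data}

# write_debug_profile FILE VOLUME_NAME
# Writes a partial container spec that mounts an existing pod volume.
write_debug_profile() {
  cat > "$1" <<EOF
{
  "volumeMounts": [
    { "name": "$2", "mountPath": "$DATA_DIR" }
  ]
}
EOF
}

# backup_via_debug NAMESPACE POD IMAGE PROFILE_FILE BUCKET
# Adds an ephemeral container that archives the volume and uploads it.
backup_via_debug() {
  namespace=$1
  pod=$2
  image=$3
  profile_file=$4
  bucket=$5
  $KUBECTL debug --namespace "$namespace" "$pod" \
    --image="$image" --custom="$profile_file" \
    -- sh -c "tar -czf /tmp/backup.tar.gz -C $DATA_DIR . && s3cmd put /tmp/backup.tar.gz s3://$bucket/"
}

# Example:
#   write_debug_profile /tmp/profile.json data
#   backup_via_debug <namespace> <pod-name> <debug-image> /tmp/profile.json your-bucket-name
```

The ServiceAccount running this would also need permission on the `pods/ephemeralcontainers` subresource. Without `--attach`, `kubectl debug` returns once the container has been added rather than when the backup finishes.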