Backup GitLab to AWS S3 Bucket

Do you run a GitLab instance on Kubernetes and need a way to back it up? Try the Velero utility and store your backups in an S3 bucket.

In this article we show how we back up an on-premises GitLab instance running on Kubernetes to an S3 bucket using Velero.

First of all, you need a GitLab instance to back up :-) If you don't already have one, you can deploy it on your Kubernetes cluster with the commands below.

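# Create the namespace first so that the TLS secret can be stored in it
kubectl create ns gitlab

# Generate a self-signed cert
openssl req -x509 -newkey rsa:4096 -sha256 -nodes -keyout tlsgitlab.key -out tlsgitlab.crt -subj "/CN=gitlab.example.com" -days 365
kubectl -n gitlab create secret tls gitlab-example-com --cert=tlsgitlab.crt --key=tlsgitlab.key

# GitLab Installation via Helm (add the chart repository if it is not configured yet)
helm repo add gitlab https://charts.gitlab.io/
helm repo update
helm --namespace gitlab install gitlab gitlab/gitlab \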
  --set certmanager.install=false \
  --set global.ingress.configureCertmanager=false \
  --set global.ingress.tls.secretName=gitlab-example-com \
  --set global.hosts.domain=example.com \
  --set global.hosts.gitlab.name=gitlab.example.com \
  --set global.hosts.registry.name=registry.example.com \
  --set global.hosts.minio.name=minio.example.com \
  --set gitlab-runner.install=false \
  --set global.minio.enabled=false \
  --set global.appConfig.omniauth.enabled=true \
  --set global.appConfig.omniauth.allowSingleSignOn=true \
  --set global.edition=ce

# Get GitLab web root password
kubectl -n gitlab get secret gitlab-gitlab-initial-root-password -ojsonpath='{.data.password}' | base64 --decode ; echo

# Annotate the pods so that Velero backs up their persistent volumes with restic
kubectl annotate pod/gitlab-postgresql-0 backup.velero.io/backup-volumes=data -n gitlab
kubectl annotate pod/gitlab-gitaly-0 backup.velero.io/backup-volumes=repo-data -n gitlab

Next, we prepare the AWS resources that Velero needs to send and retrieve data.

Define a policy that grants access to the S3 bucket and save it as aws-velero-policy.json. Remember to replace the bucket name in the arn:aws:s3::: values below with your own.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVolumes",
                "ec2:DescribeSnapshots",
                "ec2:CreateTags",
                "ec2:CreateVolume",
                "ec2:CreateSnapshot",
                "ec2:DeleteSnapshot"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:AbortMultipartUpload",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": [
                "arn:aws:s3:::velero-bucket-01/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::velero-bucket-01"
            ]
        }
    ]
}
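
Since the bucket name appears twice in the policy, it is easy to update one ARN and forget the other. As a minimal sketch, the policy could be rendered from a single variable. The function name render_velero_policy is a hypothetical helper of ours, not part of Velero or the AWS CLI.

```shell
# Hypothetical helper: print the Velero IAM policy for the bucket given as $1.
render_velero_policy() {
  # Unquoted heredoc delimiter, so $1 is expanded inside the JSON text
  cat <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVolumes",
                "ec2:DescribeSnapshots",
                "ec2:CreateTags",
                "ec2:CreateVolume",
                "ec2:CreateSnapshot",
                "ec2:DeleteSnapshot"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:AbortMultipartUpload",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": [
                "arn:aws:s3:::$1/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::$1"
            ]
        }
    ]
}
EOF
}
```

For example, render_velero_policy velero-bucket-01 > aws-velero-policy.json writes the same policy as shown above.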

Then we create the bucket, the dedicated user and the policy, and link them together. The last command returns the aws_access_key_id and the aws_secret_access_key. Keep both, we will need them. Adapt the commands to your needs.

BUCKET=velero-bucket-01
REGION=eu-west-3
aws s3api create-bucket \
    --bucket $BUCKET \
    --region $REGION \
    --create-bucket-configuration LocationConstraint=$REGION

aws iam create-user --user-name velerouser01

aws iam put-user-policy \
  --user-name velerouser01 \
  --policy-name velerouser01 \
  --policy-document file://aws-velero-policy.json


aws iam create-access-key --user-name velerouser01

Create a Velero-specific credentials file named credentials-velero in your current directory with the aws_access_key_id and the aws_secret_access_key.

[default]
aws_access_key_id=XYZ
aws_secret_access_key=XYZ
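
Because this file contains a long-lived secret, it may be worth creating it with restrictive permissions rather than in an editor. The following is only a sketch: write_velero_credentials is a hypothetical helper, and its two first arguments are assumed to be the values returned by aws iam create-access-key.

```shell
# Hypothetical helper: write the Velero credentials file.
# $1 = access key id, $2 = secret access key,
# $3 = output file (defaults to ./credentials-velero)
write_velero_credentials() {
  out_file="${3:-credentials-velero}"
  (
    # In this subshell only: a newly created file is readable by its owner alone
    umask 077
    printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' \
      "$1" "$2" > "$out_file"
  )
}
```

Usage: write_velero_credentials "XYZ" "XYZ" with your own two values in place of the placeholders.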

Now that the access credentials exist, we can deploy Velero with the AWS plugin in our Kubernetes cluster.

BUCKET=velero-bucket-01
REGION=eu-west-3
velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:latest \
    --bucket $BUCKET \
    --use-restic \
    --secret-file ./credentials-velero \
    --backup-location-config region=$REGION \
    --snapshot-location-config region=$REGION

Validate that Velero has access to the S3 Bucket

velero get backup
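
An empty backup list does not by itself prove that the bucket is reachable. A more direct check, sketched here, is to read the phase that Velero records on its BackupStorageLocation resource. The assumptions are that Velero runs in the velero namespace and that the location is named default, which are the defaults of velero install. The function names are hypothetical helpers.

```shell
# Hypothetical helpers around the BackupStorageLocation custom resource.
# $1 = location name (default: default), $2 = namespace (default: velero)
bsl_phase() {
  kubectl -n "${2:-velero}" get backupstoragelocation "${1:-default}" \
    -o jsonpath='{.status.phase}'
}

# Succeeds only when Velero reports the location as Available
bsl_is_available() {
  [ "$(bsl_phase "$@")" = "Available" ]
}
```

For example: bsl_is_available && echo "bucket reachable". The command velero backup-location get shows the same phase in a table.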

To back up the GitLab volumes, we need to annotate their pods so that Velero knows it has to back them up using restic.

kubectl annotate pod/gitlab-postgresql-0 backup.velero.io/backup-volumes=data -n gitlab
kubectl annotate pod/gitlab-gitaly-0 backup.velero.io/backup-volumes=repo-data -n gitlab
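
The value of the annotation must match a volume name in the pod spec, and those names can differ between chart versions. As a hedged sketch, the names could be checked before annotating. Both functions below are hypothetical helpers, and the gitlab namespace default is taken from the installation above.

```shell
# Hypothetical helper: print the volume names of a pod, separated by spaces.
# $1 = pod name, $2 = namespace (default: gitlab)
pod_volume_names() {
  kubectl -n "${2:-gitlab}" get pod "$1" -o jsonpath='{.spec.volumes[*].name}'
}

# Hypothetical helper: succeed if the pod declares the given volume.
# $1 = pod name, $2 = volume name, $3 = namespace (default: gitlab)
pod_has_volume() {
  for name in $(pod_volume_names "$1" "${3:-gitlab}"); do
    if [ "$name" = "$2" ]; then
      return 0
    fi
  done
  return 1
}
```

For example: pod_has_volume gitlab-gitaly-0 repo-data || echo "volume not found, check the pod spec".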

Back up your GitLab instance to the S3 bucket.

velero backup create gitlabbackup01 --include-namespaces gitlab
velero get backup
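
The backup runs asynchronously, so the list may still show it as in progress. The sketch below polls the phase of the Backup resource until it reaches a final state. It assumes Velero runs in the velero namespace. The function names, the retry count and the delay are illustrative choices of ours.

```shell
# Hypothetical helper: print the phase of a Velero backup.
# $1 = backup name, $2 = namespace (default: velero)
backup_phase() {
  kubectl -n "${2:-velero}" get backups.velero.io "$1" -o jsonpath='{.status.phase}'
}

# Hypothetical helper: poll until the backup completes or fails.
# $1 = backup name, $2 = max attempts (default 60), $3 = seconds between attempts (default 10)
# Returns 0 on Completed, 1 on a failed phase, 2 on timeout.
wait_for_backup() {
  attempts="${2:-60}"
  delay="${3:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    phase="$(backup_phase "$1")"
    case "$phase" in
      Completed)
        echo "backup $1 completed"
        return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "backup $1 ended in phase $phase" >&2
        return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for backup $1" >&2
  return 2
}
```

For example: wait_for_backup gitlabbackup01. If blocking the terminal is acceptable, passing --wait to velero backup create achieves a similar result without a helper.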

Try a disaster recovery by deleting the gitlab namespace and then restoring from the backup.

kubectl delete ns gitlab
velero restore create --from-backup gitlabbackup01

Test if your GitLab instance is running again.
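
One way to script this test is to request the sign-in page and check the HTTP status code. This is only a sketch: it assumes the hostname gitlab.example.com from the installation above and that /users/sign_in answers with 200 on a healthy instance. The function names are hypothetical helpers.

```shell
# Hypothetical helper: print the HTTP status code of the GitLab sign-in page.
# $1 = hostname (default: gitlab.example.com)
gitlab_http_status() {
  # -k accepts the self-signed certificate, -s hides progress output,
  # -o /dev/null discards the body, -w prints only the status code
  curl -k -s -o /dev/null -w '%{http_code}' \
    "https://${1:-gitlab.example.com}/users/sign_in"
}

# Succeeds only when the sign-in page answers with HTTP 200
gitlab_is_up() {
  [ "$(gitlab_http_status "$@")" = "200" ]
}
```

For example: gitlab_is_up && echo "GitLab is back". The pods can take several minutes to start after a restore, so the check may need to be repeated.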

Feel free to comment on this article if you have any questions.

cisel.ch