This post lists the steps to perform manual disaster recovery of a Kubernetes cluster on AWS using Velero.
Pre-requisites: A Kubernetes cluster is up and running with one master node and one worker node. Refer to the following post to set up a Kubernetes cluster with kubeadm on the AWS free tier.
AWS CLI is installed
Kubernetes Cluster with kubeadm
1) Create s3 bucket to be used for backup.
export BUCKET=velero-backup-bkt
export REGION=ap-southeast-2
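# In us-east-1, omit --create-bucket-configuration; that region rejects a LocationConstraint.
aws s3api create-bucket --bucket $BUCKET --region $REGION --create-bucket-configuration LocationConstraint=$REGION
2) Create an IAM role that Velero will use to access the bucket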
cat > assume-role-policy-document.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "ec2.amazonaws.com"
},
"Action": "sts:AssumeRole"
},
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::<ACCOUNT_ID>:role/InstanceRole"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
aws iam create-role --role-name velero \
--assume-role-policy-document \
file://assume-role-policy-document.json
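The `<ACCOUNT_ID>` placeholder in the trust policy has to be replaced with a real 12-digit account ID before the role can be created. A minimal sketch of one way to do that, assuming the AWS CLI is already configured for the target account; the helper name `render_trust_policy` is hypothetical, and `InstanceRole` is the role name used in the policy above:

```shell
# Hypothetical helper: print the trust policy with the account ID filled in.
render_trust_policy() {
  account_id=$1
  # Reject anything that is not a 12-digit AWS account ID.
  case "$account_id" in
    *[!0-9]*|"") echo "invalid account id: $account_id" >&2; return 1 ;;
  esac
  if [ "${#account_id}" -ne 12 ]; then
    echo "invalid account id: $account_id" >&2
    return 1
  fi
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::${account_id}:role/InstanceRole" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Usage (requires AWS credentials, so it is left commented out):
# ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
# render_trust_policy "$ACCOUNT_ID" > assume-role-policy-document.json
```

Looking the account ID up with `aws sts get-caller-identity` avoids pasting it by hand into the JSON file.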
3) Attach an inline policy to the role that grants access to EC2 snapshots and the backup bucket
cat > velero-trust-policy.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:DescribeVolumes",
"ec2:DescribeSnapshots",
"ec2:CreateTags",
"ec2:CreateVolume",
"ec2:CreateSnapshot",
"ec2:DeleteSnapshot"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:DeleteObject",
"s3:PutObject",
"s3:AbortMultipartUpload",
"s3:ListMultipartUploadParts"
],
"Resource": [
"arn:aws:s3:::${BUCKET}/*"
]
},
{
"Effect": "Allow",
"Action": [
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::${BUCKET}"
]
}
]
}
EOF
aws iam put-role-policy \
--role-name velero \
--policy-name s3 \
--policy-document file://velero-trust-policy.json
4) Download velero client
wget https://github.com/vmware-tanzu/velero/releases/download/v1.3.2/velero-v1.3.2-linux-amd64.tar.gz
tar -xvf velero-v1.3.2-linux-amd64.tar.gz -C /tmp
sudo mv /tmp/velero-v1.3.2-linux-amd64/velero /usr/local/bin
5) Install velero
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.0.1 \
--bucket ${BUCKET} \
--backup-location-config region=${REGION} \
--snapshot-location-config region=${REGION} \
--pod-annotations iam.amazonaws.com/role=arn:aws:iam::<ACCOUNT_ID>:role/velero \
--no-secret
kubectl get all -n velero
6) Install a sample NGINX application on the worker node. It will serve as our test application for disaster recovery.
cat > nginx-deployment.yaml <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-example
  labels:
    app: nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-example
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.17.6
        name: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: my-nginx
  namespace: nginx-example
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx
  type: LoadBalancer
EOF
kubectl apply -f nginx-deployment.yaml
Verify that all the resources are up and running:
kubectl get all -n nginx-example
7) Create backup either based on app selector or namespace
velero backup create nginx-backup --selector app=nginx
OR
velero backup create nginx-backup --include-namespaces nginx-example
Check the status. The output should show a completion time once the backup has completed successfully.
velero backup describe nginx-backup
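Instead of re-running `describe` by hand, the backup phase can be polled until it settles. A minimal sketch, assuming Velero is installed in the default `velero` namespace; the helper names `backup_phase` and `wait_for_backup` are hypothetical, and the phase is read straight from the Backup custom resource with kubectl:

```shell
# Read the phase of a Velero Backup object; assumes Velero runs in the
# "velero" namespace, the default for `velero install`.
backup_phase() {
  kubectl -n velero get backups.velero.io "$1" -o jsonpath='{.status.phase}'
}

# Hypothetical helper: poll until the backup completes, fails, or times out.
# Arguments: backup name, max attempts (default 30), seconds between attempts (default 10).
wait_for_backup() {
  name=$1
  attempts=${2:-30}
  delay=${3:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    phase=$(backup_phase "$name")
    case "$phase" in
      Completed)
        echo "backup $name completed"
        return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "backup $name ended in phase $phase" >&2
        return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for backup $name" >&2
  return 2
}

# Usage (requires a running cluster, so it is left commented out):
# wait_for_backup nginx-backup
```

The distinct return codes (1 for a failed backup, 2 for a timeout) let a calling script decide whether to abort or retry.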
To create scheduled backups
velero create schedule daily-backup-at-7am --schedule="0 7 * * *" --include-namespaces nginx-example
8) Check S3 bucket to confirm backup is created:
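The bucket can also be checked from the command line. Velero stores each backup under a `backups/<backup-name>/` prefix, so listing that prefix should show the backup archive and its metadata files. A minimal sketch, assuming no `--prefix` was passed to `velero install`; the helper name `backup_s3_uri` is hypothetical:

```shell
# Hypothetical helper: build the S3 URI under which Velero stores a backup.
# Assumes the default bucket layout (no --prefix passed to `velero install`).
backup_s3_uri() {
  bucket=$1
  backup=$2
  echo "s3://${bucket}/backups/${backup}/"
}

# Usage (requires AWS credentials, so it is left commented out):
# aws s3 ls "$(backup_s3_uri "$BUCKET" nginx-backup)"
```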
9) Now, to simulate a disaster, let's delete the nginx-example namespace. This deletes the deployment and all running NGINX pods.
kubectl delete namespace nginx-example
10) Restore all resources from backup
velero restore create --from-backup nginx-backup
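# The restore name is generated as <backup-name>-<timestamp>; use the name printed by the previous command
velero restore describe nginx-backup-20200930060821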
Once the restore completes, exactly the same NGINX resources should be up and running.
To restore from a scheduled backup, we can use either a specific backup or the latest backup created by the schedule:
# Specific backup
velero restore create --from-backup <BACKUP_NAME>
# Latest backup from schedule
velero restore create --from-schedule <SCHEDULE-NAME>
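To see which backup counts as the latest for a schedule, it helps to know that scheduled backups are named `<schedule-name>-<timestamp>`, with the timestamp in `YYYYMMDDHHMMSS` form. A minimal sketch that picks the newest name from a list; the helper name `latest_backup_for` is hypothetical, and the kubectl usage assumes Velero runs in the `velero` namespace:

```shell
# Hypothetical helper: read backup names on stdin and print the newest one
# that belongs to the given schedule. Relies on the <schedule>-<YYYYMMDDHHMMSS>
# naming, so a plain lexical sort is also chronological.
latest_backup_for() {
  schedule=$1
  grep -E "^${schedule}-[0-9]{14}\$" | sort | tail -n 1
}

# Usage (requires a running cluster, so it is left commented out):
# kubectl -n velero get backups.velero.io \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' \
#   | latest_backup_for daily-backup-at-7am
```

The printed name can then be passed to `velero restore create --from-backup` when a specific scheduled backup is needed.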
✌












