Solr backup in Kubernetes

Backups are crucial. If anything goes wrong, a backup is often the only way to recover your data, so it should be the first thing you set up.

Solr has two ways of taking backups. If you run Solr in standalone mode, you use the replication handler; if you run SolrCloud, you use the Collections API. I used the Bitnami Helm chart to deploy Solr in a Kubernetes cluster, and Bitnami installs Solr in SolrCloud mode by default (check values.yaml):

## Enable Solr cloud
##
cloudEnabled: true

Here is a simple example of how we can back up and restore a SolrCloud collection.

Backup

http://<host>:<port>/solr/admin/collections?action=BACKUP&name=<backup-name>&collection=<collection-name>&location=<backup-location>

Restore

http://<host>:<port>/solr/admin/collections?action=RESTORE&name=<backup-name>&location=<backup-location>&collection=<collection-name>
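If you are not sure of the collection name, the same API can list the collections in the cluster; this is standard Collections API usage, not specific to this deployment:

http://<host>:<port>/solr/admin/collections?action=LIST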

Problem

Backup location

The first problem is that Solr will not let you write a backup to an arbitrary location. You will get the following error:

SolrException: Path * must be relative to SOLR_HOME, SOLR_DATA_HOME coreRootDirectory. Set system property 'solr.allowPaths' to add other allowed paths.

One way to solve this is to create a common folder in all replicas. Bitnami uses /opt/bitnami/solr/server/ as SOLR_HOME, so you could create a folder called backup under it. But this approach is not good: the backup ends up in one of the replica pods at random rather than in a single shared place.

The second way of solving the above error is to explicitly allow the shared location where we want to take the backup, using the JVM system property solr.allowPaths. Suppose we want to back up to /backup; in the Bitnami values.yaml we can add:

extraEnvVars:
  - name: SOLR_OPTS
    value: "-Dsolr.allowPaths=/backup"

Here, /backup is a path shared by all 3 replicas. In GKE, we can provision a disk and expose it as a PV.

Create storage

Create a 10 GB disk for our backup purposes:

gcloud compute disks create --size=10GB --zone=us-central1-a solrbackup

Make sure to change the zone according to your configuration.
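If you want to sanity-check the disk before continuing, gcloud can describe it (adjust the zone as above):

gcloud compute disks describe solrbackup --zone=us-central1-a --format="value(name,sizeGb,status)"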

We can then use this disk to provision an NFS share. Create a Kubernetes Deployment and Service by saving the following in a file named nfs.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-server
spec:
  replicas: 1
  selector:
    matchLabels:
      role: nfs-server
  template:
    metadata:
      labels:
        role: nfs-server
    spec:
      containers:
      - name: nfs-server
        image: gcr.io/google_containers/volume-nfs:0.8
        securityContext:
          privileged: true
        ports:
          - name: nfs
            containerPort: 2049
          - name: mountd
            containerPort: 20048
          - name: rpcbind
            containerPort: 111
        volumeMounts:
          - mountPath: /exports  # the volume-nfs image exports /exports; mount the disk there so backups land on it
            name: nfs-pvc
      volumes:
        - name: nfs-pvc
          gcePersistentDisk:
            pdName: solrbackup  # storage name
            fsType: ext4
---
apiVersion: v1
kind: Service
metadata:
  name: nfs-server
spec:
  ports:
    - name: nfs
      port: 2049
    - name: mountd
      port: 20048
    - name: rpcbind
      port: 111
  selector:
    role: nfs-server

Apply it in a new namespace

kubectl create ns solrtest
kubectl apply -f nfs.yaml -n solrtest

This will create an NFS pod:

kubectl get po -n solrtest

NAME                          READY   STATUS    RESTARTS   AGE
nfs-server-54747dc7c4-8r5rl   1/1     Running   0          76s
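Optionally, you can check what the server is exporting. The path to exportfs below is an assumption about the volume-nfs image layout; if it differs in your image, run it from a shell inside the pod instead:

kubectl exec -n solrtest deploy/nfs-server -- /usr/sbin/exportfs -v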

We will now create a PV and a PVC for the backup volume. Save the following in backup.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: backup
spec:
  capacity:
    storage: 10G
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfs-server.solrtest.svc.cluster.local
    path: "/"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10G

Note that the claim is named backup; we will reference this name in values.yaml while deploying Solr. Now, apply the above YAML file:

kubectl apply -f backup.yaml -n solrtest
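Before moving on, check that the claim has bound to the volume (both should show STATUS Bound):

kubectl get pv backup
kubectl get pvc backup -n solrtest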

Another problem is that Bitnami uses a non-root user to run the container:

containerSecurityContext:
  enabled: true
  runAsUser: 1001
  runAsNonRoot: true

If we deploy Solr now and try to write to /backup, we will get a permission denied error: /backup is mounted with root ownership, while Solr runs as UID 1001. (Kubernetes' fsGroup mechanism does not change ownership on NFS volumes, so the chart's default security context does not help here.) Luckily, we can add an initContainer in values.yaml that changes the ownership of /backup to UID 1001, which solves the permission issue.

Open values.yaml and add the following

#initContainers: []
initContainers:
  - name: backup-permission-fix
    image: busybox
    command: ["/bin/chown","-R","1001:1001", "/backup"]
    volumeMounts:
    - name: backup
      mountPath: /backup

Before deploying Solr, we need to tell it to use this extra mount. Still in values.yaml, add:

extraVolumes:
- name: backup
  persistentVolumeClaim:
    claimName: backup

## Extra volume mounts to add to the container
##
extraVolumeMounts:
- name: backup
  mountPath: /backup

Deploy Solr:

helm upgrade --install solr bitnami/solr -n solrtest -f values.yaml --set authentication.adminPassword=s3cr3tpassw0rd
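To wait until all pods are ready, you can watch the rollout. The StatefulSet is named solr because that is the Helm release name used above; adjust if you chose a different one:

kubectl rollout status statefulset/solr -n solrtest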

You should see the Solr and ZooKeeper pods running:

kubectl get po -n solrtest

NAME                          READY   STATUS    RESTARTS   AGE
nfs-server-54747dc7c4-8r5rl   1/1     Running   0          5m
solr-0                        1/1     Running   0          53s
solr-1                        1/1     Running   0          53s
solr-2                        1/1     Running   0          54s
solr-zookeeper-0              1/1     Running   0          57s
solr-zookeeper-1              1/1     Running   0          57s
solr-zookeeper-2              1/1     Running   0          57s

You can verify the mount by logging into one of the pods:

kubectl exec -it solr-2 -n solrtest -- bash
I have no name!@solr-2:/$ df -h

Filesystem                               Size  Used Avail Use% Mounted on
...
nfs-server.solrtest.svc.cluster.local:/   26G  5.2G   21G  21% /backup
...

I have no name!@solr-2:/$ ls -ld /backup
drwxr-xr-x 2 1001 1001 4096 May 21 06:47 /backup
I have no name!@solr-2:/$

Finally, we can take the backup. Just hit the following URL:

http://<host>:<port>/solr/admin/collections?action=BACKUP&name=myBackupName&collection=<your-collection-name>&location=/backup
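If you are running this from your workstation, one way is to port-forward the Solr service and call the API with curl. The service name solr and the admin username admin are the defaults for this release; adjust them if your setup differs:

kubectl port-forward svc/solr 8983:8983 -n solrtest &
curl -u admin:s3cr3tpassw0rd "http://localhost:8983/solr/admin/collections?action=BACKUP&name=myBackupName&collection=<your-collection-name>&location=/backup"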

You will see the backup inside the folder myBackupName (the index.html alongside it is created by the volume-nfs image, not by Solr):

I have no name!@solr-2:/$ ls /backup/
index.html  myBackupName

I have no name!@solr-2:/$ ls /backup/myBackupName/
backup.properties  snapshot.shard1  zk_backup

To restore it, delete the corrupted collection and hit the following URL:

http://<host>:<port>/solr/admin/collections?action=RESTORE&name=myBackupName&location=/backup&collection=<name-of-collection-to-restore>&replicationFactor=3

where replicationFactor is the number of replicas to create for each shard; in our case it is 3.
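On large collections, backup and restore can take a while and the HTTP call may time out. The Collections API supports running these operations asynchronously: pass an async request id, then poll it with REQUESTSTATUS. This is standard Solr behavior, not specific to this setup:

http://<host>:<port>/solr/admin/collections?action=RESTORE&name=myBackupName&location=/backup&collection=<name-of-collection-to-restore>&async=restore-1

http://<host>:<port>/solr/admin/collections?action=REQUESTSTATUS&requestid=restore-1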

We can use a Kubernetes CronJob to run the backup on a schedule. Use the following manifest to create one, passing the connection details in as environment variables from a Secret (a sketch of the Secret follows the manifest):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-solr
  namespace: solrtest
  labels:
    app: backup
spec:
  schedule: "0 0 */3 * *"   # at midnight every third day of the month
  concurrencyPolicy: "Replace"
  startingDeadlineSeconds: 200
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            parent: "backup"
        spec:
          containers:
          - name: backup
            image: curlimages/curl
            envFrom:
            - secretRef:
                name: solr-backup-env  # Secret providing USERNAME, PASSWORD, HOST, PORT, COLLECTION, LOCATION (see sketch below)
            command: ["/bin/sh","-c"]
            args:
              - curl -u "$USERNAME:$PASSWORD" "http://$HOST:$PORT/solr/admin/collections?action=BACKUP&name=backup-$(date -I)&collection=$COLLECTION&location=$LOCATION"
          restartPolicy: Never
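For completeness, here is a minimal sketch of the Secret those environment variables could come from. Every name and value below is illustrative; substitute your own:

apiVersion: v1
kind: Secret
metadata:
  name: solr-backup-env  # referenced by envFrom in the CronJob above
  namespace: solrtest
type: Opaque
stringData:
  USERNAME: admin                          # example values only
  PASSWORD: s3cr3tpassw0rd
  HOST: solr.solrtest.svc.cluster.local
  PORT: "8983"
  COLLECTION: mycollection
  LOCATION: /backup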

References:

https://solr.apache.org/guide/6_6/making-and-restoring-backups.html
https://faun.pub/digitalocean-kubernetes-and-volume-permissions-820f46598965
https://stackoverflow.com/questions/50156124/kubernetes-nfs-persistent-volumes-permission-denied
https://medium.com/platformer-blog/nfs-persistent-volumes-with-kubernetes-a-case-study-ce1ed6e2c266
