Backing up and restoring AE¶
You may choose to back up AE regularly, based on your organization’s disaster recovery policies.
Warning
You can start the backup process while users are working; however, the backup process will not capture any open sessions or deployments. We therefore recommend that you ask all users to save their work, stop any sessions and deployments, and log out of the platform during the backup window if they’d like that content backed up.
This topic provides guidance on backing up AE and on restoring data from a backup.
Backing up¶
The backup uses just two bash scripts, but they require jq, so install them with conda to get the dependencies as well.
We recommend installing Miniconda under /opt/anaconda so that all AE admins can access it. This is helpful in case admins change.
Next, follow the steps in the section below if AE-master has internet access, or follow the steps under AE-master is airgapped if it’s airgapped.
AE-master has internet access¶
Install the ae5_backup_restore package from anaconda.org as shown below:
conda create -n ae5-backup -c ae5-admin ae5_backup_restore
conda activate ae5-backup
Proceed to Verify the installation below.
AE-master is airgapped¶
If AE-master is airgapped, follow the steps below to download and run the installer:
Download and run the ae5_backup_restore installer.
If you already have Miniconda installed, you can install into an environment as follows. This installs the backup/restore scripts and the jq package in a conda environment:
# These commands must be run from the master node
chmod +x ./ae5_backup_restore-x.x.x-Linux-x86_64.sh
./ae5_backup_restore-x.x.x-Linux-x86_64.sh -p $CONDA_INSTALL_PREFIX/envs/ae5_backup_restore
Proceed to Verify the installation below.
Verify the installation¶
Get usage help by running the following:
ae_backup.sh -h
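As a quick pre-flight check, you can also confirm that both scripts and jq are on the PATH of the active environment. The following is a minimal sketch; the function name check_tools is hypothetical and is not part of the ae5_backup_restore package.

```shell
# check_tools CMD...: print the name of every listed command that is not on PATH.
# Returns 0 when all commands are found, 1 otherwise.
# (check_tools is a hypothetical helper, not shipped with ae5_backup_restore.)
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage, after activating the environment that holds the scripts:
#   check_tools ae_backup.sh ae_restore.sh jq && echo "all tools found"
```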
Next, run the backup script.
Run the backup script¶
The following will create the m_backup directory if it does not already exist. Ensure you have sudo access before running, as the script needs it to copy files from /opt/anaconda/storage:
ae_backup.sh ./m_backup
The output will look like the following:
(backup-test) ubuntu@ip:/opt/anaconda$ ae_backup.sh ./m_backup
Backup AE5 to /opt/anaconda/m_backup
Backup secrets/cm/DB dump to /opt/anaconda/m_backup/ae5_config_db_202005201802.tar.gz
Backup data from /opt/anaconda/storage to /opt/anaconda/m_backup/ae5_data_202005201802.tar.gz;
excludes /opt/anaconda/storage/pgdata and /opt/anaconda/storage/object/anaconda-repository of conda packages
dump configmap (ns=default)
dump secrets of type=Opaque (ns=default)
dump custom secrets in (kube-system ns)
backup ingress - used only if swapping HOSTNAME on a running v5 system
dump DB from pod: anaconda-enterprise-postgres-7867898fbf-d5l9w
/opt/anaconda/m_backup /opt/anaconda
/opt/anaconda
/opt/anaconda/storage /opt/anaconda
/opt/anaconda
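The archive names in the sample output embed a timestamp (for example, ae5_config_db_202005201802.tar.gz). If several backups accumulate in one directory, a small helper can pick the most recent archive of each kind. This is a sketch only: it assumes the file names follow the pattern shown in the sample output, and latest_archive is a hypothetical name.

```shell
# latest_archive DIR PREFIX: print the newest DIR/PREFIX_<timestamp>.tar.gz.
# Assumes the timestamp is zero-padded (as in the sample output), so that
# sorting the names as text also sorts them by time.
# (latest_archive is a hypothetical helper, not part of the backup scripts.)
latest_archive() {
    local dir=$1 prefix=$2 newest="" f
    for f in "$dir"/"$prefix"_*.tar.gz; do
        [ -e "$f" ] || continue   # the glob matched nothing
        newest=$f                 # globs expand in sorted order, so the last one wins
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Usage:
#   latest_archive ./m_backup ae5_config_db
#   latest_archive ./m_backup ae5_data
```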
Note
Conda packages are not included in the backup; backing them up is your responsibility. They are stored in /opt/anaconda/storage/object/anaconda-repository.
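One way to cover the conda packages is to archive that directory separately with tar. The sketch below is an illustration, not part of ae_backup.sh: the function name backup_repo and the ae5_conda_repo_ archive prefix are hypothetical, and the real source directory may only be readable with elevated privileges.

```shell
# backup_repo SRC_DIR DEST_DIR: archive SRC_DIR into a timestamped tarball
# under DEST_DIR and print the path of the archive that was written.
# (backup_repo and the ae5_conda_repo_ prefix are hypothetical names.)
backup_repo() {
    local src=$1 dest=$2 stamp out
    stamp=$(date +%Y%m%d%H%M)
    out="$dest/ae5_conda_repo_$stamp.tar.gz"
    mkdir -p "$dest" || return 1
    # -C makes paths inside the archive relative to the parent of SRC_DIR
    tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    printf '%s\n' "$out"
}

# Usage (as a user that can read the source directory):
#   backup_repo /opt/anaconda/storage/object/anaconda-repository ./m_backup
```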
Restore data from backup¶
If you backed up your Anaconda Enterprise installation, you can follow the steps below to restore the installation along with your config/DB.
Warning
This will delete user sessions from the database (DB).
Stop the DB. Rename the working /opt/anaconda/storage folder so the data remains intact. Create a new /opt/anaconda/storage folder (ensuring permissions / owner are the same as the backup). Scale the DB back up and restart all application pods (anaconda-enterprise-ap):
kubectl scale --replicas=0 deploy anaconda-enterprise-postgres
# wait for postgres pod to terminate
kubectl get po | grep postgres
# sample output while the pod is still terminating:
# anaconda-enterprise-postgres-7867898fbf-d5l9w 1/1 Terminating 10 7d
mv /opt/anaconda/storage /opt/anaconda/storage.working
# create new /opt/anaconda/storage folder (ensuring permissions / owner are the same as the backup)
kubectl scale --replicas=1 deploy anaconda-enterprise-postgres
kubectl get po | grep ap- | awk '{print $1}' | xargs kubectl delete pod
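The "create new folder" step is left as a comment in the commands above. One possible way to do it is to copy the owner, group, and mode from the renamed folder. This sketch assumes GNU coreutils (for the --reference options) and assumes that /opt/anaconda/storage.working carries the ownership and permissions you want; make_dir_like is a hypothetical name.

```shell
# make_dir_like REF NEW: create directory NEW with the same owner, group and
# mode as the existing directory REF.
# Assumes GNU chown/chmod, which support --reference.
# (make_dir_like is a hypothetical helper, not part of the backup scripts.)
make_dir_like() {
    local ref=$1 new=$2
    mkdir "$new" || return 1
    chown --reference="$ref" "$new" || return 1
    chmod --reference="$ref" "$new" || return 1
}

# Usage, from a root shell (changing ownership normally requires root):
#   make_dir_like /opt/anaconda/storage.working /opt/anaconda/storage
```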
Once all the pods are back up, log in using the default anaconda-enterprise user. Since your LDAP configs are all in the /opt/anaconda/storage.working location, LDAP login will not work. The new /opt/anaconda/storage doesn’t contain any of your data/customizations.
Now let’s restore your config/DB as well as the data. ae_restore.sh uses sudo while copying the data back; ideally, passwordless sudo should be set up for the account.
Continuing from the backup example, restore config/DB as well as the data:
$ ae_restore.sh m_backup/ae5_config_db_202005201802.tar.gz m_backup/ae5_data_202005201802.tar.gz
Restore secrets from m_backup/ae5_config_db_202005201802.tar.gz; replace ingress: false
Restore AE5 data from m_backup/ae5_data_202005201802.tar.gz, to /opt/anaconda/storage. Backup current data to /opt/anaconda/storage.202005201913.
Are you sure? y
.. continue
replace secrets in kube-system ns
secret/anaconda-enterprise-certs replaced
secret/cluster-tls replaced
replace secrets in default ns
secret "anaconda-config-files" deleted
secret "anaconda-credentials-anaconda-enterprise-47lgbhtd" deleted
secret "anaconda-credentials-user-creds-anaconda-enterprise-3ggji6dp" deleted
secret "anaconda-enterprise-certs" deleted
secret "anaconda-enterprise-keycloak" deleted
secret/anaconda-config-files created
secret/anaconda-credentials-anaconda-enterprise-47lgbhtd created
secret/anaconda-credentials-user-creds-anaconda-enterprise-3ggji6dp created
secret/anaconda-enterprise-certs created
secret/anaconda-enterprise-keycloak created
extract and replace configmap in default namespace
configmap "anaconda-enterprise-anaconda-platform.yml" deleted
configmap "anaconda-enterprise-env-var-config" deleted
configmap "anaconda-enterprise-install" deleted
configmap "anaconda-enterprise-nginx-config" deleted
configmap "docs-nginx-config" deleted
configmap/anaconda-enterprise-anaconda-platform.yml created
configmap/anaconda-enterprise-env-var-config created
configmap/anaconda-enterprise-install created
configmap/anaconda-enterprise-nginx-config created
configmap/docs-nginx-config created
scale DB to 0 replicas and wait for it to terminate
deployment.extensions/anaconda-enterprise-postgres scaled
waiting for postgres pod to stop
waiting for postgres pod to stop
waiting for postgres pod to stop
waiting for postgres pod to stop
waiting for postgres pod to stop
backup current storage
restore m_backup/ae5_data_202005201802.tar.gz
scale DB to 1 replica and wait for it to come up
deployment.extensions/anaconda-enterprise-postgres scaled
waiting for postgres pod to become ready
waiting for postgres pod to become ready
waiting for postgres pod to become ready
waiting for postgres pod to become ready
waiting for postgres pod to become ready
waiting for postgres pod to become ready
waiting for postgres pod to become ready
waiting for postgres pod to become ready
waiting for postgres pod to become ready
restore full_postgres_backup.sql
Import DB into anaconda-enterprise-postgres-7867898fbf-x8hcn from full_postgres_backup.sql
ERROR: database "anaconda_auth" does not exist
ERROR: database "anaconda_auth_escrow" does not exist
ERROR: database "anaconda_deploy" does not exist
ERROR: database "anaconda_git" does not exist
ERROR: database "anaconda_operation_controller" does not exist
ERROR: database "anaconda_repository" does not exist
ERROR: database "anaconda_storage" does not exist
ERROR: database "anaconda_ui" does not exist
ERROR: database "anaconda_workspace" does not exist
ERROR: current user cannot be dropped
ERROR: role "postgres" already exists
clean up old sessions from the DB
DELETE 0
Unsure what to do w/ deployments so these may need to be restarted
restart all pods in default namespace
pod "anaconda-enterprise-ap-auth-77ccfb68b-jb9jp" deleted
pod "anaconda-enterprise-ap-auth-api-9f4474cc9-7bttr" deleted
pod "anaconda-enterprise-ap-auth-escrow-f97b84696-84k57" deleted
pod "anaconda-enterprise-ap-deploy-59d45dbb5c-7xkhx" deleted
pod "anaconda-enterprise-ap-docs-6c4ccf48b6-f99zr" deleted
pod "anaconda-enterprise-ap-git-storage-cf855ddc9-f8xmf" deleted
pod "anaconda-enterprise-ap-object-storage-cc7bf4b99-d8xm4" deleted
pod "anaconda-enterprise-ap-operation-controller-7ffc6c657d-w4xlg" deleted
pod "anaconda-enterprise-ap-repository-685c685c76-s6xx8" deleted
pod "anaconda-enterprise-ap-storage-7b854f5f57-dp2ht" deleted
pod "anaconda-enterprise-ap-ui-7c8f8cfd49-75mqj" deleted
pod "anaconda-enterprise-ap-workspace-74ff4cdffc-hbtjc" deleted
pod "anaconda-enterprise-app-images-zttpm" deleted
pod "anaconda-enterprise-nginx-ingress-rc-fpght" deleted
pod "anaconda-enterprise-operation-images-k4t4l" deleted
pod "anaconda-enterprise-postgres-7867898fbf-x8hcn" deleted
pod "anaconda-enterprise-redis-755489b7fd-8px5c" deleted
Wait for all pods to be in the Running state before testing. The ap-ui, ap-repository, and ap-auth pods take some time to come back up.
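Rather than re-running kubectl get po by hand, the wait can be scripted. The sketch below polls the default kubectl get pods table until every pod reports a STATUS of Running. It is only an approximation: a pod can be Running before its containers are ready, and any pod that legitimately ends in another status (such as Completed) would keep the loop waiting. The function names are hypothetical.

```shell
# pods_not_running: print the names of pods whose STATUS column is not "Running".
# Parses the default table columns: NAME READY STATUS RESTARTS AGE.
# (pods_not_running and wait_for_pods are hypothetical helpers.)
pods_not_running() {
    kubectl get pods --no-headers | awk '$3 != "Running" { print $1 }'
}

# wait_for_pods [TRIES] [DELAY_SECONDS]: poll until every pod reports Running.
# Returns 0 once all pods are Running, 1 if TRIES polls were not enough.
wait_for_pods() {
    local tries=${1:-60} delay=${2:-5} pending
    while [ "$tries" -gt 0 ]; do
        pending=$(pods_not_running)
        [ -z "$pending" ] && return 0
        echo "waiting for: $pending"
        sleep "$delay"
        tries=$((tries - 1))
    done
    return 1
}

# Usage: wait up to 60 polls, 5 seconds apart
#   wait_for_pods 60 5 && echo "all pods are Running"
```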
Note
If the data file is not provided, then the DB is not restored; only the secrets/configmaps are restored.
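Because ae_restore.sh replaces secrets and configmaps as it goes, it can be worth confirming that the archives you intend to pass are readable before starting a restore. The check below is a minimal sketch using tar; archive_ok is a hypothetical name, and the check only verifies that tar can list the file as a gzip-compressed archive, not that its contents are complete.

```shell
# archive_ok FILE: succeed only if FILE exists and tar can list it as a
# gzip-compressed archive.
# (archive_ok is a hypothetical helper, not part of the restore script.)
archive_ok() {
    [ -f "$1" ] && tar -tzf "$1" >/dev/null 2>&1
}

# Usage:
#   archive_ok m_backup/ae5_config_db_202005201802.tar.gz || echo "config/DB archive is unreadable"
#   archive_ok m_backup/ae5_data_202005201802.tar.gz || echo "data archive is unreadable"
```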
Restore the anaconda-repository of conda packages from the backup created above. The destination path is relative, so run the command from /opt/anaconda:
sudo cp -pr /opt/anaconda/storage.working/object/anaconda-repository/* storage/object/anaconda-repository/
Warning
This version cleans out old sessions from the DB, but it doesn’t clean out deployments. You may need to terminate deployments manually, since there will be no user sessions or deployments in a real backup/restore scenario.
The psql errors like ERROR: database "anaconda_auth" does not exist are an artifact of the testing. They result from moving the existing pgdata directory and then letting it be recreated by scaling the DB pod; the tables do not exist in the way they would when you (re)install and then restore. These errors can be ignored.
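If you save the restore output to a file, a small filter can hide the "does not exist" database errors described above so that any other ERROR lines stand out. This is a sketch: unexpected_errors is a hypothetical name, and the filter assumes the output has one message per line. It deliberately leaves other errors visible (for example, the role-related ones in the sample output), since this topic only describes the database errors as ignorable.

```shell
# unexpected_errors: read restore output on stdin and print every ERROR line
# except those of the form: ERROR: database "<name>" does not exist
# (unexpected_errors is a hypothetical helper, not part of ae_restore.sh.)
unexpected_errors() {
    grep 'ERROR:' | grep -v 'ERROR: database ".*" does not exist' || true
}

# Usage, where restore.log is a saved copy of the ae_restore.sh output:
#   unexpected_errors < restore.log
```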