Backing up and restoring AE

You may choose to back up AE regularly, based on your organization’s disaster recovery policies.

Warning

You can start the backup process while users are working; however, the backup will not capture any open sessions or deployments. We therefore recommend asking all users to save their work, stop their sessions and deployments, and log out of the platform during the backup window if they want that content backed up.


This topic describes how to back up AE and how to restore data from a backup.


Backing up

The backup is performed by two bash scripts. The scripts require jq, so they are packaged for installation with conda, which pulls in that dependency for you. We recommend installing Miniconda under /opt/anaconda so that all AE admins can access it; this is helpful in case admins change.

Next, follow the steps in AE-master has internet access if the master node can reach the internet, or the steps in AE-master is airgapped if it cannot.

AE-master has internet access

  1. Install the package ae5_backup_restore from anaconda.org as shown below:

    conda create -n ae5-backup -c ae5-admin ae5_backup_restore
    conda activate ae5-backup
    
  2. Proceed to Verify the installation below.

AE-master is airgapped

If your installation is airgapped, follow these steps to download and run the installer:

  1. Download the ae5_backup_restore installer.

  2. If you already have Miniconda installed, install into a conda environment as follows. This installs the backup/restore scripts and the jq package into that environment:

    # These commands must be run from the master node
    chmod +x ./ae5_backup_restore-x.x.x-Linux-x86_64.sh
    ./ae5_backup_restore-x.x.x-Linux-x86_64.sh -p $CONDA_INSTALL_PREFIX/envs/ae5_backup_restore
    
  3. Proceed to Verify the installation below.

Verify the installation

Get usage help by running the following:

ae_backup.sh -h

Next, run the backup script.

Run the backup script

The following command creates the m_backup directory if it does not already exist. Ensure you have sudo access before running the script, as it needs sudo to copy files from /opt/anaconda/storage:

ae_backup.sh ./m_backup

The output will look like the following:

(backup-test) [email protected]:/opt/anaconda$ ae_backup.sh ./m_backup
Backup AE5 to /opt/anaconda/m_backup

Backup secrets/cm/DB dump to /opt/anaconda/m_backup/ae5_config_db_202005201802.tar.gz
Backup data from /opt/anaconda/storage to /opt/anaconda/m_backup/ae5_data_202005201802.tar.gz;
excludes /opt/anaconda/storage/pgdata and /opt/anaconda/storage/object/anaconda-repository of conda packages

dump configmap (ns=default)
dump secrets of type=Opaque (ns=default)
dump custom secrets in (kube-system ns)
backup ingress - used only if swapping HOSTNAME on a running v5 system
dump DB from pod: anaconda-enterprise-postgres-7867898fbf-d5l9w
/opt/anaconda/m_backup /opt/anaconda
/opt/anaconda
/opt/anaconda/storage /opt/anaconda
/opt/anaconda
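As the output above shows, each archive name embeds a timestamp (YYYYMMDDHHMM), so a restore normally targets the newest pair. A minimal sketch of selecting that pair, assuming the m_backup directory from the example; the selection logic is illustrative, not part of the scripts:

```shell
# Illustrative only: pick the newest config/data archive pair by name.
# The YYYYMMDDHHMM timestamps sort lexicographically, so `sort | tail` works.
BACKUP_DIR=./m_backup
mkdir -p "$BACKUP_DIR"   # normally created by ae_backup.sh
CFG=$(ls -1 "$BACKUP_DIR"/ae5_config_db_*.tar.gz 2>/dev/null | sort | tail -n 1)
DATA=$(ls -1 "$BACKUP_DIR"/ae5_data_*.tar.gz 2>/dev/null | sort | tail -n 1)
echo "config archive: ${CFG:-none found}"
echo "data archive:   ${DATA:-none found}"
```

The two selected paths are exactly the arguments ae_restore.sh expects, in that order.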

Note

Conda packages are not included in the backup. They are stored in /opt/anaconda/storage/object/anaconda-repository; it is your responsibility to back them up separately.
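One way to archive the excluded conda packages alongside the other backup artifacts is a separate tar step. A hedged sketch, where the helper name and timestamped file name are our own convention and the default source path is the one given in the note:

```shell
# Illustrative helper: archive the conda package store separately.
# Defaults follow the paths used elsewhere in this topic.
backup_repository() {
    local src=${1:-/opt/anaconda/storage/object/anaconda-repository}
    local dest_dir=${2:-./m_backup}
    local stamp
    stamp=$(date +%Y%m%d%H%M)   # same timestamp format as the ae_backup.sh output names
    sudo tar -czf "$dest_dir/ae5_repository_${stamp}.tar.gz" \
        -C "$(dirname "$src")" "$(basename "$src")"
}
```

Running backup_repository with no arguments would place the archive next to the config/DB and data archives in m_backup.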


Restore data from backup

If you backed up your Anaconda Enterprise installation, you can follow the steps below to restore the installation along with your config/DB.

Warning

This will delete user sessions from the database (DB).

  1. Stop the DB. Rename the working /opt/anaconda/storage folder so the data remains intact, then create a new /opt/anaconda/storage folder (ensuring its permissions and owner match the backup). Scale the DB back up and restart all application pods (anaconda-enterprise-ap):

    kubectl scale --replicas=0 deploy anaconda-enterprise-postgres
    # wait for postgres pod to terminate - kubectl get po |grep postgres
    anaconda-enterprise-postgres-7867898fbf-d5l9w                  1/1     Terminating   10         7d
    
    mv /opt/anaconda/storage /opt/anaconda/storage.working
    # create new /opt/anaconda/storage folder (ensuring permissions / owner are the same as the backup)
    kubectl scale --replicas=1 deploy anaconda-enterprise-postgres
    
    kubectl get po | grep ap- | awk '{print $1}'| xargs kubectl delete pod
    
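Matching the permissions and owner when recreating the storage folder can be done by reading them from the renamed folder. A minimal sketch, assuming GNU stat on the master node; the helper is illustrative, not part of the scripts:

```shell
# Illustrative helper: create a directory with the same mode and
# owner:group as a reference directory (e.g. /opt/anaconda/storage.working).
clone_dir_perms() {
    local ref=$1 new=$2
    mkdir -p "$new"
    chmod "$(stat -c '%a' "$ref")" "$new"
    sudo chown "$(stat -c '%U:%G' "$ref")" "$new"
}
```

Here, clone_dir_perms /opt/anaconda/storage.working /opt/anaconda/storage would recreate the storage folder before scaling the DB back up.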
  2. Once all the pods are back up, log in as the default anaconda-enterprise user. Because your LDAP configuration now lives in /opt/anaconda/storage.working, LDAP login will not work; the newly created /opt/anaconda/storage does not contain any of your data or customizations.

  3. Next, restore your config/DB as well as the data. ae_restore.sh uses sudo while copying the data back; ideally, passwordless sudo should be set up for the account.

  4. Continuing from the backup example, restore config/DB as well as the data:

    $ ae_restore.sh m_backup/ae5_config_db_202005201802.tar.gz m_backup/ae5_data_202005201802.tar.gz
    
    Restore secrets from m_backup/ae5_config_db_202005201802.tar.gz; replace ingress: false
    Restore AE5 data from m_backup/ae5_data_202005201802.tar.gz, to /opt/anaconda/storage. Backup current data to /opt/anaconda/storage.202005201913.
    
    Are you sure? y
    .. continue
    replace secrets in kube-system ns
    secret/anaconda-enterprise-certs replaced
    secret/cluster-tls replaced
    replace secrets in default ns
    secret "anaconda-config-files" deleted
    secret "anaconda-credentials-anaconda-enterprise-47lgbhtd" deleted
    secret "anaconda-credentials-user-creds-anaconda-enterprise-3ggji6dp" deleted
    secret "anaconda-enterprise-certs" deleted
    secret "anaconda-enterprise-keycloak" deleted
    secret/anaconda-config-files created
    secret/anaconda-credentials-anaconda-enterprise-47lgbhtd created
    secret/anaconda-credentials-user-creds-anaconda-enterprise-3ggji6dp created
    secret/anaconda-enterprise-certs created
    secret/anaconda-enterprise-keycloak created
    extract and replace configmap in default namespace
    configmap "anaconda-enterprise-anaconda-platform.yml" deleted
    configmap "anaconda-enterprise-env-var-config" deleted
    configmap "anaconda-enterprise-install" deleted
    configmap "anaconda-enterprise-nginx-config" deleted
    configmap "docs-nginx-config" deleted
    configmap/anaconda-enterprise-anaconda-platform.yml created
    configmap/anaconda-enterprise-env-var-config created
    configmap/anaconda-enterprise-install created
    configmap/anaconda-enterprise-nginx-config created
    configmap/docs-nginx-config created
    scale DB to 0 replicas and wait for it to terminate
    deployment.extensions/anaconda-enterprise-postgres scaled
    waiting for postgres pod to stop
    waiting for postgres pod to stop
    waiting for postgres pod to stop
    waiting for postgres pod to stop
    waiting for postgres pod to stop
    backup current storage
    restore m_backup/ae5_data_202005201802.tar.gz
    scale DB to 1 replica and wait for it to come up
    deployment.extensions/anaconda-enterprise-postgres scaled
    waiting for postgres pod to become ready
    waiting for postgres pod to become ready
    waiting for postgres pod to become ready
    waiting for postgres pod to become ready
    waiting for postgres pod to become ready
    waiting for postgres pod to become ready
    waiting for postgres pod to become ready
    waiting for postgres pod to become ready
    waiting for postgres pod to become ready
    restore full_postgres_backup.sql
    Import DB into anaconda-enterprise-postgres-7867898fbf-x8hcn from full_postgres_backup.sql
    ERROR:  database "anaconda_auth" does not exist
    ERROR:  database "anaconda_auth_escrow" does not exist
    ERROR:  database "anaconda_deploy" does not exist
    ERROR:  database "anaconda_git" does not exist
    ERROR:  database "anaconda_operation_controller" does not exist
    ERROR:  database "anaconda_repository" does not exist
    ERROR:  database "anaconda_storage" does not exist
    ERROR:  database "anaconda_ui" does not exist
    ERROR:  database "anaconda_workspace" does not exist
    ERROR:  current user cannot be dropped
    ERROR:  role "postgres" already exists
    clean up old sessions from the DB
    DELETE 0
    Unsure what to do w/ deployments so these may need to be restarted
    restart all pods in default namespace
    pod "anaconda-enterprise-ap-auth-77ccfb68b-jb9jp" deleted
    pod "anaconda-enterprise-ap-auth-api-9f4474cc9-7bttr" deleted
    pod "anaconda-enterprise-ap-auth-escrow-f97b84696-84k57" deleted
    pod "anaconda-enterprise-ap-deploy-59d45dbb5c-7xkhx" deleted
    pod "anaconda-enterprise-ap-docs-6c4ccf48b6-f99zr" deleted
    pod "anaconda-enterprise-ap-git-storage-cf855ddc9-f8xmf" deleted
    pod "anaconda-enterprise-ap-object-storage-cc7bf4b99-d8xm4" deleted
    pod "anaconda-enterprise-ap-operation-controller-7ffc6c657d-w4xlg" deleted
    pod "anaconda-enterprise-ap-repository-685c685c76-s6xx8" deleted
    pod "anaconda-enterprise-ap-storage-7b854f5f57-dp2ht" deleted
    pod "anaconda-enterprise-ap-ui-7c8f8cfd49-75mqj" deleted
    pod "anaconda-enterprise-ap-workspace-74ff4cdffc-hbtjc" deleted
    pod "anaconda-enterprise-app-images-zttpm" deleted
    pod "anaconda-enterprise-nginx-ingress-rc-fpght" deleted
    pod "anaconda-enterprise-operation-images-k4t4l" deleted
    pod "anaconda-enterprise-postgres-7867898fbf-x8hcn" deleted
    pod "anaconda-enterprise-redis-755489b7fd-8px5c" deleted
    
  5. Wait for all pods to be in the Running state before testing. The ap-ui, ap-repository, and ap-auth pods take some time to come back up.
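The waiting in step 5 can be scripted rather than watched by hand. A small sketch, assuming the default namespace; the kubectl invocation is standard, but the wrapper itself is illustrative:

```shell
# Illustrative helper: block until every pod reports Running.
# Column 3 of `kubectl get pods --no-headers` is the pod status.
wait_for_pods() {
    while kubectl get pods --no-headers 2>/dev/null \
            | awk '{print $3}' | grep -qv '^Running$'; do
        echo "waiting for pods..."
        sleep 10
    done
}
```

Run wait_for_pods after the restore completes, then test logins.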

Note

If the data file is not provided, then the DB is not restored; only the secrets/configmaps are restored.

  6. Restore the anaconda-repository of conda packages from the backup created above:

    sudo cp -pr /opt/anaconda/storage.working/object/anaconda-repository/* storage/object/anaconda-repository/
    
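To sanity-check the copy, the two repository trees can be compared by their relative file listings. A minimal sketch; the helper is illustrative, and the paths are those from the step above:

```shell
# Illustrative helper: diff two directory trees by relative file listing.
# Prints nothing and returns 0 when the listings match.
same_tree() {
    diff <(cd "$1" && find . | sort) <(cd "$2" && find . | sort)
}
```

For example, same_tree /opt/anaconda/storage.working/object/anaconda-repository /opt/anaconda/storage/object/anaconda-repository should print nothing once the copy is complete.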

Warning

This version cleans out old sessions from the DB, but it does not clean out deployments; you may need to terminate deployments manually. In a real backup/restore scenario there will be no user sessions or deployments to begin with, since the backup does not capture them.

The psql errors like ERROR:  database "anaconda_auth" does not exist are an artifact of the testing. They result from moving the existing pgdata directory and letting it be recreated by scaling the DB pod; the databases do not exist in the way they would when you (re)install and then restore. These errors can be ignored.