Upgrading between versions of AE5

Due to the potential complexity of your custom configuration, please contact Anaconda Support before initiating the upgrade.

After you have determined the topology for your Anaconda Enterprise cluster, and verified that your system meets all of the installation requirements, you’re ready to upgrade the cluster.

Before you begin:

  • Configure your A record in DNS for the master node with the actual domain name you will use for your Anaconda Enterprise installation.

  • If you are using a firewall for network security, we recommend you temporarily disable it while you upgrade Anaconda Enterprise.

  • When installing Anaconda Enterprise on a system with multiple nodes, verify that the clock of each node is in sync with the others prior to starting the installation process, to avoid potential issues. We recommend using the Network Time Protocol (NTP) to synchronize computer system clocks automatically over a network. See instructions here.

  • Back up the anaconda-enterprise-anaconda-platform.yml file used to configure the platform, as config map settings such as external Git configuration are not automatically migrated to the new cluster as part of the upgrade process.

  • Back up your custom cas-mirror and anaconda-enterprise-cli configurations (see Step 4 below), as $HOME/cas-mirror will be overwritten during the upgrade process. To avoid any compatibility issues, we recommend you upgrade your mirror tools as part of the upgrade process. Afterwards, simply copy over the configuration files you backed up to restore your custom configuration. A combined pre-flight sketch covering the clock check and these backups follows this list.
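
The following is a minimal pre-flight sketch covering the clock check and the two backups described in the list above. The backup directory, the config map name (assumed to match the .yml file name), and the exact configuration paths are assumptions; adjust them for your environment, and run the kubectl step from inside the gravity environment if kubectl is not available on the host.

# Hypothetical pre-upgrade checks and backups; names and paths are assumptions.
BACKUP_DIR=$HOME/ae5-upgrade-backup
mkdir -p "$BACKUP_DIR"

# Confirm the system clock is synchronized via NTP/chrony on each node.
timedatectl status | grep -i synchronized

# Export the platform config map (config map name assumed from the file name).
kubectl get cm anaconda-enterprise-anaconda-platform -o yaml \
  > "$BACKUP_DIR/anaconda-enterprise-anaconda-platform.yml"

# Preserve the cas-mirror and anaconda-enterprise-cli configuration files.
cp -r "$HOME/cas-mirror/etc/anaconda-platform/mirrors" "$BACKUP_DIR/mirrors"
cp "$HOME/.anaconda/anaconda-platform/cli.yml" "$BACKUP_DIR/" 2>/dev/null || true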

Warning

After the upgrade or backup process has begun, it won’t be possible to capture or back up data for any open sessions or deployments. We therefore recommend that you ask all users to save their work, stop any sessions and deployments, and log out of the platform during the upgrade window. The backup.sh script that runs as part of the upgrade process restarts all pods, so users who have not saved their work and stopped their sessions will lose any unsaved work. They may also encounter a 404 error after the upgrade. The workaround for the error message is to stop and restart the session or deployment that generated the error, but there is no way to retrieve lost data.

The upgrade process varies slightly, depending on your current version and which version you’re installing. To update an existing Anaconda Enterprise installation to a newer version, follow the process that corresponds to your particular scenario:


Upgrading from AE 5.2.x or 5.3.0 to 5.3.1

Anaconda Enterprise 5.2.x and 5.3.0 support in-place upgrades, so you can follow these steps to update your 5.2.x or 5.3.0 installation to the latest version.

  1. Ensure that all AE users have closed any open sessions, stopped any deployed applications, and logged out of the platform. The backup.sh script that runs as part of the upgrade process restarts all pods, so any users who have not done so will lose unsaved work.

  2. On the master node running your current installation of AE, download and decompress the new installer, replacing <location_of_installer> with the location of the installer, and <version> with your installer version:

    curl -O <location_of_installer>.tar.gz
    tar xvzf anaconda-enterprise-<version>.tar.gz
    cd anaconda-enterprise-<version>
    
  3. Run the following command to upload the installer to the AE environment:

    sudo ./upload
    
  4. When the upload process finishes, run the following command to start the upgrade process:

    sudo ./gravity upgrade
    
  5. The upgrade process may take up to an hour to complete. You can check the status of the upgrade process by running sudo ./gravity status.

If you encounter errors while upgrading, you can check the status of the operation by running sudo ./gravity plan. You can then roll back any step in the upgrade process by running the rollback command against the name of the phase, as it’s listed in the Phase column:

sudo ./gravity rollback --phase=/<name-of-phase>

After addressing the error(s), you can resume the upgrade by running the following command:

sudo ./gravity upgrade --resume --force

After the upgrade process completes, follow the steps to verify that your upgrade was successful.

After you’ve confirmed that your upgrade was successful and everything works as expected, you can run a script to remove images left over from the previous installation and free up space. This helps prevent the cluster from running out of disk space on the master node.
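
As a quick, hedged post-upgrade check, the commands below confirm cluster and pod health and then reclaim disk space with gravity garbage collection (the same sudo gravity gc command described later in this document, which may be the cleanup script referred to above):

# Confirm the cluster is healthy after the upgrade.
sudo ./gravity status

# Check that all platform pods are Running (run inside the gravity environment).
sudo gravity enter
kubectl get pods --all-namespaces
exit

# Remove unused packages and images left over from previous versions.
sudo gravity gc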

Upgrading from AE 5.1.2 or 5.1.3 to 5.2.x

In-place upgrades from 5.1.x to 5.2.x are not supported. To update an existing Anaconda Enterprise installation to 5.2.x, you’ll need to follow this process.

The specific steps for each stage in the process are outlined in the sections below:

  1. Back up your Anaconda Enterprise configuration and data files.

  2. Uninstall *all* nodes—master and workers.

  3. Install Anaconda Enterprise 5.2.x.

  4. Restore your Anaconda Enterprise configuration and data files from the backup.

  5. Verify that all AE pods are healthy.

  6. Update the Anaconda Enterprise server settings URLs.

  7. Upgrade cas-mirror.

  8. Verify that your installation was successful.


Stage 1 – Back up Anaconda Enterprise

Before you begin any upgrade, you must back up your Anaconda Enterprise configuration and data files. The number of channels and packages being backed up will impact the amount of free space and time required to perform the backup, so ensure you have sufficient free space and time available to complete the process. If the space available is insufficient, you’ll encounter disk pressure issues.

Note

After installing AE 5.2.x, you’ll need to re-configure your SSL certificates, so ensure all certificate-related information—including the private key—is accessible at that point in the process. You’ll also need to re-create all Operations Center Admin users, as they aren’t preserved as part of the backup process.

All of the following commands should be run on the master node.

  1. Copy the backup.sh script from the location where you saved the installer tarball to the Anaconda Enterprise environment using the following command:

    sudo cp backup.sh /opt/anaconda
    
  2. Back up Anaconda Enterprise by running the following commands:

    sudo gravity enter
    cd /opt/anaconda
    bash backup.sh
    

The following backup files are created and saved to /opt/anaconda:

ae5-data-backup-${timestamp}.tar
ae5-state-backup-${timestamp}.tar.gz

  3. Move the backup files to a remote location to preserve them, as the /opt/anaconda directory will be deleted in future steps (see the sketch after this list). After uninstalling AE, you’ll copy ae5-data-backup-${timestamp}.tar back to your local filesystem.

  4. Back up your custom cas-mirror and anaconda-enterprise-cli configurations.

    The default location for cas-mirror configuration is $HOME/cas-mirror/etc/anaconda-platform/mirrors, and may include multiple files such as anaconda.yaml and r.yaml.

    The default location for anaconda-enterprise-cli configuration is $HOME/cas-mirror/lib/python3.5/site-packages/anaconda_enterprise/cli/schemas/v1/defaults/cli.yml or $HOME/.anaconda/anaconda-platform/cli.yml.

    Copy the configuration files to a remote location to preserve them, as $HOME/cas-mirror will be overwritten in future steps. After that, you will copy back the configuration files.

  5. Exit the Anaconda Enterprise environment by typing exit.
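
A hypothetical example of moving the backup artifacts off the master node before uninstalling, run from the host shell after exiting the gravity environment; user@backup-host and the destination paths are placeholders:

# Copy the data and state backups to a remote host for safekeeping.
scp /opt/anaconda/ae5-data-backup-*.tar \
    /opt/anaconda/ae5-state-backup-*.tar.gz \
    user@backup-host:/backups/ae5/

# Copy the cas-mirror and anaconda-enterprise-cli configuration files as well.
scp -r "$HOME/cas-mirror/etc/anaconda-platform/mirrors" \
       "$HOME/.anaconda/anaconda-platform/cli.yml" \
       user@backup-host:/backups/ae5/config/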

If your existing configuration includes Spark/Hadoop, perform these additional steps to migrate configuration information specific to your cluster:

  1. Run the following command to retrieve configuration information from the 5.1.x server and generate the anaconda-config-files-secret.yaml file:

    kubectl get secret anaconda-config-files -o yaml > <path-to-anaconda-config-files-secret.yaml>
    
  2. Move this file to a remote location to preserve it, as it will be deleted in future steps. Ensure that you can access this file from the server where you’re installing the newer version of AE 5.2.x.

  3. Open the anaconda-config-files-secret.yaml file, locate the metadata section, and delete everything under it except for the following: name: anaconda-config-files.

For example, if it looks like this to begin with:

apiVersion: v1
data:
  xxxx
kind: Secret
metadata:
  creationTimestamp: 2018-07-31T19:30:54Z
  name: anaconda-config-files
  namespace: default
  resourceVersion: "981426"
  selfLink: /api/v1/namespaces/default/secrets/anaconda-config-files
  uid: 3de10e2b-94f8-11e8-94b8-1223fab00076
type: Opaque

It will look like this afterwards:

apiVersion: v1
data:
  xxxx
kind: Secret
metadata:
  name: anaconda-config-files
type: Opaque

Stage 2 – Uninstall Anaconda Enterprise

  1. Uninstall all worker nodes by running the following commands from a shell on each worker node:

    sudo gravity leave --force
    sudo killall gravity
    sudo killall planet
    
  2. Now you can uninstall the master node by running the following commands:

    sudo gravity system uninstall
    sudo killall gravity
    sudo killall planet
    sudo rm -rf /var/lib/gravity /opt/anaconda
    
  3. Reboot all nodes to ensure that any Anaconda Enterprise state is flushed from your system.


Stage 3 – Install Anaconda Enterprise

Warning

You must use the same FQDN used in your 5.1.x installation for your 5.2.x installation.

  1. Download the latest installer file using the following command:

    curl -O <link-to-installer>
    
  2. Follow the installation instructions—including the post-install configuration steps—to install Anaconda Enterprise.

If you encounter any errors, refer to Installation requirements for the instructions to update your environment so that it meets the AE installation requirements, and restart the install.

Note

After ensuring your environment meets all requirements, if you still see a Cannot continue error, restart the install.

  3. Before restoring your data, log in to the platform to verify that the installation completed successfully, then log out again.


Stage 4 – Restore data

Copy the restore.sh script from the location where you saved the installer tarball to the Anaconda Enterprise environment using the following command:

sudo cp restore.sh /opt/anaconda

When upgrading from 5.1.x to 5.2.x, we recommend restoring backup data only, as new state information will be generated during the installation of 5.2.x.

In the terminal, run the following commands:

sudo gravity enter
cd /opt/anaconda/
bash restore.sh <path-to-data-backup-file>

Note

Replace <path-to-data-backup-file> with the path to the data backup file generated when you ran the Anaconda Enterprise backup script in step 2 of Stage 1 above.

For help, run the bash restore.sh -h command.
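
For example, if the data backup from Stage 1 were named with a hypothetical timestamp of 20190101120000, the restore would look like this:

sudo gravity enter
cd /opt/anaconda/
# The timestamp below is hypothetical; substitute your actual backup file name.
bash restore.sh /opt/anaconda/ae5-data-backup-20190101120000.tar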


Stage 5 – Verify pod status

Restoring AE data deletes and restarts the associated pods. Follow this process to ensure the new pods are healthy:

  1. Log in to the Operations Center using the Administrator credentials configured after installation.

  2. Select Kubernetes in the left menu and click on the Pods tab.

  3. Verify that the Status of all pods says Running.

You can also use the following command to watch for status updates:

watch kubectl get pods --all-namespaces

Note

It may take up to 15 minutes for their status to change from Pending to Running.
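
As an optional check (an assumption on our part, not part of the original procedure), you can list only the pods that are not yet Running; completed pods may also appear and can be ignored:

# Show any pods that are not in the Running phase across all namespaces.
kubectl get pods --all-namespaces --field-selector=status.phase!=Running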


Stage 6 – Edit redirect URLs

  1. When all pods are running, access the Anaconda Enterprise Authentication Center by visiting this URL in your browser: https://example.anaconda.com/auth/—replacing example.anaconda.com with the FQDN of your server—and clicking the Administration Console link.

  2. Log in with the same username and password you used for the 5.1.x Authentication Center.


[Image: ops-center-realm.png]

  3. Verify that AnacondaPlatform is displayed as the current realm, then select Clients from the Configure menu on the left.


[Image: ops-center-clients1.png]

  4. In the Client list, click anaconda-platform to display the platform settings.

  5. On the Settings tab, update all URLs in the following fields with the FQDN of the Anaconda Enterprise server, or the following symbols:


[Image: platform-redirect-urls1.png]

Note

If you provide the FQDN of your AE server, be sure each field still ends with the symbols shown. For example, the Valid Redirect URIs would look something like this: https://server-name.domain.com/*


  6. Click Save to update the server with your changes.


Stage 7 – Upgrade cas-mirror

  1. This step will overwrite the contents of $HOME/cas-mirror. Be sure that you followed the earlier instructions to back up the configuration files.

  2. Go to the Anaconda Enterprise installer decompressed tarball folder and run the script to upgrade both cas-mirror and anaconda-enterprise-cli:

    cd anaconda-enterprise-<version>
    ./cas_mirror-<version>-linux-64.sh -u
    

    Replace <version> with the version number.

  3. Restore the custom cas-mirror and anaconda-enterprise-cli configuration files that you backed up earlier.

    The default location for cas-mirror configuration is $HOME/cas-mirror/etc/anaconda-platform/mirrors, and may include multiple files such as anaconda.yaml and r.yaml.

    The default location for anaconda-enterprise-cli configuration is $HOME/cas-mirror/lib/python3.5/site-packages/anaconda_enterprise/cli/schemas/v1/defaults/cli.yml or $HOME/.anaconda/anaconda-platform/cli.yml.
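
A minimal sketch of restoring those files, assuming they were backed up to a hypothetical ~/ae5-upgrade-backup directory; adjust the source paths to wherever you stored them:

# Restore the mirror definitions (for example anaconda.yaml and r.yaml).
cp ~/ae5-upgrade-backup/mirrors/*.yaml \
   "$HOME/cas-mirror/etc/anaconda-platform/mirrors/"

# Restore the anaconda-enterprise-cli configuration.
cp ~/ae5-upgrade-backup/cli.yml "$HOME/.anaconda/anaconda-platform/cli.yml"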


Stage 8 – Verify installation

After you’ve verified that all pods are running and updated the Anaconda Enterprise URLs, you can confirm that your upgrade was successful by doing the following:

  1. Return to the Authentication Center and select Users in the Manage menu on the left.

  2. Click View all users and verify that all user data has also been restored.

  3. Access the Anaconda Enterprise user console by visiting this URL in your browser: https://example.anaconda.com/ (replacing example.anaconda.com with the FQDN of your server) and logging in using the same credentials you used in your previous installation.

  4. Review the Projects list to verify that all project data has been restored.

Note

If you didn’t configure SSL certificates as part of the post-install configuration, do so now. See Updating TLS/SSL certificates for more information.


If you’re upgrading a cluster with external Git configured:

Note

The git section of the anaconda-enterprise-anaconda-platform.yml file used to configure Anaconda Enterprise 5.3.1 includes parameter changes. If you backed up your Anaconda Enterprise config map before upgrading, and copied it onto the newly-updated master node, you’ll need to update your config map with the new information as described here.
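
One way to apply that update, sketched here under the assumption that the config map name matches the file name above, is to edit it in place and then restart the platform pods so they pick up the new git parameters:

sudo gravity enter
# Config map name is assumed; verify it with 'kubectl get cm' first.
kubectl edit cm anaconda-enterprise-anaconda-platform
# Restart the service pods so they read the updated configuration.
kubectl get pods | grep ap- | cut -d' ' -f1 | xargs kubectl delete pods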


If you’re upgrading a Spark/Hadoop configuration:

After you successfully restore your Anaconda Enterprise data, run the following command on the master node of the newly installed Anaconda Enterprise server:

kubectl replace -f <path-to-anaconda-config-files-secret.yaml>

To verify that your configuration upgraded correctly:

  1. Log in to Anaconda Enterprise.

  2. If your configuration uses Kerberos authentication, open a Hadoop terminal and authenticate yourself through Kerberos using the same credentials you used previously. For example, kinit <username>.

  3. Open a Jupyter Notebook that uses Sparkmagic, and verify that it behaves as expected. For example, run the sc command to connect to Sparkmagic and start Spark.


After you’ve confirmed that your upgrade was successful, we recommend you run the following command to remove all unused packages and images from previous versions of the application, and repopulate the registry to include only those images required by the current version of the application:

sudo gravity gc

The command’s progress is displayed in the terminal, so you can watch as it marks packages associated with the latest version as required, and deletes older versions.

If running the command generates an error, you can resume the command (after you fix the issue that caused the error) by running the following command:

sudo gravity gc --resume

Upgrading from AE 5.1.0 to 5.1.2

This in-place upgrade process is recommended, as it requires almost no downtime.

  1. Download the 5.1.2 installer file.

  2. Add OS settings required for 5.1.2:

    sudo sysctl -w fs.may_detach_mounts=1
    sudo sysctl -w net.bridge.bridge-nf-call-iptables=1
    sudo sysctl -w net.ipv4.ip_forward=1
    

    Then add the following settings to /etc/sysctl.conf so they persist across reboots (see the sketch after this procedure):

    net.ipv4.ip_forward = 1
    net.bridge.bridge-nf-call-iptables = 1
    fs.may_detach_mounts = 1
    
  3. Run sudo ./upgrade

  4. Update the version of the app images in the configmap.

    First, edit the configmap:

    sudo gravity enter
    kubectl edit cm
    

    Next, in the configmap, update the app images to the new version of the images in the installer:

    data:
      anaconda-platform.yml: |
        images:
          app: apiserver:5000/ap-app:5.1.2-0
          app_proxy: apiserver:5000/ap-app-proxy:5.1.2-0
          editor: apiserver:5000/ap-editor:5.1.2-0
    
  5. Restart all Anaconda Enterprise pods:

    kubectl get pods | grep ap- | cut -d' ' -f1 | xargs kubectl delete pods
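
For step 2 above, a minimal sketch of persisting those settings and applying them without a reboot, assuming you can append to /etc/sysctl.conf directly:

# Append the required kernel settings so they survive reboots, then reload.
cat <<'EOF' | sudo tee -a /etc/sysctl.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
fs.may_detach_mounts = 1
EOF
sudo sysctl -p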
    

Upgrading from AE 5.0.x to 5.1.x

Upgrading your Anaconda Enterprise installation from version 5.0.x to 5.1.x requires the following:

  1. Back up all AE data.

  2. Uninstall the current version of AE.

  3. Install the newer version of AE.

Stage 1 – Back up Anaconda Enterprise

Note

All of the following commands should be run on the master node.

  1. Back up the Anaconda Enterprise configuration:

    sudo gravity backup anaconda-enterprise-backup.tar.gz
    
  2. Ensure all users have saved their work and logged out. To prevent any database transactions, stop the AE database with:

    sudo gravity enter
    kubectl delete deploy postgres
    
  3. Exit the Anaconda Enterprise environment by typing exit.

  4. All of the persistent data in AE is stored on the master node in /opt/anaconda/storage. You can back up your data by running the following command:

    sudo tar -zcvf anaconda-data.tar.gz /opt/anaconda/
    
  5. Restart the AE database:

    sudo gravity enter
    
    kubectl apply -f /var/lib/gravity/local/packages/unpacked/gravitational.io/AnacondaEnterprise/*/resources/postgres.yaml
    
    # Restart service pods
    kubectl get pods | grep ap- | cut -d' ' -f1 | xargs kubectl delete pods
    
  6. Exit the Anaconda Enterprise environment by typing exit.

Stage 2 – Uninstall Anaconda Enterprise

  1. To uninstall Anaconda Enterprise on a healthy master node, run:

    sudo gravity system uninstall
    sudo killall gravity
    sudo killall planet
    sudo rm -rf /var/lib/gravity
    

    If /var/lib/gravity is present after the uninstallation, you should reboot your machine and retry the sudo gravity system uninstall command.

  2. Reboot, to ensure that any Anaconda Enterprise state is flushed from your system.

Stage 3 – Install Anaconda Enterprise

  1. Download the installer file for the newer AE version.

  2. Follow the installation instructions to install Anaconda Enterprise, which will use the existing data in /opt/anaconda.

  3. Update the Anaconda Enterprise configuration to match the latest configuration schema. Note that we do not currently version the schema of the anaconda-platform.yml, so there may be incompatible changes between versions.

    Check the logs for each service for errors about new or missing fields. If you see any errors, manually update the configuration to match the new schema.

    Significant known schema changes, with the version they were added in, are detailed below:

    5.1.x

    The field format for specifying passive license information has changed. The field license.client-id is now license.number, and the field license.client-certificate is now license.key.

  4. Ensure that your SSL certificate filenames are correct.

    In Anaconda Enterprise 5.1.0 and newer, the default SSL certificate filenames provided by the installer are different than in previous versions. It is recommended that you update any Kubernetes secrets you created and update the Anaconda Enterprise configuration to match the new filenames.

    Previous        Updated
    ------------    -------------
    rootca.pem      rootca.crt
    cert.pem        server.crt
    privkey.pem     server.key
    tls.crt         wildcard.crt
    tls.key         wildcard.key

Note

The keystore.jks filename is unchanged.

  5. Add roles and associate them with the appropriate users (if upgrading from 5.0.x):

    ae-admin
    ae-creator
    ae-deployer
    ae-uploader
    
  6. Restart all Anaconda Enterprise pods:

    kubectl get pods | grep ap- | cut -d' ' -f1 | xargs kubectl delete pods
    

Troubleshooting an upgrade

In-place upgrades from a version other than 5.1.0 to 5.1.2

If an attempt was made to perform an in-place upgrade from a version other than 5.1.0 to 5.1.2, the service pods will be in the ImagePullBackOff state.

To recover, execute the following command with the correct original version:

kubectl get deployments -n default -o yaml | sed "s/:<original-version>/:5.1.2-0/g" | kubectl replace -f - && kubectl get pods -n default | grep ap- | cut -d' ' -f1 | xargs kubectl delete pods -n default