Installation requirements

When you initially install Anaconda Enterprise, you can install the cluster on one to five nodes. You are not bound to that initial configuration, however. After completing the installation, you can add or remove nodes on the cluster as needed. For more information, see Adding and removing nodes.

A rule of thumb for sizing your system is 1 CPU core, 1GB of RAM, and 5GB of disk space for each project session or deployment. For example, a cluster expected to run 20 concurrent sessions or deployments would need roughly 20 cores, 20GB of RAM, and 100GB of disk space in addition to the base platform requirements below. For more information about sizing for a particular component, see the following minimum requirements:

To use Anaconda Enterprise with a cloud platform, refer to Cloud installation requirements.

To use Spark and Hadoop data sources with Anaconda Enterprise, refer to Installing Livy server for Hadoop Spark access and Configuring Livy server for Hadoop Spark access.

Hardware requirements

The following are the minimum and recommended specifications for the master and worker nodes, as well as the cluster as a whole:

Master node                        Minimum     Recommended
CPU                                8 cores     16 cores
RAM                                16GB        32GB
Disk space in /opt/anaconda        100GB       500GB*
Disk space in /var/lib/gravity     100GB       100GB
Disk space in /tmp or $TMPDIR      30GB        30GB

Worker nodes                       Minimum     Recommended
CPU                                8 cores     16 cores
RAM                                16GB        32GB
Disk space in /var/lib/gravity     100GB       100GB
Disk space in /tmp or $TMPDIR      30GB        30GB

Cluster totals                     Minimum
CPU                                16 cores
RAM                                32GB

*Notes regarding the recommended disk space in /opt/anaconda:

  • This total includes project and package storage (including mirrored packages).

  • Currently /opt/anaconda must be a supported filesystem such as ext4 or xfs and cannot be an NFS mountpoint. Subdirectories of /opt/anaconda may be mounted through NFS. See Mounting an NFS share for more information.

  • If you are installing Anaconda Enterprise on an XFS filesystem, it must support d_type to work properly. If your XFS filesystem was formatted with the -n ftype=0 option, it won’t support d_type, and will therefore need to be recreated before installing Anaconda Enterprise, using a command similar to the following:

    mkfs.xfs -n ftype=1 /path/to/your/device
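
    To confirm whether an existing XFS filesystem already supports d_type, you can check its ftype setting with xfs_info. This is an optional check, and it assumes the xfsprogs utilities are installed and that /opt/anaconda is the mount point in question:

    # ftype=1 indicates d_type support; ftype=0 means the filesystem must be recreated
    xfs_info /opt/anaconda | grep ftype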
    

To check the number of cores, run nproc.

Disk IOPS requirements

Master and worker nodes require a minimum of 3000 concurrent input/output operations per second (IOPS); installation will fail on disks that provide fewer than 3000 concurrent IOPS. Cloud providers report concurrent disk IOPS.

Hard disk manufacturers report sequential IOPS, which differ from concurrent IOPS. On-premises installations require servers with disks that support a minimum of 50 sequential IOPS, typically 7200 revolutions per minute (RPM) disks or faster.
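
If you want to measure a disk's random I/O performance before installing, a tool such as fio can provide an estimate. The following is a minimal sketch, assuming fio is installed and that /var/lib/gravity resides on the disk you want to test; the test file name, size, and runtime are illustrative only:

# Run a 60-second 4K random-read test and report aggregate IOPS
sudo fio --name=iops-test --filename=/var/lib/gravity/fio-test --size=1G \
    --ioengine=libaio --direct=1 --rw=randread --bs=4k --iodepth=32 \
    --runtime=60 --time_based --group_reporting

# Remove the test file afterwards
sudo rm /var/lib/gravity/fio-test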

Storage and memory requirements

Approximately 30GB of available free space on each node is required for the Anaconda Enterprise installer to temporarily decompress files to the /tmp directory during the installation process.

If adequate free space is not available in the /tmp directory, you can specify the location of the temporary directory to be used during installation by setting the TMPDIR environment variable to a different location.

Example:

sudo TMPDIR=/tmp2 ./gravity install

NOTE: When using sudo to install, you must set the temporary directory explicitly on the command line so that TMPDIR is preserved. The master node and each worker node all require a temporary directory of the same size; set TMPDIR on each node as needed. Alternatively, you can install as root.

To check your available disk space, use the built-in Linux df utility with the -h parameter for human readable format:

df -h /var/lib/gravity

df -h /opt/anaconda

df -h /tmp
# or
df -h $TMPDIR

To show the free memory size in GB, run:

free -g

Operating system requirements

Anaconda Enterprise cannot be installed on a cluster with heterogeneous operating system versions. Before installing, verify that all cluster nodes are running the same version of the same OS.

Anaconda Enterprise currently supports the following Linux versions:

  • RHEL/CentOS 7.2, 7.3, 7.4, 7.5
  • Ubuntu 16.04
  • SUSE 12 SP2, 12 SP3
  • Hosted vSphere such as Rackspace or OVH

NOTE: On SUSE, set DefaultTasksMax=infinity in /etc/systemd/system.conf.
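
One way to apply this setting is shown below; it is a sketch that assumes DefaultTasksMax is not already defined in /etc/systemd/system.conf, and that you reload the systemd manager (or reboot) afterwards:

sudo bash -c "echo 'DefaultTasksMax=infinity' >> /etc/systemd/system.conf"
sudo systemctl daemon-reexec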

Optionally, create a new directory and set TMPDIR to point to it. User 1000 (or the UID of the service account) must be able to write to this directory, meaning it needs read, write, and execute permissions on $TMPDIR.

For example, to give write access to UID 1000, run the following command:

sudo chown 1000 -R $TMPDIR
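
To confirm the permissions are in place, you can optionally check write access as UID 1000; this assumes sudo is permitted to run commands as an arbitrary UID:

sudo -u '#1000' test -w "$TMPDIR" && echo "TMPDIR is writable by UID 1000"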

Security requirements

For CentOS and RHEL:

  • Disable SELinux by ensuring that SELINUX=disabled in the /etc/selinux/config file for all of the cluster nodes. After rebooting, run the following command to verify that SELinux is disabled:

    getenforce
    Disabled
    
  • Various tools may be used to configure firewalls and open the required ports, including iptables, firewall-cmd, SuSEfirewall2, and others; see the firewalld example after this list for one approach.

    Make sure the firewall is permanently configured to keep the required ports open, so the settings persist across reboots. Then restart the firewall to load these settings immediately.

  • Sudo access.
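
As an example of the firewall configuration mentioned above, on CentOS/RHEL with firewalld the externally accessible ports listed under Network requirements could be opened as follows. This is only a sketch; adapt it to the firewall tooling and zones used in your environment:

sudo firewall-cmd --permanent --add-port=80/tcp --add-port=443/tcp --add-port=4242/tcp
sudo firewall-cmd --permanent --add-port=61008-61010/tcp --add-port=61022-61024/tcp
sudo firewall-cmd --reload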

Kernel module requirements

The Anaconda Enterprise installer checks to see if the following modules required for Kubernetes to function properly are present, and alerts you if any are not loaded:

Linux Distribution    Version           Modules
CentOS                7.2               bridge, ebtables, iptable_filter, overlay
RedHat Linux          7.2               bridge, ebtables, iptable_filter
CentOS                7.3, 7.4, 7.5     br_netfilter, ebtables, iptable_filter, overlay
RedHat Linux          7.3, 7.4, 7.5     br_netfilter, ebtables, iptable_filter, overlay
Ubuntu                16.04             br_netfilter, ebtables, iptable_filter, overlay
SUSE                  12 SP2, 12 SP3    br_netfilter, ebtables, iptable_filter, overlay

br_netfilter module

The bridge netfilter kernel module is required for the Kubernetes iptables-based proxy to work correctly.

The bridge module commands differ between versions of CentOS.

To find your operating system version, run cat /etc/*release* or lsb_release -a.

On RHEL/CentOS 7.2 the bridge netfilter module is named bridge; on all other supported operating systems and versions of CentOS it is named br_netfilter.

To check if the module is loaded run:

# For RHEL/CentOS 7.2
lsmod | grep bridge

# For all other supported platforms
lsmod | grep br_netfilter

If the above commands did not produce any result, then the module is not loaded. Run the following command to load the module:

# For RHEL/CentOS 7.2
sudo modprobe bridge

# For all other supported platforms
sudo modprobe br_netfilter

Now run:

sudo sysctl -w net.bridge.bridge-nf-call-iptables=1
sudo sysctl -w net.bridge.bridge-nf-call-ip6tables=1

To persist this setting on boot, run:

sudo bash -c "echo '# Enable bridge module' >> /etc/sysctl.d/99-bridge.conf"
sudo bash -c "echo 'net.bridge.bridge-nf-call-iptables=1' >> /etc/sysctl.d/99-bridge.conf"
sudo bash -c "echo 'net.bridge.bridge-nf-call-ip6tables=1' >> /etc/sysctl.d/99-bridge.conf"
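
To verify that the settings are active, you can query them with sysctl (an optional check):

sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables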

overlay module

The overlay kernel module is required to use the overlay or overlay2 Docker storage drivers.

To check that the overlay module is loaded:

lsmod | grep overlay

If the above command did not produce any result, then the module is not loaded. Use the following command to load the module:

sudo modprobe overlay

ebtables module

The ebtables kernel module is required to allow a service to communicate back to itself via internal load balancing when necessary.

To check that the ebtables module is loaded:

lsmod | grep ebtables

If the above command did not produce any result, then the module is not loaded. Use the following command to load the module:

sudo modprobe ebtables

iptable_filter module

The iptable_filter kernel module is required to make sure firewall rules that Kubernetes sets up function properly.

To check that the iptable_filter module is loaded:

lsmod | grep iptable_filter

If the above command did not produce any result, then the module is not loaded. Use the following command to load the module:

sudo modprobe iptable_filter

iptable_nat module

The iptable_nat kernel module is required to make sure firewall rules that Kubernetes sets up function properly.

To check that the iptable_nat module is loaded:

lsmod | grep iptable_nat

If the above command did not produce any result, then the module is not loaded. Use the following command to load the module:

sudo modprobe iptable_nat

NOTE: During installation, the Anaconda Enterprise installer alerts you if any of these modules are not loaded.

If your system does not load modules at boot, add the following entries to ensure the required modules are loaded after a reboot:

sudo bash -c "echo 'overlay' > /etc/modules-load.d/overlay.conf"
sudo bash -c "echo 'br_netfilter' > /etc/modules-load.d/netfilter.conf"
sudo bash -c "echo 'ebtables' > /etc/modules-load.d/ebtables.conf"
sudo bash -c "echo 'iptable_filter' > /etc/modules-load.d/iptable_filter.conf"
sudo bash -c "echo 'iptable_nat' > /etc/modules-load.d/iptable_nat.conf"
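
After a reboot, you can optionally confirm that all of the modules loaded with a quick check like the following (a sketch; on RHEL/CentOS 7.2, substitute bridge for br_netfilter):

for m in overlay br_netfilter ebtables iptable_filter iptable_nat; do
    lsmod | grep -q "^$m" && echo "$m loaded" || echo "$m NOT loaded"
done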

Mount settings

Many Linux distributions include the kernel setting fs.may_detach_mounts = 0. This can cause conflicts with the Docker daemon, and Kubernetes will show pods stuck in the Terminating state if Docker is unable to clean up one of the underlying containers.

If the installed kernel exposes the option fs.may_detach_mounts, we recommend always setting this value to 1:

For CentOS and RHEL:

sudo sysctl -w fs.may_detach_mounts=1
sudo bash -c "echo 'fs.may_detach_mounts = 1' >> /usr/lib/sysctl.d/99-containers.conf"

For Ubuntu:

sudo sysctl -w fs.may_detach_mounts=1
sudo bash -c "echo 'fs.may_detach_mounts = 1' >> /etc/sysctl.d/10-may_detach_mounts.conf"
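
If the kernel exposes the option, you can confirm the current value at any time (an optional check):

sysctl fs.may_detach_mounts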

TLS/SSL certificate requirements

Anaconda Enterprise uses certificates to provide transport layer security for the cluster. To get you started, self-signed certificates are generated during the initial installation. You can configure the platform to use organizational TLS/SSL certificates after completing the installation.

You may purchase certificates commercially, or generate them using your organization’s internal public key infrastructure (PKI) system. When using an internal PKI-signed setup, the CA certificate is inserted into the Kubernetes secret.

In either case, the configuration will include the following:

  • a certificate for the root certificate authority (CA),
  • an intermediate certificate chain,
  • a server certificate, and
  • a private server key.

See Updating TLS/SSL certificates for more information.
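
If you want to sanity-check a certificate bundle before configuring the platform, openssl can verify the chain and display the certificate's validity period. The file names below (rootca.crt, intermediate.crt, server.crt) are placeholders for your own files:

# Verify the server certificate against the root CA and intermediate chain
openssl verify -CAfile rootca.crt -untrusted intermediate.crt server.crt

# Show the subject and validity period of the server certificate
openssl x509 -in server.crt -noout -subject -dates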

GPU requirements

To use GPUs with Anaconda Enterprise, you’ll need to install the NVIDIA CUDA 9.2 driver on the host operating system of any GPU worker nodes. You can install the driver using the NVIDIA runfile, or through your package manager using the rpm (local) or rpm (network) packages for SLES, CentOS, and RHEL, and the deb (local) or deb (network) packages for Ubuntu.
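
After installing the driver on a GPU worker node, you can confirm that the GPU is visible; this assumes the NVIDIA driver utilities are on the PATH:

nvidia-smi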

Network requirements

Anaconda Enterprise requires the following network ports to be externally accessible. The bandwidth checker, install wizard, and installer agent ports are used during initial installation only, and can be closed after the install process completes:

Port                        Protocol    Description
80                          HTTP        Anaconda Enterprise UI (plaintext)
443                         HTTPS       Anaconda Enterprise UI (encrypted)
4242                        TCP         Bandwidth checker utility
61009                       HTTPS       Install wizard UI access, required during cluster installation
61008-61010, 61022-61024    HTTPS       Installer agent ports

The following ports are used for cluster operation, and therefore must be open internally, between cluster nodes:

Port                        Protocol                       Description
53                          TCP and UDP                    Internal cluster DNS
2379, 2380, 4001, 7001      HTTPS                          Etcd server communication
3008-3012                   HTTPS                          Internal Anaconda Enterprise service
3022-3025                   SSH                            Teleport internal SSH control panel
3080                        HTTPS                          Teleport Web UI
5000                        HTTPS                          Docker registry
6443                        HTTPS                          Kubernetes API Server
6990                        HTTPS                          Internal Anaconda Enterprise service
7496, 7373                  TCP                            Peer-to-peer health check
7575                        TCP                            Cluster status gRPC API
8081, 8086-8091, 8095       HTTPS                          Internal Anaconda Enterprise service
8472                        VXLAN (UDP encapsulation)      Overlay network
9080, 9090, 9091            HTTPS                          Internal Anaconda Enterprise service
10248-10250, 10255          HTTPS                          Kubernetes components
30000-32767                 HTTPS (depends on services)    Kubernetes internal services range
32009                       HTTPS                          Internal Operations Center Admin UI

If you plan to use online package mirroring, you’ll need to whitelist the following domains:

  • repo.continuum.io
  • anaconda.org
  • conda.anaconda.org
  • binstar-cio-packages-prod.s3.amazonaws.com
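
To confirm that the cluster can reach these domains through your firewall or proxy, a simple connectivity check such as the following can help (a sketch; it assumes curl is installed and any proxy settings are already configured):

for host in repo.continuum.io anaconda.org conda.anaconda.org binstar-cio-packages-prod.s3.amazonaws.com; do
    curl -sI "https://$host" > /dev/null && echo "$host reachable" || echo "$host NOT reachable"
done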

IPv4 forwarding on servers is required for internal load balancing and must be turned on. Anaconda Enterprise performs pre-flight checks and only allows installation on nodes that have the required kernel modules and other correct configuration.

To enable IPv4 forwarding, run:

sudo sysctl -w net.ipv4.ip_forward=1

To persist this setting on boot, run:

sudo bash -c "echo -e '# Enable IPv4 forwarding\nnet.ipv4.ip_forward=1' >> /etc/sysctl.d/99-ipv4_forward.conf"
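
To confirm the setting is active (an optional check):

sysctl net.ipv4.ip_forward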

If any Anaconda Enterprise users will use the local graphical program Anaconda Navigator in online mode, they will need access to these sites, which may need to be whitelisted in your network’s firewall settings.

DNS requirements

Web browsers use domain names and web origins to keep sites separate, so that they cannot tamper with each other. Anaconda Enterprise hosts deployments from many users, and if these deployments had addresses on the same domain, such as https://anaconda.yourdomain.com/apps/001 and https://anaconda.yourdomain.com/apps/002, one app could access the cookies of the other, and JavaScript in one app could access the other app.

To prevent this potential security risk, Anaconda Enterprise assigns each deployment a unique address such as https://uuid001.anaconda.yourdomain.com and https://uuid002.anaconda.yourdomain.com, where yourdomain.com is replaced with your organization’s domain name, and uuid001 and uuid002 are replaced with dynamically generated universally unique identifiers (UUIDs).

To facilitate this, Anaconda Enterprise requires the use of wildcard DNS entries that apply to a set of domain names such as *.anaconda.yourdomain.com.

For example, if you are using the fully qualified domain name (FQDN) anaconda.yourdomain.com with a master node IP address of 12.34.56.78, the DNS entries would be as follows:

anaconda.yourdomain.com      IN A    12.34.56.78
*.anaconda.yourdomain.com    IN A    12.34.56.78

The wildcard subdomain’s DNS entry points to the Anaconda Enterprise master node.

The master node’s hostname and the wildcard domains must be resolvable with DNS from the master node, the worker nodes, and end-user machines. To ensure the master node can resolve its own hostname, any /etc/hosts entries used must be propagated to the gravity environment.
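
To verify that the wildcard entry resolves correctly from a node or an end-user machine, you can query an arbitrary subdomain; this is an optional check, and test123 is just a placeholder label:

dig +short anaconda.yourdomain.com
dig +short test123.anaconda.yourdomain.com

Both queries should return the master node's IP address (12.34.56.78 in the example above).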

Browser requirements

Anaconda Enterprise supports the following web browsers:

  • Chrome 39+
  • Firefox 49+
  • Safari 10+
  • Edge 14+
  • Internet Explorer 11+ (Windows 10)

The minimum browser screen size for using the platform is 800 pixels wide and 600 pixels high.

NOTE: JupyterLab doesn’t currently support Edge or Internet Explorer, and Zeppelin has issues running on Edge, so Anaconda Enterprise users will have to use another editor for their Notebook sessions if they choose to use either of those browsers to access the AE platform.

Verifying system requirements

Anaconda Enterprise performs system checks during installation to verify CPU, RAM, and other system requirements. You can also run these system checks manually before installing, using the following commands from the installer directory, ~/anaconda-enterprise-<installer-version>.

NOTE: You can perform this check after downloading and extracting the installer.

To perform system checks on a master node, run the following command as sudo or root user:

sudo ./gravity check --profile ae-master

To perform system checks on a worker node, run the following command as sudo or root user:

sudo ./gravity check --profile ae-worker

If all of the system checks pass and all requirements are met, the output from the above commands will be empty. If the system checks fail and some requirements are not met, the output will indicate which system checks failed.