Installation requirements
For your Anaconda Enterprise installation to complete successfully, your systems must meet the requirements outlined below. The installation requirements for Anaconda Enterprise are the same whether you install the platform on-premises, on hosted vSphere, or on a cloud server. There are cloud-specific requirements related to performance, however, so ensure your chosen cloud platform meets the minimum specifications outlined here before you begin.
The installer performs pre-flight checks, and only allows installation to continue on nodes that are configured correctly and include the required kernel modules. If you want to perform the system checks yourself before installation, you can run them on your intended master and worker nodes after you download and extract the installer, as described in Verifying system requirements.
When you initially install Anaconda Enterprise, you can install the cluster on one to five nodes. You are not bound to that initial configuration, however. After completing the installation, you can add or remove nodes on the cluster as needed. For more information, see Adding and removing nodes.
A rule of thumb for determining how to size your system is 1 CPU, 1GB of RAM, and 5GB of disk space for each project session or deployment. For more information about sizing for a particular component, see the following minimum requirements:
To use Anaconda Enterprise with a cloud platform, refer to Cloud performance requirements for cloud-specific performance requirements.
To use Spark Hadoop data sources with Anaconda Enterprise, refer to Apache Livy and Anaconda Enterprise and Configuring Livy server for Hadoop Spark access.
To verify your systems meet the requirements, see Verifying system requirements.
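As a rough illustration of the sizing rule of thumb above, the following sketch multiplies the per-session figures by an expected number of concurrent sessions and deployments. The function name estimate_capacity and the count of 20 are illustrative assumptions, not part of the product.

```shell
# Estimate resources from the rule of thumb:
# 1 CPU, 1 GB RAM and 5 GB disk per project session or deployment.
# Usage: estimate_capacity <number of sessions and deployments>
estimate_capacity() {
    sessions=$1
    echo "cpu=${sessions} ram_gb=${sessions} disk_gb=$((sessions * 5))"
}

estimate_capacity 20
```

For 20 sessions and deployments this prints cpu=20 ram_gb=20 disk_gb=100.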
Note
To gain a deeper understanding of the considerations around Anaconda Enterprise system requirements, see the Understanding Anaconda Enterprise system requirements topic.
Hardware requirements
The following are minimum specifications for the master and worker nodes, as well as the entire cluster.
Note
Anaconda recommends having 1 master and 1 worker per cluster.
Master node | Minimum |
---|---|
CPU | 16 cores |
RAM | 64GB |
Disk space in /opt/anaconda | 500GB* |
Disk space in /var/lib/gravity | 300GB** |
Disk space in /tmp or $TMPDIR | 50GB |

Worker nodes | Minimum |
---|---|
CPU | 16 cores |
RAM | 64GB |
Disk space in /var/lib/gravity | 300GB |
Disk space in /tmp or $TMPDIR | 50GB |
*NOTES regarding the minimum disk space in /opt/anaconda:
This total includes project and package storage (including mirrored packages).
Currently /opt and /opt/anaconda must be an ext4 or xfs filesystem, and cannot be an NFS mountpoint. Subdirectories of /opt/anaconda may be mounted through NFS. See Mounting an external file share for more information.
If you are installing Anaconda Enterprise on an xfs filesystem, it needs to support d_type to work properly. If your XFS filesystem has been formatted with the -n ftype=0 option, it won’t support d_type, and will therefore need to be recreated using a command similar to the following before installing Anaconda Enterprise:
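A minimal sketch, assuming the standard xfsprogs tools (xfs_info and mkfs.xfs) are installed. XFS reports d_type support as ftype=1 in the xfs_info output. The helper name xfs_has_ftype and the /dev/DEVICE placeholder are hypothetical, and recreating a filesystem erases its contents, so back up any data first.

```shell
# Succeeds when xfs_info-style output on stdin reports ftype=1,
# which means the filesystem supports d_type.
xfs_has_ftype() {
    grep -q 'ftype=1'
}

# Check an existing mount point:
#   xfs_info /opt/anaconda | xfs_has_ftype && echo "d_type supported"
# Recreate a filesystem with d_type support (DESTROYS existing data;
# /dev/DEVICE is a placeholder for your block device):
#   sudo mkfs.xfs -n ftype=1 /dev/DEVICE
```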
**NOTES regarding the minimum disk space in /var/lib/gravity:
This volume MUST be mounted on local storage. Core components of Kubernetes run from this directory, some of which are extremely intolerant of disk latency. Network-Attached Storage (NAS) and Storage Area Network (SAN) solutions are susceptible to latency, and are therefore not supported.
This total includes additional space to accommodate upgrades. Anaconda recommends having it available during installation, as it can be difficult to add space after the fact.
We strongly recommend that you set up the /opt/anaconda and /var/lib/gravity partitions using Logical Volume Management (LVM), to provide the flexibility needed for future expansion.
To check the number of cores, run nproc.
Disk IOPS requirements
Master and worker nodes require a minimum of 3000 concurrent input/output operations per second (IOPS); nodes with fewer than 3000 concurrent IOPS will fail. Cloud providers report concurrent disk IOPS.
Hard disk manufacturers report sequential IOPS, which are different from concurrent IOPS. On-premises installations require servers with disks that support a minimum of 50 sequential IOPS. Anaconda recommends using SSD or better.
Storage and memory requirements
Approximately 50GB of available free space on each node is required for the Anaconda Enterprise installer to temporarily decompress files to the /tmp directory during the installation process.
If adequate free space is not available in the /tmp directory, you can specify the location of the temporary directory to be used during installation by setting the TMPDIR environment variable to a different location.
EXAMPLE:
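A minimal sketch of such an override. The helper name with_tmpdir and the directory /opt/ae-tmp are hypothetical, and INSTALL_COMMAND stands for whichever installer command you run.

```shell
# Run a command with TMPDIR pointing at an alternate directory,
# creating the directory first if needed.
# Usage: with_tmpdir <directory> <command> [args...]
with_tmpdir() {
    dir=$1
    shift
    mkdir -p "$dir" && TMPDIR="$dir" "$@"
}

# Confirm that a child process sees the override:
#   with_tmpdir /opt/ae-tmp sh -c 'echo "$TMPDIR"'
# With sudo, pass the variable on the command line instead:
#   sudo TMPDIR=/opt/ae-tmp INSTALL_COMMAND
```

Setting the variable as a prefix to the command limits the override to that one process, which avoids changing TMPDIR for the rest of the shell session.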
Note
When using sudo to install, the temporary directory must be set explicitly in the command line to preserve TMPDIR. The master node and each worker node all require a temporary directory of the same size, and should each use the TMPDIR variable as needed.
To check your available disk space, use the built-in Linux df utility with the -h parameter for human readable format:
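For instance, a small loop can limit the df -h report to the directories that matter to the installer. This is a sketch: the function name show_free_space is hypothetical, and the paths are the ones listed in the hardware requirements.

```shell
# Print human-readable disk usage for each existing directory given;
# directories that do not exist yet are skipped silently.
show_free_space() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            df -h "$dir"
        fi
    done
}

show_free_space /opt/anaconda /var/lib/gravity "${TMPDIR:-/tmp}"
```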
To show the free memory size in GB, run:
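On a Linux node with procps installed, free -g gives this summary directly. The sketch below derives a comparable figure from /proc/meminfo so the unit conversion is explicit; the function name meminfo_gib is hypothetical.

```shell
# Convert one field of /proc/meminfo-style input (values in kB) to whole GiB.
# Usage: meminfo_gib <field> < /proc/meminfo
meminfo_gib() {
    awk -v key="$1:" '$1 == key { printf "%d\n", $2 / 1024 / 1024 }'
}

# On a node:
#   meminfo_gib MemAvailable < /proc/meminfo
# The procps utility gives a comparable summary:
#   free -g
```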
Operating system requirements
Anaconda Enterprise cannot be installed on a cluster whose nodes run different operating system versions. Before installing, verify that all cluster nodes are running the same version of the OS.
Anaconda Enterprise currently supports the following Linux versions:
RHEL/CentOS 7.x, 8.x
Ubuntu 16.04
SUSE 12 SP2, 12 SP3, 12 SP5. Requirement: Set DefaultTasksMax=infinity in /etc/systemd/system.conf.
Note
The RHEL 8.4 AMI in AWS currently has a bug caused by a combination of a bad ip rule and the NetworkManager service. You will need to remove the bad rule and disable the NetworkManager service prior to installation.
To find your operating system version, run cat /etc/*release* or lsb_release -a.
Optionally create a new directory and set TMPDIR. User 1000 (or the UID for the service account) needs to be able to write to this directory. This means the user can read, write and execute on $TMPDIR.
For example, to give write access to UID 1000, run the following command:
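A sketch of that step, assuming the service account UID is 1000 and using a hypothetical helper named prepare_tmpdir. Changing ownership to a UID other than your own requires root.

```shell
# Create a directory and give the service account full access to it.
# Usage: prepare_tmpdir <directory> [uid]   (uid defaults to 1000)
prepare_tmpdir() {
    dir=$1
    uid=${2:-1000}
    mkdir -p "$dir" &&
        chown -R "$uid" "$dir" &&
        chmod u+rwx "$dir"
}

# As root, for a directory already exported as TMPDIR:
#   prepare_tmpdir "$TMPDIR" 1000
```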
Note
When installing Anaconda Enterprise on a system with multiple nodes, verify that the clock of each node is in sync with the others prior to starting the installation process, to avoid potential issues. Anaconda recommends using the Network Time Protocol (NTP) to synchronize computer system clocks automatically over a network. See instructions here.
Security requirements
If you use a security scanner, such as Auditd or antivirus software, ensure the scanner excludes the /var/lib/gravity folder from its security scans.
Verify you have sudo access.
Make sure that the firewall is permanently set to keep the required ports open, and will save these settings across reboots. Then restart the firewall to load these settings immediately.
Various tools may be used to configure firewalls and open required ports, including iptables, firewall-cmd, susefirewall2, and others.
For all CentOS and RHEL nodes:
Ensure that SELinux is not in enforcing mode, by either disabling it or putting it in permissive mode in the /etc/selinux/config file.
After rebooting, run the following command to verify that SELinux is not being enforced:
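One way to check, assuming the SELinux utilities are installed, is getenforce, which prints the current mode as a single word. The helper name selinux_mode_ok is hypothetical and only wraps the comparison.

```shell
# Succeeds when the reported SELinux mode is acceptable for installation.
# Usage: selinux_mode_ok <mode printed by getenforce>
selinux_mode_ok() {
    case "$1" in
        Disabled|Permissive) return 0 ;;
        *) return 1 ;;
    esac
}

# On a node:
#   selinux_mode_ok "$(getenforce)" && echo "SELinux is not enforcing"
```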
The result should be either Disabled or Permissive.
Kernel module requirements
The Anaconda Enterprise installer checks to see if the following modules required for Kubernetes to function properly are present, and alerts you if any are not loaded:
Linux Distribution | Version | Modules |
---|---|---|
CentOS | 7.2 | bridge, ebtable_filter, ebtables, iptable_filter, overlay |
RedHat Linux | 7.2 | bridge, ebtable_filter, ebtables, iptable_filter |
CentOS | 7.3, 7.4, 7.5, 7.6, 7.7, 8.0 | br_netfilter, ebtable_filter, ebtables, iptable_filter, overlay |
RedHat Linux | 7.3, 7.4, 7.5, 7.6, 7.7, 8.0 | br_netfilter, ebtable_filter, ebtables, iptable_filter, overlay |
Ubuntu | 16.04 | br_netfilter, ebtable_filter, ebtables, iptable_filter, overlay |
SUSE | 12 SP2, 12 SP3 | br_netfilter, ebtable_filter, ebtables, iptable_filter, overlay |
Module name | Purpose |
---|---|
bridge | Required for Kubernetes iptables-based proxy to work correctly |
br_netfilter | Required for Kubernetes iptables-based proxy to work correctly |
overlay | Required to use overlay or overlay2 Docker storage driver |
ebtable_filter | Required to allow a service to communicate back to itself via internal load balancing when necessary |
ebtables | Required to allow a service to communicate back to itself via internal load balancing when necessary |
iptable_filter | Required to make sure that the firewall rules that Kubernetes sets up function properly |
iptable_nat | Required to make sure that the firewall rules that Kubernetes sets up function properly |
To check if a particular module is loaded, run the following command:
If the command doesn’t produce any result, the module is not loaded.
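A sketch of such a check, built on lsmod. The helper name module_loaded is hypothetical; it compares the first column exactly, so a name that only appears in the "Used by" column of another module does not count as loaded.

```shell
# Succeeds when the named module appears in lsmod-style output on stdin.
# Usage: lsmod | module_loaded <module name>
module_loaded() {
    awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}

# On a node:
#   lsmod | module_loaded br_netfilter || echo "br_netfilter is not loaded"
```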
Run the following command to load the module:
If your system does not load modules at boot, run the following for each module to ensure it is loaded upon reboot:
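A sketch of loading and persisting a module, assuming a systemd-based distribution that reads /etc/modules-load.d at boot. The helper name persist_module is hypothetical, and its optional second argument exists only so the file can be written somewhere else for a dry run.

```shell
# Write a modules-load.d entry so the module is loaded at boot.
# Usage: persist_module <module name> [config directory]
persist_module() {
    module=$1
    conf_dir=${2:-/etc/modules-load.d}
    printf '%s\n' "$module" > "$conf_dir/$module.conf"
}

# As root on a node, load the module now and persist it:
#   modprobe br_netfilter && persist_module br_netfilter
```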
System control settings
Anaconda Enterprise requires the following sysctl settings to function properly:
System setting | Purpose |
---|---|
net.bridge.bridge-nf-call-iptables | Works with bridge kernel module to ensure Kubernetes iptables-based proxy works correctly |
net.bridge.bridge-nf-call-ip6tables | Works with bridge kernel module to ensure Kubernetes iptables-based proxy works correctly |
fs.may_detach_mounts | Can cause conflicts with the docker daemon, and leave pods in stuck state if not enabled |
net.ipv4.ip_forward | Required for internal load balancing between servers to work properly |
fs.inotify.max_user_watches | Set to 1048576 to improve cluster longevity |
Run the following commands to set system control settings:
To persist system settings on boot, run the following for each setting:
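A sketch that covers both steps with a single configuration file. It assumes a value of 1 (enabled) for every setting except fs.inotify.max_user_watches, whose value is given in the table; the helper name ae_sysctl_settings and the file name 99-anaconda-enterprise.conf are hypothetical.

```shell
# Emit the required settings in sysctl.d format.
# The value 1 is an assumption for all settings except max_user_watches.
ae_sysctl_settings() {
    cat <<'EOF'
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
net.ipv4.ip_forward = 1
fs.inotify.max_user_watches = 1048576
EOF
}

# As root on a node, persist the settings and apply them immediately:
#   ae_sysctl_settings > /etc/sysctl.d/99-anaconda-enterprise.conf
#   sysctl -p /etc/sysctl.d/99-anaconda-enterprise.conf
# A single setting can also be applied on its own:
#   sysctl -w net.ipv4.ip_forward=1
```

Keeping all settings in one file under /etc/sysctl.d means the same file both applies them now and restores them at boot.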
Verifying system requirements
Anaconda Enterprise performs system checks during the install to verify CPU, RAM and other system requirements. The system checks can also be performed manually before the installation using the following commands from the installer directory, ~/anaconda-enterprise-<installer-version>.
Note
You can perform this check after downloading and extracting the installer.
To perform system checks on a master node, run the following command with sudo or as the root user:
To perform system checks on a worker node, run the following command with sudo or as the root user:
If all of the system checks pass and all requirements are met, the output from the above commands will be empty. If the system checks fail and some requirements are not met, the output will indicate which system checks failed.
GPU requirements
To use GPUs with Anaconda Enterprise, you’ll need to install one of the supported versions of the NVIDIA CUDA driver on the host operating system of any GPU worker nodes. You can install the drivers using the package manager or the NVIDIA runfile, or by using rpm (local) or rpm (network) for SLES, CentOS, and RHEL, and deb (local) or deb (network) for Ubuntu.
Current supported CUDA Driver versions:
CUDA 10.2
CUDA 11.2
CUDA 11.4
CUDA 11.6
You will need to notify our Integration team of the CUDA version you intend to use, so that the correct installer can be provided.
GPU deployments should use one of the following models:
Tesla V100 (recommended)
Tesla P100 (adequate)
We have not tested the other cards supported by this driver. However, we expect the cards in the following list to work with your cluster, provided the proper installation steps are followed:
A-Series: NVIDIA A100, NVIDIA A40, NVIDIA A30, NVIDIA A10
RTX-Series: RTX 8000, RTX 6000, NVIDIA RTX A6000, NVIDIA RTX A5000, NVIDIA RTX A4000, NVIDIA T1000, NVIDIA T600, NVIDIA T400
HGX-Series: HGX A100, HGX-2
T-Series: Tesla T4
P-Series: Tesla P40, Tesla P6, Tesla P4
K-Series: Tesla K80, Tesla K520, Tesla K40c, Tesla K40m, Tesla K40s, Tesla K40st, Tesla K40t, Tesla K20Xm, Tesla K20m, Tesla K20s, Tesla K20c, Tesla K10, Tesla K8
M-Class: M60, M40 24GB, M40, M6, M4
Network requirements
Anaconda Enterprise requires the following network ports to be externally accessible:
Port | Protocol | Description |
---|---|---|
80 | TCP | Anaconda Enterprise UI (plaintext) |
443 | TCP | Anaconda Enterprise UI (encrypted) |
32009 | TCP | Operations Center Admin UI |
These ports need to be externally accessible during installation only, and can be closed after completing the install process:
Port | Protocol | Description |
---|---|---|
4242 | TCP | Bandwidth checker utility |
61009 | TCP | Install wizard UI access required during cluster installation |
61008, 61010, 61022-61024 | TCP | Installer agent ports |
The following ports are used for cluster operation, and therefore must be open internally, between cluster nodes:
Port | Protocol | Description |
---|---|---|
53 | TCP and UDP | Internal cluster DNS |
2379, 2380, 4001, 7001 | TCP | Etcd server communication |
3008-3012 | TCP | Internal Anaconda Enterprise service |
3022-3025 | TCP | Teleport internal SSH control panel |
3080 | TCP | Teleport Web UI |
5000 | TCP | Docker registry |
6443 | TCP | Kubernetes API Server |
6990 | TCP | Internal Anaconda Enterprise service |
7496, 7373 | TCP | Peer-to-peer health check |
7575 | TCP | Cluster status gRPC API |
8081, 8086-8091, 8095 | TCP | Internal Anaconda Enterprise service |
8472 | UDP | Overlay network |
9080, 9090, 9091 | TCP | Internal Anaconda Enterprise service |
10248-10250, 10255 | TCP | Kubernetes components |
30000-32767 | TCP | Kubernetes internal services range |
You’ll also need to update your firewall settings to ensure that the 10.244.0.0/16 pod subnet and 10.100.0.0/16 service subnet are accessible to every node in the cluster, and grant all nodes the ability to communicate via their primary interface.
For example, if you’re using iptables:
Where <node_ip> specifies the internal IP address(es) used by all nodes in the cluster to connect to the AE5 master.
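A sketch of iptables rules matching this description, assuming the INPUT chain and an ACCEPT target. The addresses passed as arguments play the role of <node_ip>; 192.0.2.10 and 192.0.2.11 are documentation addresses, not real nodes, and the helper name ae_iptables_rules is hypothetical. The helper only prints the commands so they can be reviewed before being applied as root.

```shell
# Print (not apply) iptables rules that admit the pod subnet, the service
# subnet, and each node address given as an argument.
# Usage: ae_iptables_rules <node ip> [<node ip> ...]
ae_iptables_rules() {
    for src in 10.244.0.0/16 10.100.0.0/16 "$@"; do
        echo "iptables -A INPUT -s $src -j ACCEPT"
    done
}

ae_iptables_rules 192.0.2.10 192.0.2.11
# After reviewing the output, apply it as root, for example:
#   ae_iptables_rules 192.0.2.10 192.0.2.11 | sh
```

Rules added this way do not survive a reboot on their own, so save them with whichever persistence mechanism your distribution uses.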
If you plan to use online package mirroring, you’ll need to allowlist the following domains:
repo.anaconda.com
anaconda.org
conda.anaconda.org
binstar-cio-packages-prod.s3.amazonaws.com
If any Anaconda Enterprise users will use the local graphical program Anaconda Navigator in online mode, they will need access to these sites, which may need to be allowlisted in your network’s firewall settings.
https://repo.anaconda.com (or for older versions of Navigator and Conda)
https://conda.anaconda.org if any users will use conda-forge and other channels on Anaconda.org
https://vscode-update.azurewebsites.net/ if any users will install Visual Studio Code
google-public-dns-a.google.com (8.8.8.8:53) to check internet connectivity with Google Public DNS
TLS/SSL certificate requirements
Anaconda Enterprise uses certificates to provide transport layer security for the cluster. To get you started, self-signed certificates are generated during the initial installation. You can configure the platform to use organizational TLS/SSL certificates after completing the installation.
You may purchase certificates commercially, or generate them using your organization’s internal public key infrastructure (PKI) system. When using an internal PKI-signed setup, the CA certificate is inserted into the Kubernetes secret.
In either case, the configuration will include the following:
a certificate for the root certificate authority (CA),
an intermediate certificate chain,
a server certificate, and
a certificate private key.
See Updating TLS/SSL certificates for more information.
DNS requirements
Web browsers use domain names and web origins to separate sites, so they cannot tamper with each other. Anaconda Enterprise includes deployments from many users, and if these deployments had addresses on the same domain, such as https://anaconda.yourdomain.com/apps/001 and https://anaconda.yourdomain.com/apps/002, one app could access the cookies of the other, and JavaScript in one app could access the other app.
To prevent this potential security risk, Anaconda Enterprise assigns deployments unique addresses such as https://uuid001.anaconda.yourdomain.com and https://uuid002.anaconda.yourdomain.com, where yourdomain.com is replaced with your organization’s domain name, and uuid001 and uuid002 are replaced with dynamically generated universally unique identifiers (UUIDs).
To facilitate this, Anaconda Enterprise requires the use of wildcard DNS entries that apply to a set of domain names such as *.anaconda.yourdomain.com.
For example, if you are using the fully qualified domain name (FQDN) anaconda.yourdomain.com with a master node IP address of 12.34.56.78, the DNS entries would be as follows:
The wildcard subdomain’s DNS entry points to the Anaconda Enterprise master node.
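The two entries described above can be sketched as BIND-style A records. The zone-file syntax and the helper name ae_dns_records are assumptions for illustration; use whatever record format your DNS provider expects.

```shell
# Print A records for the platform FQDN and its wildcard subdomain.
# Usage: ae_dns_records <fqdn> <master node ip>
ae_dns_records() {
    fqdn=$1
    ip=$2
    printf '%s. IN A %s\n' "$fqdn" "$ip"
    printf '*.%s. IN A %s\n' "$fqdn" "$ip"
}

ae_dns_records anaconda.yourdomain.com 12.34.56.78
```

Both records carry the same address because the wildcard must resolve to the master node, just like the FQDN itself.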
The master node’s hostname and the wildcard domains must be resolvable with DNS from the master nodes, the worker nodes, and the end user machines. To ensure the master node can resolve its own hostname, any /etc/hosts entries used must be propagated to the gravity environment.
Existing installations of dnsmasq will conflict with Anaconda Enterprise. If dnsmasq is installed on the master node or any worker nodes, you’ll need to remove it from all nodes before installing Anaconda Enterprise.
Run the following commands to ensure dnsmasq is stopped and disabled:
To stop dnsmasq: sudo systemctl stop dnsmasq
To disable dnsmasq: sudo systemctl disable dnsmasq
To verify dnsmasq is disabled: sudo systemctl status dnsmasq
Browser requirements
Anaconda Enterprise supports the following web browsers:
Chrome 39+
Firefox 49+
Safari 10+
The minimum browser screen size for using the platform is 800 pixels wide and 600 pixels high.
Note
JupyterLab and Jupyter Notebook don’t currently support Internet Explorer, so Anaconda Enterprise users will have to use another editor for their Notebook sessions if they choose to use that browser to access the AE platform.