Apache Livy and Anaconda Enterprise

To support your organization’s data analysis operations, Anaconda Enterprise enables platform users to connect to remote Apache Hadoop or Spark clusters. Anaconda Enterprise uses Apache Livy to handle session management and communication with Spark clusters, including clusters that run different versions of Spark, independent clusters, and different Hadoop distributions, such as those installed by different Cloudera Data Platform (CDP) parcel versions.

Livy provides all the authentication layers that Hadoop administrators are familiar with, including Kerberos. Anaconda Enterprise can also authenticate to a Hadoop Distributed File System (HDFS) using Kerberos when Kerberos Impersonation is enabled.

Selecting a Spark template when creating a project will connect users to the remote Spark cluster where Livy is installed. They can use the Python libraries available through the platform or package a specific environment for the job. For more information, see Hadoop / Spark.
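
To make the session management described above more concrete, the following is a minimal sketch of the requests a client sends to Livy's REST API to open a session and submit code to it. It is an illustration only, not how Anaconda Enterprise itself talks to Livy. The host name livy.example.com, the user name, and the helper function names are hypothetical; the /sessions and /sessions/{id}/statements endpoints and the kind, proxyUser, and code fields come from the Livy REST API, and 8998 is Livy's default port.

```python
import json
import urllib.request

# Hypothetical Livy endpoint; replace with your cluster's Livy server.
LIVY_URL = "http://livy.example.com:8998"


def build_session_request(livy_url, kind="pyspark", proxy_user=None):
    """Build (but do not send) a POST /sessions request that opens a Livy session."""
    payload = {"kind": kind}
    if proxy_user is not None:
        # proxyUser asks Livy to run the session as this user (impersonation).
        payload["proxyUser"] = proxy_user
    return urllib.request.Request(
        url=f"{livy_url.rstrip('/')}/sessions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_statement_request(livy_url, session_id, code):
    """Build a POST /sessions/{id}/statements request that runs code in a session."""
    return urllib.request.Request(
        url=f"{livy_url.rstrip('/')}/sessions/{session_id}/statements",
        data=json.dumps({"code": code}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response from Livy."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only inspects the prepared request; no network call is made here.
    req = build_session_request(LIVY_URL, proxy_user="alice")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing a prepared request to send() requires a reachable Livy server. On a Kerberos-secured cluster the request would also need SPNEGO authentication, which this sketch does not implement.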

Tested Versions:

Anaconda Enterprise has been verified against the following versions.

Software                           Version
Hadoop (Includes YARN and HDFS)    3.1.1
Spark                              2.4.7
Hive                               3.1.3000
Impala                             3.4.0
Livy                               0.7.1-incubating

Note

Anaconda Enterprise has also been verified against Cloudera Data Platform 7.1.7.