Connecting to Cassandra

Anaconda Enterprise enables you to connect to an Apache Cassandra NoSQL database to access data in its wide column store.

Before you can do so, however, you’ll need to install the driver that is required to connect to the Cassandra cluster.

NOTE: The cassandra-driver package is currently available on conda-forge for Python 2.7 only, so the command to install it looks like the following:

conda install -c conda-forge cassandra-driver

When you use conda install to add a package during a session, the change applies to the current session only. To make the change persist across future project sessions and deployments, add the package to the project’s anaconda-project.yml file. For more information, see Developing a project.

After you’ve installed the driver, you can use code such as the following to access the cluster from within a notebook session:

from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

import json

"""
Get credentials from Kubernetes. The credentials were set up as a dictionary. For example:
{
    "username": "USERNAME",
    "password": "PASSWORD"
}
"""
credentials = None
with open('/var/run/secrets/user_credentials/cassandra_credentials') as f:
    credentials = json.load(f)

# Verify the credentials were pulled correctly
if credentials:
    # Set up the authentication mechanism
    auth_provider = PlainTextAuthProvider(
        username=credentials.get('username'),
        password=credentials.get('password')
    )

    # Pass parameters to the cluster
    cluster = Cluster(
        auth_provider=auth_provider,
        contact_points=['support-cassandra.dev.anaconda.com']
    )

    # Connect to cluster and set the keyspace
    session = cluster.connect()
    session.set_keyspace('quote')

    # Run query in keyspace and print out the results
    rows = session.execute('SELECT * FROM historical_prices')
    for row in rows:
        print(row)

    # Disconnect from the cluster
    cluster.shutdown()
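
In the example above, cluster.shutdown() is reached only if none of the earlier calls raises an exception. A minimal sketch of one way to guarantee cleanup follows. The helper name run_query is hypothetical and not part of the driver; it assumes you pass in a Cluster object built as shown above.

```python
def run_query(cluster, keyspace, query, parameters=None):
    """Run a single query and always release the cluster's connections.

    `cluster` is assumed to be a cassandra.cluster.Cluster, or any object
    that exposes connect() and shutdown() in the same way.
    """
    try:
        session = cluster.connect()
        session.set_keyspace(keyspace)
        # Materialize the rows before shutdown() closes the connections
        return list(session.execute(query, parameters))
    finally:
        # Runs whether the query succeeded or raised
        cluster.shutdown()
```

With the cluster from the example, this could be called as run_query(cluster, 'quote', 'SELECT * FROM historical_prices'). Returning a list keeps the sketch simple, but it loads every row into memory, so it may not suit large tables.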

For information about adding credentials to the platform so that they are available in your projects, see Storing secrets. Any secrets you add will be available across all sessions and deployments associated with your user account.
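
If the secret has not been added yet, or is missing a key, the example above fails with a generic file or authentication error. The following is a minimal sketch of a loader that reports the problem more directly. It assumes the same JSON layout with username and password keys. The names load_credentials and REQUIRED_KEYS are hypothetical.

```python
import json

# Keys the example above expects to find in the secret
REQUIRED_KEYS = ('username', 'password')


def load_credentials(path):
    """Read a JSON secret and check that it contains the expected keys."""
    try:
        with open(path) as f:
            credentials = json.load(f)
    except IOError:
        # Raised when the secret file does not exist or cannot be read
        raise RuntimeError('Could not read secret at {}'.format(path))
    except ValueError:
        # Raised by json.load when the file is not valid JSON
        raise RuntimeError('Secret at {} is not valid JSON'.format(path))

    if not isinstance(credentials, dict):
        raise RuntimeError('Secret at {} is not a JSON object'.format(path))

    missing = [key for key in REQUIRED_KEYS if not credentials.get(key)]
    if missing:
        raise RuntimeError(
            'Secret at {} is missing: {}'.format(path, ', '.join(missing))
        )
    return credentials
```

For the example above, the call would be load_credentials('/var/run/secrets/user_credentials/cassandra_credentials').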