Connecting to Cassandra
To connect Data Science & AI Workbench to an Apache Cassandra NoSQL database and access data from its wide column store, you must first install the required driver. Installing the cassandra-driver package in your project environment enables communication with the Cassandra cluster using its native binary protocol and the Cassandra Query Language (CQL).
Install the cassandra-driver package by running the following command:
conda install cassandra-driver
When you use conda install to add a package during a session, the change applies only to the current session. If you want the change to persist for future project sessions and deployments, add the package to the project’s anaconda-project.yml file. For more information, see Project configurations.
After you’ve installed the driver, you can use code such as the following to access the cluster from within a notebook session:
from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster
import json

"""
Get credentials from Kubernetes. The credentials were set up as a dictionary. For example:
{
    "username": "USERNAME",
    "password": "PASSWORD"
}
"""
credentials = None
with open('/var/run/secrets/user_credentials/cassandra_credentials') as f:
    credentials = json.load(f)

# Verify the credentials were pulled correctly
if credentials:
    # Setup authentication mechanism
    auth_provider = PlainTextAuthProvider(
        username=credentials.get('username'),
        password=credentials.get('password')
    )

    # Pass parameters to the cluster
    cluster = Cluster(
        auth_provider=auth_provider,
        contact_points=['support-cassandra.dev.anaconda.com']
    )

    # Connect to cluster and set the keyspace
    session = cluster.connect()
    session.set_keyspace('quote')

    # Run query in keyspace and print out the results
    rows = session.execute('SELECT * FROM historical_prices')
    for row in rows:
        print(row)

    # Disconnect from the cluster
    cluster.shutdown()
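If the query in the example above raises an exception, the final cluster.shutdown() call is never reached. The following is a minimal sketch of one way to guard against that. It assumes a hypothetical helper named shutting_down, which is not part of cassandra-driver and works with any object that exposes a shutdown() method:

```python
from contextlib import contextmanager


@contextmanager
def shutting_down(cluster):
    """Yield the cluster and always call its shutdown() method afterwards.

    Hypothetical helper for illustration; it is not part of cassandra-driver.
    """
    try:
        yield cluster
    finally:
        # Runs whether the body completed normally or raised an exception
        cluster.shutdown()
```

With the driver installed, the connect and query steps could then sit inside a block such as `with shutting_down(Cluster(...)) as cluster:`, so the connection is released even when a query fails.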
See Secrets for information about adding credentials to the platform, to make them available in your projects. Any secrets you add will be available across all sessions and deployments associated with your user account.
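Because the connection code depends on the secret being present and well formed, it can help to fail with a clear message when it is not. The sketch below is one possible approach, using only the standard library. The function name load_credentials and its error messages are illustrative assumptions; the path and the username/password keys are taken from the example above.

```python
import json
from pathlib import Path

# Path used in the example above; adjust it to match your own secret name
DEFAULT_SECRET_PATH = "/var/run/secrets/user_credentials/cassandra_credentials"


def load_credentials(path=DEFAULT_SECRET_PATH):
    """Read a JSON secret file and return a (username, password) tuple.

    Hypothetical helper for illustration; raises a descriptive error
    if the file is missing, is not a JSON object, or lacks a required key.
    """
    secret = Path(path)
    if not secret.is_file():
        raise FileNotFoundError(f"Secret file not found: {secret}")

    credentials = json.loads(secret.read_text())
    if not isinstance(credentials, dict):
        raise ValueError("Secret must contain a JSON object")

    # Treat absent or empty values as missing
    missing = [key for key in ("username", "password") if not credentials.get(key)]
    if missing:
        raise KeyError(f"Secret is missing required keys: {', '.join(missing)}")

    return credentials["username"], credentials["password"]
```

The returned values could then be passed to PlainTextAuthProvider in place of the credentials.get() calls.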