Connecting to MS SQL using Kerberos authentication#
Data Science & AI Workbench enables you to use Kerberos authentication for trusted connections to an MS SQL server database. After you’ve authenticated, you’ll be able to access the data stored in it.
Before you can do so, you’ll need to rebuild the freetds conda package. FreeTDS is a set of libraries that allows programs to work with MS SQL Server. It doesn’t currently support Kerberos authentication, however, so you’ll need to enable that flag and rebuild the package.
Then you can configure Workbench so all projects will be able to use Kerberos to connect to an MS SQL database.
Note
You’ll need access to an MS SQL instance, to be able to verify the connection. You’ll also need to have configured a krb5.conf
file to use Kerberos authentication.
Rebuilding FreeTDS#
Run the following command to pull down the existing recipe for the freetds conda package located here.:
git clone https://github.com/AnacondaRecipes/freetds-feedstock.git
Open the recipe used to build the package:
cd freetds-feedstock/recipe
Run the following commands to open the recipe, and update it to enable krb5:
# Add the option to enable krb5 cat <<EOF > build.sh #!/bin/bash set -e ./configure \\ --prefix=\$PREFIX \\ --with-unixodbc=\$PREFIX \\ --enable-krb5 \\ --with-openssl=\$PREFIX \\ --host=\$HOST \\ --build=\$BUILD make # To run this check, you need to have access to an mssql instance. # make check make install EOF
Open the
meta.yml
file included in the recipe, and turn on line numbers:vi meta.yml :set number
Make the following changes to the
meta.yaml
file:
Change the version by editing line 2 so it looks like this:
{% set version = "1.1.4" %}
Comment out line 3.
Comment out lines 12-14.
Add
krb5
to therequirements:host
section.
Your updated meta.yaml
file should look like this:
{% set name = 'freetds' %}
{% set version = "1.1.4" %}
package:
name: {{ name }}
version: {{ version }}
source:
fn: freetds-{{ version }}.tar.gz
url: ftp://ftp.freetds.org/pub/freetds/stable/freetds-{{ version }}.tar.gz
build:
number: 0
run_exports:
- {{ pin_subpackage('freetds') }}
requirements:
build:
- {{ compiler('c') }}
host:
- openssl
- krb5
- unixodbc # [unix]
- readline # [not win]
- make # [not win]
- cmake # [win]
- jom # [win]
- libiconv # [osx]
run:
# Should be taken care of by run_exports
# - unixodbc # [unix]
# - readline # [not win]
test:
commands:
- tsql -C
- conda inspect linkages freetds # [linux or osx]
- conda inspect objects freetds # [osx]
about:
home: http://www.freetds.org/
license: GPL-2
license_file: COPYING
summary: FreeTDS is a free implementation of Sybases DB-Library, CT-Library, and ODBC libraries
license_family: GPL2
doc_url: http://www.freetds.org/docs.html
dev_url: https://github.com/FreeTDS/freetds
extra:
recipe-maintainers:
- mariusvniekerk
Move up a directory, to enter
freetds-feedstock
:cd ..
Run the following command to build the package:
conda build recipe
After the build process completes successfully, you’ll have a freetds
package that supports Kerberos authentication. You can now upload this package to a channel, to use it in your projects and optionally share it with other users.
When complete, the output of the process will display the path to the package. Now you can upload it to your preferred channel, to make it available for platform users. For more information, see uploading a package to a channel.
Configuring Workbench#
Note
The following steps need to be performed on the master node of the Workbench cluster, so if you don’t have access to it, you’ll need to ask an Administrator to do the following:
On the master node, create a
freetds.conf
file for your organization that looks similar to the following (replacing placeholder values with actual values):[SERVERNAME] host = MSSQL-SERVER-DOMAIN (e.g., win.dev.anaconda.com) port = 1433 # Port for MSSQL 1433 is the default enable gssapi delegation = on realm = KERBEROS-REALM (e.g., DEV.ANACONDA.COM)
On the master node, create an
odbcinst.ini
file that looks similar to the following:[FreeTDS] Description=FreeTDS Driver for Linux & MSSQL Driver=/opt/continuum/anaconda/envs/anaconda50_py36/lib/libtdsodbc.so Setup=/opt/continuum/anaconda/envs/anaconda50_py36/lib/libtdsodbc.so UsageCount=1
Note
This example uses the Python 3.6 kernel. The location of the driver might be different if you are using a different kernel version.
On the master node, create an
odbc.ini
file that looks similar to the following:[SERVERNAME] Description = Test to SQLServer Driver = FreeTDS Servername = SERVERNAME
After you’ve created these files, use the
anaconda-enterprise-cli
to ensure they are placed in every project. If the files are all in the current working directory, you can do this by running the following command on the master node:anaconda-enterprise-cli spark-config --config /opt/continuum/.freetds.conf freetds.conf --config /etc/odbc.ini odbc.ini --config /etc/odbcinst.ini odbcinst.ini --config /etc/krb5.conf krb5.conf
Run this command to update the secret in Kubernetes:
sudo kubectl replace -f anaconda-config-files-secret.yaml
Connecting to MS SQL from a project#
Create or open a project that uses the Python 3.6 template.
To ensure that the correct libraries are installed, open the terminal within the project and run the following command:
conda install freetds
Note
Any packages you install from the command line are available during the current session only. If you want them to persist, add them to the project’s anaconda-project.yml
file. For more information, see Project configurations.
When the install is complete, run the following command to verify whether you can successfully authenticate using Kerberos, replacing
<USERNAME>
with your credentials:kinit <USERNAME>
If you were able to authenticate, run these commands to ensure you can connect to the MS SQL server, replacing
<SERVERNAME>
with the value you specified in thefreetds.conf
andodbc.ini
files.:tsql -S <SERVERNAME> isql <SERVERNAME>
If you were able to successfully authenticate using Kerberos, you can then use code such as this to connect to the MS SQL database from within a notebook session:
import pyodbc
connection = pyodbc.connect('DSN=SERVERNAME') # SERVERNAME is the same as the above commands for tsql and isql
cursor = connection.cursor()
rows = cursor.execute("select @@VERSION").fetchall()
print(rows)
cursor.close()
connection.close()
If everything is working as expected, you should see output from the print
statement.