Project configurations#
In Data Science & AI Workbench, projects are structured around a core configuration file called anaconda-project.yml. This file is crucial for orchestrating a project’s components for deployment and ensuring operational consistency over time. There are several parameters that must be included within each project’s anaconda-project.yml file to ensure that it operates as intended:
Packages - You must specify all conda or pip packages the project requires in its anaconda-project.yml file. By default, Workbench uses packages from its internal repository to create project environments. However, it is possible to use packages from external repositories.
Environment - You must define at least one named environment to manage the project’s packages and their dependencies, which keeps the project stable across different settings. Projects use template environments when they are initially created, which can be updated or replaced. For more information about template environments, see Configuring persistent environments and sample projects.
Commands - You must define at least one command to deploy your project and run jobs in its intended environment.
Environment variables - If necessary, set the environment variables that control how your project interacts with external resources and services.
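To make the four parameters above concrete, the following is a minimal sketch, not a Workbench-provided tool. It writes an illustrative anaconda-project.yml to a temporary directory and checks that the three mandatory sections are present. The project name, package versions, notebook file name, variable name, and the check_sections helper are all hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Sketch: write an illustrative anaconda-project.yml and confirm that the
# required sections exist. Every name and value here is a made-up example.
set -euo pipefail

workdir="$(mktemp -d)"
cat > "${workdir}/anaconda-project.yml" <<'EOF'
name: example_project            # hypothetical project name

packages:                        # conda packages the project needs
  - python=3.11
  - notebook

env_specs:                       # at least one named environment
  default:
    description: Example environment

commands:                        # at least one command
  default:
    notebook: example.ipynb      # hypothetical notebook file

variables:                       # optional environment variables
  EXAMPLE_API_URL:
    default: https://example.com/api
EOF

# Print each mandatory top-level section that is missing from the file.
# Returns 0 when all are present and 1 otherwise.
check_sections() {
  local file="$1" missing=0 key
  for key in packages env_specs commands; do
    if ! grep -q "^${key}:" "${file}"; then
      echo "missing section: ${key}"
      missing=1
    fi
  done
  return "${missing}"
}

check_sections "${workdir}/anaconda-project.yml" && echo "all required sections present"
```

The check only looks for top-level keys, so it catches a missing section but not a malformed one. The anaconda-project commands described later on this page remain the safer way to edit the file.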
Caution
It is possible to edit a project’s anaconda-project.yml file manually to add the required configurations; however, this method is prone to human error, especially for users who are unfamiliar with .yml file formatting.
Instead, Anaconda recommends using anaconda-project commands from a terminal within your project to update its configurations when possible. For more information, see the official anaconda-project documentation.
All anaconda-project commands must be run from the lab_launch environment! Enter the lab_launch environment by running the following command in a project terminal:
conda activate lab_launch
Once you are finished configuring your project, Anaconda recommends that you add a lock file to your project to ensure its reproducibility across different environments at scale.
Configuring project environments#
The conda environments in standard project templates are pre-solved to reduce initialization time when additional packages are added. However, you might want to create an environment specifically for your project.
To create a new environment with specific packages and add it to your project:
Create a new project.
Start a session.
Open a terminal within your session editor.
Create an environment and include the packages you need for it by running the following command:
# Replace <ENV_NAME> with the name of the environment you are creating to add to your project's configuration
# Replace <PACKAGE_NAME> with the name of the packages you want to add to your project's configuration
anaconda-project add-env-spec --name <ENV_NAME> <PACKAGE_NAME> <PACKAGE_NAME>
Remove the template environment that you used to create your project by running the following command:
# Replace <TEMPLATE_ENV> with the name of the template environment you used to initially create the project
anaconda-project remove-env-spec --name <TEMPLATE_ENV>
Stop and re-start the project.
Caution
To edit and run notebooks in Jupyter Notebook or JupyterLab, you must include the notebook package in your project’s environment.
Verify your environment is initialized for notebooks
Open a terminal within your session editor.
Run the following commands:
cd /opt/continuum/
ls
If the environment is being initialized, you will see a file named preparing. Once initialization is complete, you will see a file named prepare.log. To troubleshoot environment initialization, view the log from the terminal by running the following command:
cat /opt/continuum/prepare.log
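If you would rather not re-run ls by hand, the check above can be wrapped in a small polling loop. This is a sketch under the assumption that the preparing file disappears once initialization finishes, as the description above implies. The function name wait_for_prepare, the 5-second polling interval, and the default 600-second timeout are hypothetical choices.

```shell
#!/usr/bin/env bash
# Sketch: wait until the "preparing" marker file is gone, then print the end
# of prepare.log. The timeout and polling interval are arbitrary examples.
set -euo pipefail

wait_for_prepare() {
  local dir="${1:-/opt/continuum}"   # directory named in the steps above
  local timeout="${2:-600}"          # hypothetical limit, in seconds
  local waited=0

  while [ -e "${dir}/preparing" ]; do
    if [ "${waited}" -ge "${timeout}" ]; then
      echo "environment still initializing after ${timeout}s" >&2
      return 1
    fi
    sleep 5
    waited=$((waited + 5))
  done

  if [ -f "${dir}/prepare.log" ]; then
    tail -n 20 "${dir}/prepare.log"
  else
    echo "no prepare.log found in ${dir}" >&2
    return 1
  fi
}

# Usage, from a session terminal:
#   wait_for_prepare
```

The function only reports what is in the log; reading it for solver or download errors is still up to you.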
Configuring project packages#
A package added to a project’s configuration file persists for future project sessions and deployments. This is different from using conda install to add a package to the conda environment during a session, which affects the project during the current session only.
Note
If your network is air-gapped (operates without internet access), your organization must mirror the Anaconda repository into its internal package repository to make packages available to users.
Adding conda packages#
To add a conda package to your project’s anaconda-project.yml file:
Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.
Run the following command:
# Replace <PACKAGE_NAME> with the name of the packages you want to add to your project's configuration
anaconda-project add-packages <PACKAGE_NAME> <PACKAGE_NAME>
To add packages from a specific channel, include the channel name in the command:
# Replace <CHANNEL_NAME> with the name of the channel that contains the packages you want to add
# Replace <PACKAGE_NAME> with the name of the packages you want to add to your project's configuration
anaconda-project add-packages --channel <CHANNEL_NAME> <PACKAGE_NAME> <PACKAGE_NAME>
To use packages from an external repository, you will need to specify the full channel URL in the command:
# Replace <CHANNEL_URL> with the address of the channel that contains the packages you want to add
# Replace <PACKAGE_NAME> with the name of the packages you want to add to your project's configuration
anaconda-project add-packages <PACKAGE_NAME> --channel <CHANNEL_URL>
The command may take a moment to run as it solves the environment to collect dependencies and download packages. Once complete, the added packages appear in the project’s anaconda-project.yml file. If the file is open when you run the command, close and reopen it to view your changes.
Stop and re-start the project session.
Adding pip packages#
If your project requires you to pip install a package, you can use anaconda-project to add it to your project’s configuration.
To add a pip package to your project’s anaconda-project.yml file:
Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.
Run the following command:
# Replace <PACKAGE_NAME> with the name of the packages you want to add to your project's configuration
anaconda-project add-packages --pip <PACKAGE_NAME> <PACKAGE_NAME>
Stop and re-start the project session.
Removing packages#
To remove a package from your project’s anaconda-project.yml file:
Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.
Run the following command:
# Replace <PACKAGE_NAME> with the name of the packages you want to remove from your project's configuration
anaconda-project remove-packages <PACKAGE_NAME> <PACKAGE_NAME>
Stop and re-start the project session.
Configuring project environment variables#
Environment variables are key parameters that manage dynamic settings like API keys, database URLs, and memory limits without modifying the codebase. These variables are essential for deploying projects consistently.
To add environment variables with a default value to your project’s anaconda-project.yml file:
Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.
Run the following command:
# Replace <VALUE> with the content of your environment variable
# Replace <VARIABLE> with the variable name
anaconda-project add-variable --default=<VALUE> <VARIABLE>
Stop and re-start the project session.
For more information and advanced command arguments, see Working with environment variables in the official anaconda-project documentation.
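As an illustration of how a project script might consume such a variable without hard-coding its value, here is a minimal bash sketch. It assumes the variable has been exported into the process environment when the command runs. The variable name EXAMPLE_DB_URL and the helper require_var are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read a project-defined environment variable and fail clearly when
# it is missing. EXAMPLE_DB_URL is a made-up variable name.
set -euo pipefail

require_var() {
  local name="$1"
  # ${!name} is bash indirect expansion: the value of the variable whose
  # name is stored in $name. The ":-" default keeps "set -u" from aborting.
  if [ -z "${!name:-}" ]; then
    echo "required environment variable ${name} is not set" >&2
    return 1
  fi
  printf '%s\n' "${!name}"
}

# Usage inside a deployment script:
#   db_url="$(require_var EXAMPLE_DB_URL)"
```

Failing early with a named variable in the message tends to be easier to debug than a connection error later in the script.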
Configuring project commands#
To deploy a project in Workbench, it must contain at least one appropriate deployment command defined in its anaconda-project.yml file. These commands specify how the project’s components, such as notebooks, scripts, or generic web frameworks, should be executed when the project is deployed.
To add a command to your project’s anaconda-project.yml file:
Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.
Run the following command:
# Replace <CMD_NAME> with a name for your deployment command
# Replace <COMMAND> with the project filename that should be executed
anaconda-project add-command <CMD_NAME> <COMMAND>
Stop and re-start the project session.
For more information and advanced command arguments, see Working with commands in the official anaconda-project documentation.
Example deployment commands
The following are example deployment commands you can use:
For a Notebook:
commands:
  default:
    notebook: <FILE_NAME>.ipynb
For a Panel dashboard:
commands:
  default:
    unix: panel serve <SCRIPT_OR_NOTEBOOK_FILE>
    supports_http_options: True
For a generic script or web framework, including Python or R:
commands:
  default:
    unix: bash <YOUR-SCRIPT>.sh
    supports_http_options: true

commands:
  default:
    unix: python <YOUR-SCRIPT>.py
    supports_http_options: true

commands:
  default:
    unix: Rscript <YOUR-SCRIPT>.R
    supports_http_options: true
Validating project deployment commands#
To validate the anaconda-project.yml file and verify that your project will deploy successfully:
Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.
Prepare the environment and test the deployment command by running the following commands:
# Replace <ENV_NAME> with the name of the project's environment
# Replace <COMMAND> with the deployment command that you want to test
anaconda-project prepare --env-spec <ENV_NAME>
anaconda-project run <COMMAND>
Any errors preventing a successful deployment are displayed in the terminal.
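The two validation commands above can be chained so that the run step is skipped when the prepare step fails. The following is a sketch, not part of anaconda-project itself. The function name validate_deployment and the AP_BIN override (which lets you dry-run the wrapper with a stand-in command) are hypothetical conveniences.

```shell
#!/usr/bin/env bash
# Sketch: run "prepare" and then "run", stopping at the first failure.
set -euo pipefail

# Command to invoke. Overridable so the wrapper can be exercised with a stub.
AP_BIN="${AP_BIN:-anaconda-project}"

validate_deployment() {
  local env_name="$1" command_name="$2"

  if ! command -v "${AP_BIN}" >/dev/null 2>&1; then
    echo "${AP_BIN} not found; activate the lab_launch environment first" >&2
    return 127
  fi

  # Solve and build the environment named by the first argument.
  if ! "${AP_BIN}" prepare --env-spec "${env_name}"; then
    echo "prepare failed for environment ${env_name}" >&2
    return 1
  fi

  # Run the deployment command named by the second argument.
  if ! "${AP_BIN}" run "${command_name}"; then
    echo "command ${command_name} failed" >&2
    return 1
  fi
}

# Usage, from the lab_launch environment:
#   validate_deployment <ENV_NAME> <COMMAND>
```

A command that starts a web server keeps running until you stop it, so in that case the wrapper only returns after you interrupt the process.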
Testing project deployments#
Once deployment commands are added to your project, you can test the deployment using the test_deployment command. This sets up a mini web application, allowing you to preview your deployment locally using a port within your session.
To test a project deployment:
Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.
Test a deployment command you’ve added to your project by running the following command:
# Replace <COMMAND> with an available deployment command
test_deployment <COMMAND>
Note
If you do not supply a deployment command to test, the first command listed under the commands: section of the project’s anaconda-project.yml configuration file will be run.
Navigate to the web address returned by the command to verify your project deployed successfully.
Locking project configurations#
Project locking is a crucial step in ensuring your project is reproducible across multiple deployments at scale. It is best practice to lock your project once you have finalized configurations for your project, or if you are preparing to transition to a production or public deployment. For more information, see Project reproducibility in Workbench.
To lock your anaconda-project.yml file configurations to a fixed state:
Open your project.
If necessary, start a session.
Open a terminal within your session editor.
Verify that you are in the lab_launch environment.
Lock your project configurations by running the following command:
anaconda-project lock
This instructs conda to solve the project’s environment, lock all packages and their dependencies to their current versions, and generate an anaconda-project-lock.yml file for your project.
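After editing a project's configuration, it can be useful to check whether the lock file was regenerated afterward. The sketch below uses a rough heuristic based on file modification times. It is not how anaconda-project itself decides whether a lock is current, and the function name lock_is_current is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: heuristic check that anaconda-project-lock.yml exists and is not
# older than anaconda-project.yml. Compares modification times only.
set -euo pipefail

lock_is_current() {
  local dir="${1:-.}"
  local spec="${dir}/anaconda-project.yml"
  local lock="${dir}/anaconda-project-lock.yml"

  if [ ! -f "${lock}" ]; then
    echo "no lock file found; run: anaconda-project lock" >&2
    return 1
  fi
  # "-nt" is true when the left-hand file was modified more recently.
  if [ "${spec}" -nt "${lock}" ]; then
    echo "anaconda-project.yml changed after the last lock; re-run: anaconda-project lock" >&2
    return 1
  fi
  echo "lock file is not older than anaconda-project.yml"
}

# Usage, from the project directory:
#   lock_is_current
```

Because the comparison relies on timestamps, copying or checking out files can produce false results. Treat the output as a reminder rather than a guarantee.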