Mirroring channels and packages¶
Anaconda Enterprise enables you to create a local copy of a repository so users can access the packages from a centralized, on-premises location.
The mirror can be complete, partial, or include specific packages or types of packages. You can also create a mirror in an air-gapped environment to help improve performance and security.
Note
It can take hours to mirror the full repository.
Before you can use Anaconda Enterprise’s syncing tools to configure local mirrors for channels and packages, you need to configure access to the source of the packages to be mirrored: either an online repository or, for an air-gapped installation, a tarball.
Prerequisites:
Types of mirroring:
To create a complete mirror, see Mirroring the Anaconda repository or Mirroring a PyPI repository.
To create a partial mirror, see Mirroring specific packages.
To mirror a repository in a system without internet access, see Mirroring in an air-gapped environment.
To share mirrors, see Configuring Anaconda Enterprise and Sharing channels.
Configuration options:
Log in to Anaconda Enterprise as an existing user using the following command:
Note
If Anaconda Enterprise 5 is installed in a proxied environment, see Mirroring in a proxied environment for information on setting the NO_PROXY variable.
Mirroring the Anaconda repository¶
We recommend the following process as a best practice for mirroring the Anaconda repository.
Instead of using the default anaconda.yaml file included in the mirror tool installation, create two yaml files, one for mirroring the main channel, and another for mirroring the free channel.
Example main.yaml file:
Example free.yaml file:
If you saved both of these files to the home directory, you can use the following commands to mirror these channels. Otherwise, amend the path so that it corresponds to where you saved the files:
This mirrors all of the packages from these channels in the Anaconda repository. If the channel doesn’t already exist, it will be automatically created and shared with all authenticated users. You can customize the permissions on the mirrored packages by sharing the channel.
Tip
If you plan to mirror these channels on a regular basis, consider adding the -c flag to get a clean mirror each time. This automatically removes from your internal repository any packages that have been removed from the Anaconda repository since the previous mirror, excluding any packages your organization has blacklisted.
Verify that the mirror was successful by logging into your account and navigating to the Packages tab. You should see a list of the mirrored packages.
Mirroring a PyPI repository¶
The full PyPI mirror size is currently close to 4 TB, so ensure that your file storage location has sufficient disk space before proceeding. Rather than mirror the entire PyPI repository, you can use a configuration file such as $PREFIX/etc/anaconda-platform/mirrors/pypi.yaml to customize the mirror behavior and specify the subset of packages you want to mirror.
To create a PyPI mirror:
This command loads the packages from https://pypi.org into the pypi user account. Mirrored packages can be viewed at <https://anaconda.example.com>/repository/pypi/pypi/simple/, replacing <https://anaconda.example.com> with the actual URL of your Anaconda Enterprise installation. (The second pypi in the URL should match the user configuration value described below.)
The following configuration options are available for you to customize your configuration file:
Name | Description |
---|---|
| The local user under which the PyPI packages are imported. Default: |
| A list of packages to mirror. Only packages listed are mirrored. If this is set, |
| A list of packages to mirror. Only packages listed are mirrored. If the list is empty, all packages are checked. Default: |
| A list of packages to skip. The packages listed are ignored. Default: |
| Only download the latest versions of the packages. Default: |
| The URL of the PyPI mirror. |
| A custom value for XML RPC URL. If this value is present, it takes precedence over the URL built using |
| A custom value for the simple index URL. If this value is present, it takes precedence over the URL built using |
| Whether to use the XML RPC API as specified by PEP381. If this is set to |
| Whether to use the serial number provided by the XML RPC API. Only packages updated since the last serial saved are checked. If this is set to false, all PyPI packages are checked for updates. Default: |
| Create the mirror user as an organization instead of a regular user account. All superusers are added to the “Owners” group of the organization. Default: |
Note that all mirrored PyPI-like channels are publicly available to pull packages from both inside and outside the cluster (i.e. no auth token required).
EXAMPLE:
Configuring pip¶
To configure pip to use this new mirror, create pip.conf as follows, replacing <https://anaconda.example.com> with the actual URL of your Anaconda Enterprise installation:
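As one way to produce such a file, the following minimal Python sketch renders pip.conf content that points pip's default index at the mirror. It assumes pip's standard [global] section and index-url setting, and the URL layout <base>/repository/pypi/<user>/simple/ described above. The function names are hypothetical helpers, not part of Anaconda Enterprise.

```python
import configparser
import io


def simple_index_url(base_url: str, user: str = "pypi") -> str:
    """Build the simple index URL: <base>/repository/pypi/<user>/simple/.

    `user` must match the user value in the mirror configuration
    (the text above describes mirroring into the pypi user account).
    """
    return f"{base_url.rstrip('/')}/repository/pypi/{user}/simple/"


def render_pip_conf(base_url: str, user: str = "pypi") -> str:
    """Return pip.conf text that sets the mirror as pip's default index."""
    config = configparser.ConfigParser()
    config["global"] = {"index-url": simple_index_url(base_url, user)}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Placeholder host; substitute the URL of your own installation.
print(render_pip_conf("https://anaconda.example.com"))
```

Write the returned text to the pip.conf location appropriate for your platform.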
To configure Anaconda Enterprise sessions and deployments to automatically use the pip.conf, run the following command:
Alternatively, you can use the --index-url flag directly when invoking pip. For example, run pip install --index-url <https://anaconda.example.com>/repository/pypi/pypi/simple/ <package_name>, replacing <https://anaconda.example.com> with the actual URL of your Anaconda Enterprise installation, and <package_name> with the name of a package that is in your local mirror. In the example URL, the second pypi should match the user configuration value described above.
For more specific information on configuring pip, refer to the official documentation at https://pip.pypa.io/en/stable/user_guide/#config-file.
Mirroring specific packages¶
You may not wish to mirror all packages. In this case, you can either specify which platforms or specific packages you want to mirror, or use the whitelist, blacklist, or license_blacklist functionality to control which packages are mirrored, by editing the provided mirror files. You cannot combine these methods. For more information, see Mirror configuration options.
Mirroring R packages¶
An example configuration file for mirroring R packages is also provided:
Mirroring in an air-gapped environment¶
To mirror the repository in a system with no internet access, create a local copy of the repository by extracting the air-gapped tarball, then point cas-sync-api-v5 to the extracted tarball. In this example we will extract to /tmp:
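The extraction step can also be scripted. The following is a minimal Python sketch using the standard tarfile module. The function name is hypothetical and the tarball path is a placeholder, since the actual file name depends on the archive you were given.

```python
import tarfile
from pathlib import Path


def extract_mirror_tarball(tarball: str, destination: str = "/tmp") -> Path:
    """Extract the mirror tarball into `destination` and return that directory."""
    target = Path(destination)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball) as archive:
        # Prefer the safer "data" extraction filter where this Python supports it.
        if hasattr(tarfile, "data_filter"):
            archive.extractall(target, filter="data")
        else:
            archive.extractall(target)
    return target


# Example call (placeholder path, adjust to your tarball):
# extract_mirror_tarball("/path/to/mirror-tarball.tar", "/tmp")
```

If the archive contains a top-level mirror directory, as in the example here, the packages end up under <destination>/mirror/pkgs.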
Now you have a local file-system repository located at /tmp/mirror/pkgs. You can mirror this repository by editing <path to cas-mirror>/etc/anaconda-platform/mirrors/anaconda.yaml to contain:
And then run the command:
This mirrors the contents of the local file-system repository to your Anaconda Enterprise installation under the username anaconda.
Configuring Anaconda Enterprise¶
After creating the mirror, edit your Anaconda Enterprise configuration to add this new mirrored channel to the default Anaconda Enterprise channels and make the packages available to users.
Replace <anaconda.example.com> with the actual URL of your Anaconda Enterprise installation.
Note
The ap-workspace pod must be restarted for the configuration change to take effect on new project editor sessions.
To update the Anaconda Enterprise server with your changes, you’ll need to do the following:
Run the following command in an interactive shell to identify the pod associated with the workspace services:
Restart the workspace services by running the following command:
SSL verification¶
The mirroring tool uses two different settings for configuring SSL verification.
When the mirroring tool connects to its destination, it uses the ssl_verify setting from anaconda-enterprise-cli to determine how to validate certificates. For example, to use a custom certificate authority:
The mirroring tool uses conda’s configuration to determine how to validate certificates when connecting to the source that it is pulling packages from. For example, to disable certificate validation when connecting to the source:
Mirroring in a proxied environment¶
If Anaconda Enterprise 5 is installed in a proxied environment, set the NO_PROXY variable. This ensures the mirroring tool does not use the proxy when communicating with the repository service, and prevents errors such as Max retries exceeded, Cannot connect to proxy, and Tunnel connection failed: 503 Service Unavailable.
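To check whether a given NO_PROXY value would exempt your Anaconda Enterprise host, the following minimal sketch uses the matching rules in Python's standard library. This is an illustration only: the mirroring tool's own HTTP stack may match hosts slightly differently. The host names are placeholders and the helper name is hypothetical.

```python
from urllib.request import proxy_bypass_environment


def bypasses_proxy(host: str, no_proxy_value: str) -> bool:
    """Return True if `host` is exempt from proxying under `no_proxy_value`.

    `no_proxy_value` is a comma-separated list of hosts or domain suffixes,
    in the same form you would assign to the NO_PROXY variable.
    """
    return bool(proxy_bypass_environment(host, {"no": no_proxy_value}))


# Placeholder values; substitute the FQDN of your installation.
no_proxy = "localhost,anaconda.example.com"
print(bypasses_proxy("anaconda.example.com", no_proxy))
print(bypasses_proxy("pypi.org", no_proxy))
```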
Platform-specific mirroring¶
By default, the cas-sync-api-v5 tool mirrors all platforms. If you do not need all platforms, edit the YAML file to specify the platform(s) you want mirrored:
Note
The platform argument is evaluated before any other argument.
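To illustrate what it means for the platform argument to be evaluated first, here is a minimal Python sketch. It is a conceptual model, not the implementation used by cas-sync-api-v5. The record fields (name, platform) and the function name are hypothetical, and the platform strings assume conda's usual names such as linux-64.

```python
from typing import Dict, Iterable, List

Package = Dict[str, str]


def filter_by_platform(packages: Iterable[Package], platforms: Iterable[str]) -> List[Package]:
    """Keep only packages built for one of the requested platforms.

    An empty platform list is treated as "mirror all platforms",
    matching the default behaviour described above.
    """
    wanted = set(platforms)
    if not wanted:
        return list(packages)
    return [pkg for pkg in packages if pkg["platform"] in wanted]


catalog = [
    {"name": "numpy", "platform": "linux-64"},
    {"name": "numpy", "platform": "win-64"},
    {"name": "numpy", "platform": "osx-64"},
]

# Any further filtering would operate only on this reduced list.
linux_only = filter_by_platform(catalog, ["linux-64"])
print([f'{pkg["name"]} ({pkg["platform"]})' for pkg in linux_only])
```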
Package-specific mirroring¶
In some cases you may want to mirror only a small subset of the repository. Rather than blacklisting a long list of packages you do not want mirrored, you can instead simply enumerate the list of packages you DO want mirrored.
Note
This argument cannot be used with the blacklist, whitelist or license_blacklist arguments; it can only be combined with platform-specific and version-specific mirroring.
EXAMPLE:
This example mirrors only three packages: Accelerate, PyQt, and Zope. All other packages are ignored.
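Because an explicit package list cannot be mixed with the blacklist, whitelist, or license_blacklist arguments, it can be useful to check a configuration before running a long mirror job. The following Python sketch shows one way to do that on a configuration already loaded into a dictionary. The key names pkg_list and platforms are assumptions made for illustration (only blacklist, whitelist, and license_blacklist are named in this section), and the function is hypothetical.

```python
from typing import Any, Dict

# Assumed key for the explicit package list; adjust to your mirror file.
PACKAGE_LIST_KEY = "pkg_list"
EXCLUSIVE_KEYS = ("blacklist", "whitelist", "license_blacklist")


def check_mirror_config(config: Dict[str, Any]) -> None:
    """Raise ValueError if an explicit package list is mixed with list filters."""
    if not config.get(PACKAGE_LIST_KEY):
        return
    clashes = [key for key in EXCLUSIVE_KEYS if config.get(key)]
    if clashes:
        raise ValueError(
            f"{PACKAGE_LIST_KEY} cannot be combined with: {', '.join(clashes)}"
        )


# Accepted: an explicit package list together with a platform restriction.
check_mirror_config({"pkg_list": ["accelerate", "pyqt", "zope"], "platforms": ["linux-64"]})

# Rejected: an explicit package list together with a blacklist.
try:
    check_mirror_config({"pkg_list": ["accelerate"], "blacklist": ["bzip2"]})
except ValueError as error:
    print(error)
```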
Python version-specific mirroring¶
You can restrict the mirror to packages built for one or more specified Python versions.
EXAMPLE:
This example mirrors only Anaconda packages built for Python 3.3.
License blacklist mirroring¶
The mirroring script supports license blacklisting for the following license families:
EXAMPLE:
This example mirrors all the packages in the repository EXCEPT those that are GPL2-, GPL3-, or BSD-licensed, because those three licenses have been blacklisted.
Blacklist mirroring¶
The blacklist allows access to all packages EXCEPT those explicitly listed. If the license_blacklist and blacklist arguments are combined, license_blacklist is evaluated first, and blacklist is a supplemental modifier.
EXAMPLE:
This example mirrors the entire repository EXCEPT the bzip2, Tk, and OpenSSL packages.
Whitelist mirroring¶
The whitelist argument adds or includes packages that would be otherwise excluded by the blacklist and/or license_blacklist functions.
EXAMPLE:
This example mirrors the entire repository EXCEPT any GPL2- or GPL3-licensed packages, but includes readline, despite the fact that it is GPL3-licensed.
Combining multiple mirror configurations¶
You may find that combining two or more of the arguments above is the easiest way to get the exact combination of packages that you want.
Note
The platform argument is evaluated before any other argument.
EXAMPLE: This example mirrors only Linux-64 distributions of the dnspython, Shapely and GDAL packages:
If the license_blacklist and blacklist arguments are combined, license_blacklist is evaluated first, and blacklist is a supplemental modifier.
EXAMPLE: In this example, the mirror configuration does not mirror GPL2-licensed packages. It does not mirror the GPL3-licensed package pyqt because it has been blacklisted. It does mirror all other packages in the repository:
If the blacklist and whitelist arguments are both employed, the blacklist is evaluated first, with the whitelist functioning as a modifier.
EXAMPLE: This example mirrors all packages in the repository except astropy and pygments. Despite being listed on the blacklist, accelerate is mirrored because it is listed on the whitelist.
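The evaluation order described in this section (license_blacklist first, then blacklist, then whitelist as an override) can be modelled in a few lines of Python. This is a conceptual sketch only and not the mirroring tool's code. The record fields, the function name, and the license values in the sample catalog are illustrative assumptions.

```python
from typing import Dict, Iterable, List

Package = Dict[str, str]


def select_packages(
    packages: Iterable[Package],
    license_blacklist: Iterable[str] = (),
    blacklist: Iterable[str] = (),
    whitelist: Iterable[str] = (),
) -> List[str]:
    """Return the names of packages that would be mirrored."""
    banned_licenses = set(license_blacklist)
    banned_names = set(blacklist)
    rescued_names = set(whitelist)

    selected = []
    for pkg in packages:
        # Step 1: the license blacklist is evaluated first.
        excluded = pkg["license_family"] in banned_licenses
        # Step 2: the blacklist supplements the license blacklist.
        excluded = excluded or pkg["name"] in banned_names
        # Step 3: the whitelist re-includes anything excluded so far.
        if pkg["name"] in rescued_names:
            excluded = False
        if not excluded:
            selected.append(pkg["name"])
    return selected


# Illustrative catalog; license values are sample data for this sketch.
catalog = [
    {"name": "astropy", "license_family": "BSD"},
    {"name": "pygments", "license_family": "BSD"},
    {"name": "accelerate", "license_family": "PROPRIETARY"},
    {"name": "readline", "license_family": "GPL3"},
    {"name": "pyqt", "license_family": "GPL3"},
]

print(select_packages(catalog, blacklist=["astropy", "pygments", "accelerate"],
                      whitelist=["accelerate"]))
print(select_packages(catalog, license_blacklist=["GPL2", "GPL3"], whitelist=["readline"]))
print(select_packages(catalog, license_blacklist=["GPL2"], blacklist=["pyqt"]))
```

The three calls correspond to the blacklist plus whitelist, license_blacklist plus whitelist, and license_blacklist plus blacklist examples in this section.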