Troubleshooting

Anaconda Enterprise provides detailed logs and monitoring information for the Kubernetes services and containers it uses. You can use the Operations Center and the Kubernetes CLI to access this information and to diagnose and debug errors that you or other users encounter while using the platform.


The Anaconda Enterprise cluster

As an Operations Center Admin, you can use the Operations Center to configure and monitor the platform.

To access the Operations Center:

  1. Log in to Anaconda Enterprise, select the Menu icon in the top right corner, and click the Administrative Console link displayed at the bottom of the slide-out window.
  2. Click Manage Resources.
  3. Log in to the Operations Center using the Administrator credentials configured after installation.

To view resource utilization:

  1. Select Servers in the menu on the left.
  2. Click on the Private IP address of the Anaconda Enterprise master node, and select SSH login as root.
_images/ssh_master_node.png

  3. To display the current resource utilization of each node in the cluster, run this command:

    kubectl top nodes --heapster-namespace=monitoring
    
_images/node_utilization.png

Note

This is actual resource utilization, not limits or requests.

  4. To view utilization and requests for a particular node, run the kubectl describe node command against the IP address for the node (listed under NAME). For example:

    kubectl describe node 172.31.25.175
    
_images/node_requests.png

  5. To view the resource utilization per pod, run this command:

    kubectl top pods --heapster-namespace=monitoring
    
_images/pod_utilization.png

  6. To view the current status of all pods in the cluster, run kubectl get pods.

_images/get_pods.png

The following table summarizes common pod states:

Status                  Description
Running                 The pod has been bound to a node, and at least one container is running.
Pending                 The pod has been accepted by the cluster, but one or more of its containers has not been created yet.
Terminating             The pod is in the process of being terminated.
Error                   An error has occurred with the pod.
Init:CrashLoopBackOff   The pod failed to start, and Kubernetes will make another attempt after a back-off delay.

  7. To view information for a particular pod, run the kubectl describe pod command against the pod (listed under NAME). For example:

    kubectl describe pod anaconda-session-89747d7fdb154b89b182d5eaa25b2e59-7f497db55wl9g
    
_images/describe_pod.png
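
The checks in the steps above can also be scripted. The following is a minimal sketch, not part of the platform: busy_nodes and non_running_pods are hypothetical helper names, and the sketch assumes the default column layout of kubectl top nodes (NAME, CPU(cores), CPU%, ...) and kubectl get pods (NAME, READY, STATUS, ...). If you add --all-namespaces, an extra NAMESPACE column shifts the fields and the column numbers need adjusting.

```shell
# Both helpers read kubectl output on stdin, so they can be tried on saved
# output without access to a cluster.

# busy_nodes THRESHOLD (hypothetical helper)
# Reads `kubectl top nodes` output and prints the name and CPU% of each node
# whose CPU utilization is at or above THRESHOLD percent.
busy_nodes() {
  awk -v limit="$1" '
    NR > 1 {                 # skip the header row
      cpu = $3               # third column is CPU%, for example "87%"
      sub(/%/, "", cpu)      # strip the percent sign before comparing
      if (cpu + 0 >= limit + 0) print $1, $3
    }'
}

# non_running_pods (hypothetical helper)
# Reads `kubectl get pods` output and prints the name and status of each pod
# whose STATUS column is anything other than "Running".
non_running_pods() {
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Example use on the master node (commands taken from the steps above):
#   kubectl top nodes --heapster-namespace=monitoring | busy_nodes 80
#   kubectl get pods | non_running_pods
```

Any pod reported by non_running_pods is a natural candidate for the kubectl describe pod command shown in the last step.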

You can also use the Operations Center Logs to gain insights into pod behavior and troubleshoot issues. See logging for more information.


User errors

If a user experiences issues within a Notebook session, have them send you the name of the pod associated with their project session. They can obtain this information by running the hostname command from within a Jupyter Notebook or terminal window.

_images/notebook_hostname.png

_images/terminal_hostname.png

You can then use the commands described above or the Operations Center’s Monitoring and Logs features to investigate the issue. See Monitoring sessions and deployments for more information.


_images/monitoring-pods.png
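
Once the user has sent you the pod name, the investigation from the command line might look like the sketch below. It is only an illustration: inspect_session_pod and the KUBECTL override variable are hypothetical names introduced here, and the sketch assumes you are logged in to the master node with kubectl available.

```shell
# KUBECTL can be overridden (for example with "echo") to preview the commands
# without running them. This variable is an assumption of this sketch.
KUBECTL="${KUBECTL:-kubectl}"

# inspect_session_pod POD_NAME (hypothetical helper)
# Shows where the pod is running, then its events and container states.
inspect_session_pod() {
  pod="$1"
  if [ -z "$pod" ]; then
    echo "usage: inspect_session_pod POD_NAME" >&2
    return 1
  fi
  # Wide output adds the node and pod IP to the usual status columns.
  $KUBECTL get pod "$pod" -o wide
  # Full description, including recent events for the pod.
  $KUBECTL describe pod "$pod"
}

# Example, using the pod name the user obtained with the hostname command:
#   inspect_session_pod anaconda-session-89747d7fdb154b89b182d5eaa25b2e59-7f497db55wl9g
```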

Tip

As an Administrator, you can also use the Authentication Center to impersonate a user to try to reproduce the problem they are experiencing.


To access the Authentication Center:

  1. Log in to Anaconda Enterprise, click the Menu icon in the top right corner, then click the Administrative Console link at the bottom of the slide-out menu.
  2. Click Manage Users.
  3. In the Manage menu on the left, click Users.
  4. On the Lookup tab, click View all users to list every user in the system, or search the user database for all users that match the criteria you enter, based on their first name, last name, or email address.
_images/impersonate_users.png

  5. Click Impersonate in the row of Actions for the user to display a table of all Applications this user has interacted with on the platform, including editor sessions and deployments.
_images/user_applications.png

  6. Click the Anaconda Platform link to interact with Anaconda Enterprise as the user.

See Managing users for more information on managing users.


Editor sessions

To troubleshoot issues with editor sessions, it helps to understand what is happening “behind the scenes”.

  • When a user starts a session, Anaconda Enterprise launches the appropriate editor for them to work with their project files. In the background, the editor environment and other services are running in Docker containers.

  • To improve startup time for projects, the editor container includes conda environments for each of the project template environments provided by the platform. These environments are stored in /opt/continuum/anaconda/envs, along with any custom environments created during the editor session.

  • The project repository is cloned into /opt/continuum/project. (Only changes to files in this directory can be saved to the repository.)

  • The anaconda-project prepare command runs, scans the project’s anaconda-project.yml file for new packages and environments, and installs them into the running session.

    During this phase, you can monitor the progress by watching the output of /opt/continuum/preparing.

    When this process completes, the /opt/continuum/prepare.log file is created.
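
The prepare phase described in the last step can be checked from a terminal inside the session. The sketch below is an assumption-laden illustration: prepare_status is a hypothetical helper name, and it assumes that /opt/continuum/preparing is a regular file that exists while preparation runs.

```shell
# prepare_status [BASE_DIR] (hypothetical helper)
# Reports the state of the prepare phase based on the two files named above.
# BASE_DIR defaults to /opt/continuum and is a parameter only to ease testing.
prepare_status() {
  base="${1:-/opt/continuum}"
  if [ -e "$base/prepare.log" ]; then
    # prepare.log is created once anaconda-project prepare has completed.
    echo "complete"
  elif [ -e "$base/preparing" ]; then
    # Progress output is still being written.
    echo "in progress"
  else
    echo "unknown"
  fi
}

# To follow the progress output live while the phase is running:
#   tail -f /opt/continuum/preparing
```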

Warning

Any changes made to the container image will be lost when the session stops, so any packages installed from the command line are available during the current session only. To persist package installs across sessions, they must be added to the project’s anaconda-project.yml file.
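
One way to record a package in anaconda-project.yml, rather than installing it only into the running container, is the add-packages subcommand of the anaconda-project CLI. The sketch below is hedged: persist_packages, ANACONDA_PROJECT, and PROJECT_DIR are hypothetical names introduced for illustration, and the sketch assumes it runs in a session terminal where the project is cloned into /opt/continuum/project. The resulting change to anaconda-project.yml still has to be saved to the repository in the usual way.

```shell
# ANACONDA_PROJECT can be overridden (for example with "echo") to preview the
# command. Both variables are assumptions of this sketch.
ANACONDA_PROJECT="${ANACONDA_PROJECT:-anaconda-project}"
PROJECT_DIR="${PROJECT_DIR:-/opt/continuum/project}"

# persist_packages PACKAGE... (hypothetical helper)
# Adds the packages to the project's anaconda-project.yml so that they are
# installed again the next time a session starts.
persist_packages() {
  if [ "$#" -eq 0 ]; then
    echo "usage: persist_packages PACKAGE..." >&2
    return 1
  fi
  # Run in a subshell so the caller's working directory is left unchanged.
  (cd "$PROJECT_DIR" && $ANACONDA_PROJECT add-packages "$@")
}

# Example:
#   persist_packages requests
```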