Configuring workload resource profiles

Each project editor session and deployment consumes compute resources within the Anaconda Enterprise cluster. If you need to run applications that require more memory or compute power than Anaconda Enterprise provides by default, you can create custom resource profiles that configure the number of CPU cores and the amount of memory available to a workload. You can then make these profiles available to all users, or restrict access to them by role, by group, or by individual user.

For example, if your installation includes nodes with GPUs, you can add a GPU resource profile so users can access the GPUs to accelerate computation within their projects, which is essential for AI/ML model training.

Resource profiles that you create are listed for users when they create or deploy a project. You can create as many resource profiles as needed.

  1. Connect to your instance of Anaconda Enterprise.

  2. View a list of your configmaps by running the following command:

    kubectl get cm
    
  3. Edit the anaconda-enterprise-anaconda-platform configmap:

    kubectl edit cm anaconda-enterprise-anaconda-platform
    

    Caution

    Anaconda recommends backing up this configmap before you edit it (for example, by running kubectl get cm anaconda-enterprise-anaconda-platform -o yaml > anaconda-platform-backup.yaml). Any changes you make affect how Anaconda Enterprise functions.

  4. Find the resource-profiles: section of the file.

  5. Add any additional resources using the following examples as a template for your resource profiles, then customize them for your environment:

    Resource profile examples
    resource-profiles:
      resource-profile:
        description: Custom resource profile (global)
        resources:
          limits:
            cpu: "2"
            memory: 4096Mi
        user_visible: true
    
      roles_profile:
        description: Custom resource profile (roles)
        resources:
          limits:
            cpu: "3"
            memory: 4096Mi
        user_visible: true
        acl:
          roles:
            - ae-creator
    
      groups_profile:
        description: Custom resource profile (group)
        resources:
          limits:
            cpu: "4"
            memory: 4096Mi
        user_visible: true
        acl:
          groups:
            - managers
    
      users_profile:
        description: Custom resource profile (users)
        resources:
          limits:
            cpu: "1"
            memory: 4096Mi
        user_visible: true
        acl:
          users:
            - user2
    
      not_visible_to_anyone:
        description: Custom resource profile (nv)
        resources:
          limits:
            cpu: "3"
            memory: 4096Mi
        user_visible: false
    
      gpu-profile:
        description: GPU resource profile
        resources:
          limits:
            cpu: "4"
            memory: 8Gi
            nvidia.com/gpu: 1
          requests:
            cpu: "1"
            memory: 2048Mi
            nvidia.com/gpu: 1
        user_visible: true
    

    Note

    Resource profiles display their description: as their name. Profiles are listed in alphabetical order, after the default profile.

  6. (Optional) By default, CPU sessions and deployments are allowed to run on GPU nodes. To reserve your GPU nodes for the sessions and deployments that require them, comment out the affinity: specification in the resource-profiles: section of the file.
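    The exact affinity: specification differs between installations, so the following is only an illustrative sketch (hypothetical keys and values; compare it against the affinity: block in your own file before commenting it out). Once commented out, the block looks similar to this:

    Commented-out affinity specification (illustrative)
    # Illustrative structure only; match this against your own file.
    # affinity:
    #   nodeAffinity:
    #     preferredDuringSchedulingIgnoredDuringExecution:
    #       - weight: 1
    #         preference:
    #           matchExpressions:
    #             - key: nvidia.com/gpu
    #               operator: Exists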

  7. (Optional) If you need to schedule user workloads on a specific node, add a node_selector to your resource profile. Node selectors are useful when a cluster mixes CPU types, such as Intel and AMD, or GPU types, such as Tesla V100 and P100. To enable a node selector, add node_selector to the bottom of your resource profile, with the model: value matching the label you have applied to your worker node (for example, with kubectl label nodes <NODE_NAME> model=v100).

    GPU node selector example
    gpu-profile:
      description: GPU resource profile
      resources:
        limits:
          cpu: "4"
          memory: 8Gi
          nvidia.com/gpu: 1
        requests:
          cpu: "1"
          memory: 2048Mi
          nvidia.com/gpu: 1
      user_visible: true
      node_selector:
        model: v100
    
  8. Save your changes to the file.

  9. Restart the workspace and deploy services (the ap-workspace and ap-deploy pods) by running the following command:

    kubectl delete pods -l 'app in (ap-workspace, ap-deploy)'
    

Anaconda’s Kubernetes Helm chart consolidates application definitions, dependencies, and configuration settings into a single values.yaml file. For more information about the Helm chart, see Helm values template.

  1. Connect to your instance of Anaconda Enterprise.

  2. Save your current configuration by running the extract_config.sh script:

    # Replace <NAMESPACE> with the namespace Anaconda Enterprise is installed in
    NAMESPACE=<NAMESPACE> ./extract_config.sh
    

    Note

    The extract_config.sh script creates a file called helm_values.yaml and saves it in the directory where the script was run.

  3. Verify that the helm_values.yaml file contains your current cluster configuration settings.

  4. Create a new section at the bottom of the helm_values.yaml file and add any additional resources using the following examples as a template for your resource profiles, then customize them for your environment:

    Resource profile examples
    resource-profiles:
      resource-profile:
        description: Custom resource profile (global)
        resources:
          limits:
            cpu: "2"
            memory: 4096Mi
        user_visible: true
    
      roles_profile:
        description: Custom resource profile (roles)
        resources:
          limits:
            cpu: "3"
            memory: 4096Mi
        user_visible: true
        acl:
          roles:
            - ae-creator
    
      groups_profile:
        description: Custom resource profile (group)
        resources:
          limits:
            cpu: "4"
            memory: 4096Mi
        user_visible: true
        acl:
          groups:
            - managers
    
      users_profile:
        description: Custom resource profile (users)
        resources:
          limits:
            cpu: "1"
            memory: 4096Mi
        user_visible: true
        acl:
          users:
            - user2
    
      not_visible_to_anyone:
        description: Custom resource profile (nv)
        resources:
          limits:
            cpu: "3"
            memory: 4096Mi
        user_visible: false
    
      gpu-profile:
        description: GPU resource profile
        resources:
          limits:
            cpu: "4"
            memory: 8Gi
            nvidia.com/gpu: 1
          requests:
            cpu: "1"
            memory: 2048Mi
            nvidia.com/gpu: 1
        user_visible: true
    

    Note

    Resource profiles display their description: as their name. Profiles are listed in alphabetical order, after the default profile.

  5. (Optional) By default, CPU sessions and deployments are allowed to run on GPU nodes. To reserve your GPU nodes for the sessions and deployments that require them, comment out the affinity: specification in the helm_values.yaml file.
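    The exact affinity: specification differs between installations, so the following is only an illustrative sketch (hypothetical keys and values; compare it against the affinity: block in your own file before commenting it out). Once commented out, the block looks similar to this:

    Commented-out affinity specification (illustrative)
    # Illustrative structure only; match this against your own file.
    # affinity:
    #   nodeAffinity:
    #     preferredDuringSchedulingIgnoredDuringExecution:
    #       - weight: 1
    #         preference:
    #           matchExpressions:
    #             - key: nvidia.com/gpu
    #               operator: Exists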

  6. (Optional) If you need to schedule user workloads on a specific node, add a node_selector to your resource profile. Node selectors are useful when a cluster mixes CPU types, such as Intel and AMD, or GPU types, such as Tesla V100 and P100. To enable a node selector, add node_selector to the bottom of your resource profile, with the model: value matching the label you have applied to your worker node (for example, with kubectl label nodes <NODE_NAME> model=v100).

    GPU node selector example
    gpu-profile:
      description: GPU resource profile
      resources:
        limits:
          cpu: "4"
          memory: 8Gi
          nvidia.com/gpu: 1
        requests:
          cpu: "1"
          memory: 2048Mi
          nvidia.com/gpu: 1
      user_visible: true
      node_selector:
        model: v100
    
  7. Perform a Helm upgrade by running the following command:

    helm upgrade --values ./helm_values.yaml anaconda-enterprise ./Anaconda-Enterprise/
    

Open a project and view its settings to verify that the resource profiles you added appear in the Resource Profile dropdown.