Deploy Nebari on Existing Kubernetes Clusters

Nebari can be deployed on existing Kubernetes clusters across major cloud providers (AWS EKS, Azure AKS, Google Cloud GKE) or custom Kubernetes installations. This guide walks you through the process step by step.

info

For bare metal deployments using K3s, see Deploy Nebari on Bare Metal with K3s.

Prerequisites

Before starting, ensure you have:

  • An existing Kubernetes cluster (EKS, AKS, GKE, or custom)
  • kubectl configured with access to your cluster
  • Nebari CLI installed (installation guide)
  • Appropriate node groups/pools with sufficient resources
  • DNS domain for your Nebari deployment
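A quick way to confirm the CLI prerequisites is a small shell loop; `kubectl` and `nebari` are the tool names assumed above:

```shell
# Check that the required command-line tools are on PATH.
for tool in kubectl nebari; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: MISSING - install it before continuing"
  fi
done
```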

Overview

The deployment process follows these general steps:

  1. Evaluate your existing infrastructure
  2. Create/verify appropriate node groups
  3. Configure kubectl context
  4. Initialize Nebari configuration
  5. Configure Nebari for your cluster
  6. Deploy Nebari

The sections below walk through this process, using AWS EKS as the worked example.

Evaluating the Infrastructure

Before deploying Nebari, review your existing cluster to ensure it meets the requirements.

AWS EKS Requirements

For this example, we assume you have an existing EKS cluster. If you need to create one, follow AWS's EKS setup guide.

Your existing EKS cluster should have:

  • VPC with at least three subnets in different Availability Zones
  • Subnets configured to automatically assign public IP addresses
  • IAM Role with the following policies:
    • AmazonEKSWorkerNodePolicy
    • AmazonEC2ContainerRegistryReadOnly
    • AmazonEKS_CNI_Policy

Additionally, for cluster autoscaling support, ensure the IAM role has the custom policy below:

Custom CNI and Autoscaling Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "eksWorkerAutoscalingAll",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeLaunchTemplateVersions",
        "autoscaling:DescribeTags",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeAutoScalingGroups"
      ],
      "Resource": "*"
    },
    {
      "Sid": "eksWorkerAutoscalingOwn",
      "Effect": "Allow",
      "Action": [
        "autoscaling:UpdateAutoScalingGroup",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "autoscaling:SetDesiredCapacity"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "autoscaling:ResourceTag/k8s.io/cluster-autoscaler/enabled": [
            "true"
          ],
          "autoscaling:ResourceTag/kubernetes.io/cluster/<your-cluster-name>": [
            "owned"
          ]
        }
      }
    }
  ]
}
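One way to attach this policy is as an inline policy via `aws iam put-role-policy` (a sketch; the role name is a placeholder, and the `echo` makes this a dry run — remove it to apply):

```shell
# Save the JSON policy above as autoscaling-policy.json first.
ROLE_NAME="<node-iam-role-name>"       # placeholder: your node IAM role name
POLICY_FILE="autoscaling-policy.json"

# Dry run: prints the command. Remove "echo" to attach the inline policy.
echo aws iam put-role-policy \
  --role-name "$ROLE_NAME" \
  --policy-name eksWorkerAutoscaling \
  --policy-document "file://$POLICY_FILE"
```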

Minimum Node Requirements:

  • General nodes: 8 vCPU / 32 GB RAM (e.g., t3.2xlarge)
  • User/Worker nodes: 4 vCPU / 16 GB RAM (e.g., t3.xlarge)
  • Storage: 200 GB EBS volume per node

Creating Node Groups

Nebari requires three types of node groups for optimal operation:

  • general: Core Nebari services (JupyterHub, monitoring, databases)
  • user: User JupyterLab notebook servers
  • worker: Dask distributed computing workers

Skip this step if appropriate node groups already exist.

Creating EKS Node Groups

Follow AWS's guide to create managed node groups.

General Node Group:

  • Name: general
  • Node IAM Role: The IAM role with policies described above
  • Instance type: t3.2xlarge or similar (8 vCPU / 32 GB RAM)
  • Disk size: 200 GB
  • Scaling: Min 1, Max 3, Desired 1
  • Subnets: Include all EKS subnets

User Node Group:

  • Name: user
  • Instance type: t3.xlarge or similar (4 vCPU / 16 GB RAM)
  • Disk size: 200 GB
  • Scaling: Min 0, Max 10, Desired 1
  • Enable autoscaling: Yes

Worker Node Group:

  • Name: worker
  • Instance type: t3.xlarge or similar (4 vCPU / 16 GB RAM)
  • Disk size: 200 GB
  • Scaling: Min 0, Max 20, Desired 1
  • Enable autoscaling: Yes

Using AWS CLI:

# Create general node group
aws eks create-nodegroup \
  --cluster-name <cluster-name> \
  --nodegroup-name general \
  --node-role <iam-role-arn> \
  --subnets <subnet-id-1> <subnet-id-2> <subnet-id-3> \
  --scaling-config minSize=1,maxSize=3,desiredSize=1 \
  --instance-types t3.2xlarge \
  --disk-size 200 \
  --labels nodegroup=general

# Create user node group
aws eks create-nodegroup \
  --cluster-name <cluster-name> \
  --nodegroup-name user \
  --node-role <iam-role-arn> \
  --subnets <subnet-id-1> <subnet-id-2> <subnet-id-3> \
  --scaling-config minSize=0,maxSize=10,desiredSize=1 \
  --instance-types t3.xlarge \
  --disk-size 200 \
  --labels nodegroup=user

# Create worker node group
aws eks create-nodegroup \
  --cluster-name <cluster-name> \
  --nodegroup-name worker \
  --node-role <iam-role-arn> \
  --subnets <subnet-id-1> <subnet-id-2> <subnet-id-3> \
  --scaling-config minSize=0,maxSize=20,desiredSize=1 \
  --instance-types t3.xlarge \
  --disk-size 200 \
  --labels nodegroup=worker
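The create calls return before the node groups are ready. To block until each group is ACTIVE you can use `aws eks wait nodegroup-active`; shown here as a dry run (remove `echo` to actually wait, and substitute your cluster name):

```shell
# Wait for each node group to reach ACTIVE before deploying Nebari.
for ng in general user worker; do
  echo aws eks wait nodegroup-active \
    --cluster-name "<cluster-name>" \
    --nodegroup-name "$ng"
done
```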

Configuring kubectl Context

Ensure you're using your cluster's kubectl context. Verify with:

kubectl config current-context

If you need to switch contexts:

kubectl config use-context <context-name>

To list all available contexts:

kubectl config get-contexts

Deploying Nebari

Now you're ready to initialize and deploy Nebari on your existing cluster.

Initialize Nebari Configuration

Initialize Nebari using the existing provider:

nebari init existing \
  --project <project_name> \
  --domain <domain_name> \
  --auth-provider github

This creates a nebari-config.yaml file in your current directory.
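With `--auth-provider github`, Nebari needs credentials from a GitHub OAuth application. It can pick these up from environment variables exported before running `nebari init` (variable names as used in Nebari's GitHub auth setup; verify against your Nebari version, since it will otherwise prompt interactively):

```shell
# Placeholders: create a GitHub OAuth app and paste its credentials here.
export GITHUB_CLIENT_ID="<github-oauth-client-id>"
export GITHUB_CLIENT_SECRET="<github-oauth-client-secret>"
```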

Configure nebari-config.yaml

Update the configuration file with your EKS-specific settings. The key sections to modify are:

project_name: <project_name>
provider: existing
domain: <domain_name>

certificate:
  type: lets-encrypt
  acme_email: admin@example.com

security:
  authentication:
    type: GitHub
    config:
      client_id: <github-oauth-client-id>
      client_secret: <github-oauth-client-secret>
      oauth_callback_url: https://<domain_name>/hub/oauth_callback

local:
  # Set this to your EKS cluster context name
  kube_context: arn:aws:eks:<region>:<account-id>:cluster/<cluster-name>

  # Configure node selectors based on your node group labels
  node_selectors:
    general:
      key: eks.amazonaws.com/nodegroup
      value: general
    user:
      key: eks.amazonaws.com/nodegroup
      value: user
    worker:
      key: eks.amazonaws.com/nodegroup
      value: worker

profiles:
  jupyterlab:
    - display_name: Small Instance
      description: 2 CPU / 8 GB RAM
      default: true
      kubespawner_override:
        cpu_limit: 2
        cpu_guarantee: 1.5
        mem_limit: 8G
        mem_guarantee: 5G

    - display_name: Medium Instance
      description: 4 CPU / 16 GB RAM
      kubespawner_override:
        cpu_limit: 4
        cpu_guarantee: 3
        mem_limit: 16G
        mem_guarantee: 10G

  dask_worker:
    Small Worker:
      worker_cores_limit: 2
      worker_cores: 1.5
      worker_memory_limit: 8G
      worker_memory: 5G
      worker_threads: 2

    Medium Worker:
      worker_cores_limit: 4
      worker_cores: 3
      worker_memory_limit: 16G
      worker_memory: 10G
      worker_threads: 4
Complete example nebari-config.yaml for EKS:

project_name: my-nebari
provider: existing
domain: nebari.example.com

certificate:
  type: lets-encrypt
  acme_email: admin@example.com

security:
  authentication:
    type: GitHub
    config:
      client_id: your-github-client-id
      client_secret: your-github-client-secret
      oauth_callback_url: https://nebari.example.com/hub/oauth_callback

ci_cd:
  type: github-actions
  branch: main

terraform_state:
  type: remote

namespace: dev

local:
  kube_context: arn:aws:eks:us-west-2:123456789012:cluster/my-eks-cluster
  node_selectors:
    general:
      key: eks.amazonaws.com/nodegroup
      value: general
    user:
      key: eks.amazonaws.com/nodegroup
      value: user
    worker:
      key: eks.amazonaws.com/nodegroup
      value: worker

profiles:
  jupyterlab:
    - display_name: Small Instance
      description: 2 CPU / 8 GB RAM
      default: true
      kubespawner_override:
        cpu_limit: 2
        cpu_guarantee: 1.5
        mem_limit: 8G
        mem_guarantee: 5G
        image: quansight/nebari-jupyterlab:latest

    - display_name: Medium Instance
      description: 4 CPU / 16 GB RAM
      kubespawner_override:
        cpu_limit: 4
        cpu_guarantee: 3
        mem_limit: 16G
        mem_guarantee: 10G
        image: quansight/nebari-jupyterlab:latest

  dask_worker:
    Small Worker:
      worker_cores_limit: 2
      worker_cores: 1.5
      worker_memory_limit: 8G
      worker_memory: 5G
      worker_threads: 2
      image: quansight/nebari-dask-worker:latest

    Medium Worker:
      worker_cores_limit: 4
      worker_cores: 3
      worker_memory_limit: 16G
      worker_memory: 10G
      worker_threads: 4
      image: quansight/nebari-dask-worker:latest

environments:
  environment-default.yaml:
    name: default
    channels:
      - conda-forge
    dependencies:
      - python=3.11
      - ipykernel
      - ipywidgets

Deploy Nebari

Deploy Nebari to your EKS cluster:

nebari deploy --config nebari-config.yaml

When prompted, update your DNS records to point your domain to the cluster's load balancer. Nebari will provide the necessary DNS configuration details during deployment.

Important Configuration Notes

Understanding kube_context

The kube_context field in your nebari-config.yaml is critical—it tells Nebari which Kubernetes cluster to deploy to. This must exactly match a context name from your kubeconfig.

To find your context name:

kubectl config get-contexts

The output shows all available contexts. Use the value from the NAME column:

CURRENT   NAME                                                 CLUSTER             AUTHINFO
*         arn:aws:eks:us-west-2:123456789:cluster/my-cluster   arn:aws:eks:...     arn:aws:eks:...
          gke_my-project_us-central1_my-cluster                gke_my-project_...  gke_my-project_...
          my-aks-cluster                                       my-aks-cluster      clusterUser_...

Node Selectors

Node selectors ensure Nebari components are scheduled on the appropriate nodes:

  • general: Core services (JupyterHub, Prometheus, etc.) - require stable, always-on nodes
  • user: User notebook servers - benefit from autoscaling
  • worker: Dask workers - benefit from aggressive autoscaling for compute workloads

The node selector keys vary by provider:

  • AWS EKS: eks.amazonaws.com/nodegroup
  • Azure AKS: agentpool (default) or custom labels
  • GCP GKE: cloud.google.com/gke-nodepool (default) or custom labels
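For example, on GKE the node_selectors block from the EKS configuration above would change only in the label key (a sketch assuming the default pool label and pools named general/user/worker):

```yaml
# GKE example: same structure, different label key
node_selectors:
  general:
    key: cloud.google.com/gke-nodepool
    value: general
  user:
    key: cloud.google.com/gke-nodepool
    value: user
  worker:
    key: cloud.google.com/gke-nodepool
    value: worker
```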

You can verify node labels with:

kubectl get nodes --show-labels

Verifying the Deployment

After deployment completes:

  1. Check pods are running:

    kubectl get pods -A
  2. Verify ingress is configured:

    kubectl get ingress -A
  3. Check services:

    kubectl get svc -A
  4. Access Nebari: Navigate to https://<your-domain> in your browser
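Once DNS has propagated, a quick reachability check from your workstation (placeholder domain; `-k` tolerates the certificate while Let's Encrypt is still issuing; shown as a dry run, remove `echo` to send the request):

```shell
DOMAIN="<your-domain>"   # placeholder: your Nebari domain
# Dry run: prints the command. Remove "echo" once DNS resolves.
echo curl -kI "https://$DOMAIN"
```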

Troubleshooting

Pods Stuck in Pending

If pods remain in Pending state:

kubectl describe pod <pod-name> -n <namespace>

Common causes:

  • Node selector mismatch: Labels in nebari-config.yaml don't match actual node labels
  • Insufficient resources: Nodes don't have enough CPU/memory
  • No nodes available: Node group/pool hasn't scaled up yet
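Two follow-up commands that usually pinpoint which of these causes applies (namespace is a placeholder; shown as dry runs, remove `echo` to run them against your cluster):

```shell
NS="<namespace>"   # placeholder: the Nebari namespace
# Recent scheduler events, newest last:
echo kubectl get events -n "$NS" --sort-by=.lastTimestamp
# Compare actual node labels against the selectors in nebari-config.yaml:
echo kubectl get nodes --show-labels
```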

Authentication Issues

If you can't log in to Nebari:

  1. Verify OAuth application credentials in your nebari-config.yaml
  2. Check the callback URL matches exactly: https://<domain>/hub/oauth_callback
  3. Review JupyterHub logs:
    kubectl logs -n <namespace> deployment/hub -f

LoadBalancer Service Pending

If the LoadBalancer service stays in Pending:

AWS EKS:

  • Verify subnets are tagged correctly for load balancer provisioning
  • Check AWS Load Balancer Controller is installed

Azure AKS:

  • Ensure the AKS cluster has permissions to create load balancers
  • Check resource group has available quota

GCP GKE:

  • Verify HTTP(S) Load Balancing is enabled on the cluster
  • Check firewall rules allow traffic on port 443
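On any of these providers, the service's event log usually states exactly why provisioning failed. Placeholders below; shown as a dry run, remove `echo` to run:

```shell
NS="<namespace>"              # placeholder
SVC="<loadbalancer-service>"  # placeholder: the pending LoadBalancer service
# The Events section at the bottom of the output reports provisioning errors.
echo kubectl describe svc -n "$NS" "$SVC"
```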

Next Steps

Additional Resources