
Deploy Nebari on Bare Metal with K3s

This guide walks you through deploying Nebari on bare metal infrastructure using nebari-k3s, an Ansible-based solution that sets up a production-ready K3s cluster with KubeVIP and MetalLB.

Overview

The nebari-k3s project provides Ansible playbooks to:

  • Deploy a lightweight K3s Kubernetes cluster on bare metal servers
  • Configure KubeVIP for high-availability control plane
  • Set up MetalLB for load balancing
  • Prepare the cluster for Nebari deployment

This approach is ideal for:

  • On-premises deployments
  • Organizations with existing bare metal infrastructure
  • HPC environments transitioning from traditional batch systems
  • Cost-sensitive deployments requiring full hardware control
:::info

This solution replaces the deprecated nebari-slurm project, providing a modern, Kubernetes-based alternative for bare metal deployments.

:::

Prerequisites

Infrastructure Requirements

  • Minimum 3 bare metal servers (recommended for HA):

    • Control plane nodes: 8 vCPU / 32 GB RAM minimum
    • Worker nodes: 4 vCPU / 16 GB RAM minimum per node
    • 200 GB disk space per node
  • Network requirements:

    • All nodes on the same subnet
    • Static IP addresses assigned to each node
    • SSH access to all nodes
    • IP range reserved for MetalLB load balancer
    • Virtual IP address for the Kubernetes API server
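Before provisioning, it can help to sanity-check the addressing plan. The sketch below is an illustration (not part of nebari-k3s; the subnet and IPs are the example values used throughout this guide): it uses Python's stdlib `ipaddress` module to confirm the API virtual IP and the MetalLB range fall inside the node subnet and don't collide with node addresses.

```python
import ipaddress

# Assumed example addressing plan; substitute your own values.
subnet = ipaddress.ip_network("192.168.1.0/24")
node_ips = [ipaddress.ip_address(f"192.168.1.{i}") for i in range(101, 107)]
api_vip = ipaddress.ip_address("192.168.1.100")
lb_start = ipaddress.ip_address("192.168.1.200")
lb_end = ipaddress.ip_address("192.168.1.220")

# The virtual IP and every MetalLB IP must live in the node subnet
assert api_vip in subnet
assert lb_start in subnet and lb_end in subnet

# The MetalLB range must not overlap node IPs or the virtual IP
reserved = set(node_ips) | {api_vip}
lb_range = {ipaddress.ip_address(ip) for ip in range(int(lb_start), int(lb_end) + 1)}
assert not (lb_range & reserved)

print(f"{len(lb_range)} LoadBalancer IPs reserved")  # → 21 LoadBalancer IPs reserved
```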

Software Requirements

On your local machine (where you'll run Ansible):

  • Python 3.8+
  • Ansible 2.10+
  • kubectl
  • SSH key access to all nodes

On bare metal nodes:

  • Ubuntu 20.04+ or compatible Linux distribution
  • Passwordless sudo access for the SSH user
:::note[Running Ansible]

Ansible requires a Linux/Unix environment. If your workstation runs Windows:

  • Use WSL2 (Windows Subsystem for Linux)
  • Deploy from one of your Linux nodes (e.g., the first control plane node)
  • Use a Linux VM or container

The deployment examples below assume you're running from a Linux environment with direct SSH access to all cluster nodes.

:::

Step 1: Clone nebari-k3s Repository

```bash
git clone https://github.com/nebari-dev/nebari-k3s.git
cd nebari-k3s
```

Step 2: Configure Inventory

Create an Ansible inventory file describing your cluster:

```yaml
# inventory.yml
all:
  vars:
    ansible_user: ubuntu
    ansible_ssh_private_key_file: ~/.ssh/id_rsa

    # K3s configuration
    k3s_version: v1.28.5+k3s1
    apiserver_endpoint: "192.168.1.100"  # Virtual IP for API server

    # KubeVIP configuration
    kube_vip_tag_version: "v0.7.0"
    kube_vip_interface: "ens5"  # Network interface for VIP (default: ens5)
    kube_vip_lb_ip_range: "192.168.1.200-192.168.1.220"  # IPs for services

    # MetalLB configuration
    metal_lb_ip_range:
      - "192.168.1.200-192.168.1.220"

  children:
    master:
      hosts:
        node1:
          ansible_host: 192.168.1.101
        node2:
          ansible_host: 192.168.1.102
        node3:
          ansible_host: 192.168.1.103

    node:
      hosts:
        node4:
          ansible_host: 192.168.1.104
        node5:
          ansible_host: 192.168.1.105
        node6:
          ansible_host: 192.168.1.106

    k3s_cluster:
      children:
        master:
        node:
```

Advanced Configuration with Custom Data Directory

For production deployments, especially when using dedicated storage volumes, configure K3s to use a custom data directory. This is particularly important when:

  • You have multiple disks (OS disk and separate data disk)
  • You want to use high-performance storage for Kubernetes data
  • You need to manage disk space separately for system and application data

Create or update your `group_vars/all.yaml`:

```yaml
---
# K3s version to install
# Check https://github.com/k3s-io/k3s/releases for available versions
k3s_version: v1.30.2+k3s2

# Ansible connection user (must have passwordless sudo on all nodes)
ansible_user: ubuntu

# Network interface used by the flannel CNI for pod networking
# Run 'ip addr show' on your nodes to find the correct interface
flannel_iface: ens192

# ============ KubeVIP Configuration ============
# KubeVIP provides a virtual IP for the Kubernetes API server (HA)

# Enable ARP broadcasts for the virtual IP
kube_vip_arp: true

# Network interface where the virtual IP will be configured
# Must match the interface with connectivity to the other nodes
kube_vip_interface: ens192

# KubeVIP container image version
kube_vip_tag_version: v0.8.2

# Virtual IP address for the Kubernetes API server
# This IP must be:
# - In the same subnet as your nodes
# - Not currently in use by any other device
# - Accessible from all nodes
apiserver_endpoint: 192.168.1.100

# ============ Cluster Security ============
# Shared secret token used by K3s cluster nodes to authenticate
# IMPORTANT: Must be alphanumeric only (no special characters)
# Generate a secure random token: openssl rand -hex 20
k3s_token: ChangeMe0123456789abcdef

# ============ K3s Server Arguments ============
# Additional arguments passed to K3s server nodes (control plane)
extra_server_args: >-
  --tls-san {{ apiserver_endpoint }}
  --disable servicelb
  --disable traefik
  --write-kubeconfig-mode 644
  --flannel-iface={{ flannel_iface }}
  --data-dir /mnt/k3s-data

# --tls-san: Add the virtual IP to the API server TLS certificate
# --disable servicelb: Disable the built-in load balancer (we use MetalLB)
# --disable traefik: Disable the built-in ingress (Nebari installs its own)
# --write-kubeconfig-mode 644: Make the kubeconfig readable
# --flannel-iface: Network interface for pod networking
# --data-dir: Custom location for K3s data (optional, see Step 2.1)

# ============ K3s Agent Arguments ============
# Additional arguments passed to K3s agent nodes (workers)
extra_agent_args: >-
  --flannel-iface={{ flannel_iface }}
  --data-dir /mnt/k3s-data

# ============ MetalLB Configuration ============
# MetalLB provides LoadBalancer services on bare metal

# MetalLB type: 'native' (recommended) or 'frr'
metal_lb_type: native

# MetalLB mode: 'layer2' (simple ARP-based) or 'bgp' (requires a BGP router)
metal_lb_mode: layer2

# MetalLB speaker image version
metal_lb_speaker_tag_version: v0.14.8

# MetalLB controller image version
metal_lb_controller_tag_version: v0.14.8

# IP address range for LoadBalancer services
# Can be a string or a list: "192.168.1.200-192.168.1.220" or ["192.168.1.200-192.168.1.220"]
# These IPs will be assigned to Nebari's ingress and other LoadBalancer services
# Requirements:
# - Must be in the same subnet as your nodes
# - Must not overlap with DHCP ranges or other static IPs
# - Reserve enough IPs for all services (typically 5-10 is sufficient)
metal_lb_ip_range: 192.168.1.200-192.168.1.220
```
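Since the token must be strictly alphanumeric, a quick way to generate and check one is Python's `secrets` module; `openssl rand -hex 20` produces an equivalent hex string. This is an illustrative helper, not part of the playbooks:

```python
import secrets

def make_k3s_token(nbytes: int = 20) -> str:
    """Generate a random hex token, which is alphanumeric by construction."""
    token = secrets.token_hex(nbytes)
    assert token.isalnum(), "K3s tokens must contain only letters and digits"
    return token

token = make_k3s_token()
print(len(token))  # → 40 (two hex characters per random byte)
```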

Variable Reference Summary

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| `k3s_version` | Yes | - | K3s version to install |
| `ansible_user` | Yes | - | SSH user with sudo access |
| `flannel_iface` | Yes | - | Network interface for pod networking |
| `kube_vip_interface` | Yes | - | Network interface for the virtual IP |
| `kube_vip_tag_version` | No | v0.8.2 | KubeVIP image version |
| `kube_vip_arp` | No | true | Enable ARP for the virtual IP |
| `apiserver_endpoint` | Yes | - | Virtual IP for the Kubernetes API |
| `k3s_token` | Yes | - | Cluster authentication token (alphanumeric) |
| `extra_server_args` | No | - | Additional K3s server arguments |
| `extra_agent_args` | No | - | Additional K3s agent arguments |
| `metal_lb_type` | No | native | MetalLB implementation type |
| `metal_lb_mode` | No | layer2 | MetalLB operating mode |
| `metal_lb_speaker_tag_version` | No | v0.14.8 | MetalLB speaker image version |
| `metal_lb_controller_tag_version` | No | v0.14.8 | MetalLB controller image version |
| `metal_lb_ip_range` | Yes | - | IP range for LoadBalancer services |

:::warning[Important: Custom Data Directory]
If you specify `--data-dir /mnt/k3s-data`, you **must** ensure this directory exists and is properly mounted on **all** nodes before running the Ansible playbook. See Step 2.1 below.
:::

### Step 2.1: Prepare Storage (Required for Custom Data Directory)

If you're using a custom data directory with dedicated storage volumes, prepare them on each node:

#### For worker nodes with separate data disks:

```bash
# On each node, identify the data disk
lsblk

# Format the disk (example: /dev/sdb - verify your disk name!)
sudo mkfs.ext4 /dev/sdb

# Create mount point
sudo mkdir -p /mnt/k3s-data

# Add to fstab for persistence
echo '/dev/sdb /mnt/k3s-data ext4 defaults 0 0' | sudo tee -a /etc/fstab

# Mount the disk
sudo mount -a

# Verify
df -h /mnt/k3s-data
```

#### For control plane nodes with large storage requirements (using LVM):

If your control plane node needs flexible storage management (e.g., for backups, persistent volumes):

```bash
# Check available volume groups
sudo vgs

# Create a logical volume (example: 1.4 TB from an existing volume group)
sudo lvcreate -L 1400G -n k3s-data ubuntu-vg

# Format with XFS for better performance with large files
sudo mkfs.xfs /dev/ubuntu-vg/k3s-data

# Create the mount point
sudo mkdir -p /mnt/k3s-data

# Add to fstab using the UUID for reliability
UUID=$(sudo blkid -s UUID -o value /dev/ubuntu-vg/k3s-data)
echo "UUID=$UUID /mnt/k3s-data xfs defaults 0 2" | sudo tee -a /etc/fstab

# Mount
sudo mount -a

# Verify
df -h /mnt/k3s-data
lsblk
```
:::tip[Storage Recommendations]

  • XFS: better for large files and high-I/O workloads (recommended for nodes hosting databases or large datasets)
  • ext4: general purpose; a good default for most workloads
  • Leave space for expansion: don't allocate 100% of available storage, so volumes can grow later
  • Consistent paths: use the same mount point (/mnt/k3s-data) on all nodes

:::

Step 2.2: Verify Network Interfaces

Ensure you're using the correct network interface names in your configuration:

```bash
# On each node, list network interfaces
ip addr show
```

Common interface names:

  • ens192, ens160 (VMware)
  • eth0, eth1 (AWS, some bare metal)
  • eno1, eno2 (Dell and HP servers)

Update flannel_iface and kube_vip_interface in your group_vars/all.yaml to match your actual interface names.
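If you'd rather script this check, Python's stdlib `socket.if_nameindex()` (Linux) lists the kernel's interface names; any value you put in `flannel_iface` or `kube_vip_interface` should appear in this list. A minimal sketch, with the expected interface name assumed from the example config:

```python
import socket

# Enumerate (index, name) pairs for all network interfaces on this host
interfaces = [name for _, name in socket.if_nameindex()]
print(interfaces)  # e.g. ['lo', 'ens192'] -- names vary per machine

# Interfaces referenced in group_vars/all.yaml must exist on every node
for wanted in ("ens192",):  # assumed example value; use your own
    if wanted not in interfaces:
        print(f"warning: interface {wanted!r} not found on this host")
```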

Step 3: Run Ansible Playbook

Deploy the K3s cluster:

```bash
ansible-playbook -i inventory.yml playbook.yaml
```

This will:

  1. Install K3s on all nodes
  2. Configure the control plane with high availability
  3. Deploy KubeVIP for API server load balancing
  4. Install and configure MetalLB for service load balancing
  5. Set up proper node labels and taints
:::warning[Known Issue: Multi-Master Join]

There is a known issue in nebari-k3s where additional master nodes may fail to join the cluster because the IP-filtering task returns multiple IPs. If you encounter this:

  1. Check that the additional master nodes are running K3s:

     ssh user@node2 "sudo systemctl status k3s"
  2. Verify they can reach the first master node:

     ssh user@node2 "curl -k https://192.168.1.101:6443/ping"
  3. If a node is running but has not joined, manually re-run the join command on that node, or investigate the Ansible task that filters the flannel interface IP.

:::

Step 4: Sync Kubeconfig

After the playbook completes, sync the kubeconfig to your local machine:

```bash
# Set environment variables
export SSH_USER="root"              # Default: root (change if using a different user)
export SSH_HOST="192.168.1.101"     # IP of any master node
export SSH_KEY_FILE=~/.ssh/id_rsa   # Unquoted so the shell expands ~

# Sync kubeconfig
make kubeconfig-sync
```

Verify cluster access:

```bash
kubectl get nodes -o wide
```

You should see all of your nodes in the Ready state.
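For scripted health checks, you can parse the `kubectl get nodes` output and fail if any node is not Ready. The sketch below is illustrative; the sample output is an assumption (in practice you would capture the real output, e.g. with `subprocess`):

```python
# Sample `kubectl get nodes` output; in practice capture it from the real command.
sample = """\
NAME    STATUS     ROLES                       AGE   VERSION
node1   Ready      control-plane,etcd,master   10m   v1.30.2+k3s2
node2   Ready      control-plane,etcd,master   9m    v1.30.2+k3s2
node4   NotReady   <none>                      8m    v1.30.2+k3s2
"""

def not_ready_nodes(output: str) -> list[str]:
    """Return names of nodes whose STATUS column is not exactly 'Ready'."""
    lines = output.strip().splitlines()[1:]  # skip the header row
    return [line.split()[0] for line in lines if line.split()[1] != "Ready"]

print(not_ready_nodes(sample))  # → ['node4']
```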

Step 5: Label Nodes for Nebari

Nebari requires specific node labels for scheduling workloads. For optimal resource utilization and proper workload distribution, use the recommended node-role.nebari.io/group label:

```bash
# Label control plane/general nodes
kubectl label nodes node1 node2 node3 \
  node-role.nebari.io/group=general

# Label user workload nodes
kubectl label nodes node4 \
  node-role.nebari.io/group=user

# Label Dask worker nodes
kubectl label nodes node5 node6 \
  node-role.nebari.io/group=worker
```
:::tip[Node Labeling Best Practices]

  • Consistent labeling: using node-role.nebari.io/group as the label key ensures consistent behavior across all Nebari components
  • Multiple roles: a node can carry multiple roles if needed (e.g., both user and worker on the same node)
  • Control plane nodes: typically labeled general to host core Nebari services
  • Resource optimization: proper labeling lets Horizontal Pod Autoscaling (HPA) fully utilize your cluster resources

:::

Alternative labeling schemes (legacy):

```bash
# These also work but are not recommended for new deployments
kubectl label nodes node1 node-role.kubernetes.io/general=true
```

Verify your labels:

```bash
kubectl get nodes --show-labels
```
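If you have many nodes, it can be convenient to generate the label commands from a mapping. This helper is an illustration; the node-to-group mapping mirrors the six-node example cluster above, not your real one:

```python
# Map Nebari scheduling groups to node names (assumed example layout)
groups = {
    "general": ["node1", "node2", "node3"],
    "user": ["node4"],
    "worker": ["node5", "node6"],
}

LABEL_KEY = "node-role.nebari.io/group"

def label_commands(groups: dict[str, list[str]]) -> list[str]:
    """Build one `kubectl label` command per Nebari group."""
    return [
        f"kubectl label nodes {' '.join(nodes)} {LABEL_KEY}={group}"
        for group, nodes in groups.items()
    ]

for cmd in label_commands(groups):
    print(cmd)
```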

Step 6: Initialize Nebari Configuration

Now initialize Nebari for deployment on your existing cluster:

```bash
nebari init existing \
  --project my-nebari \
  --domain nebari.example.com \
  --auth-provider github
```

Step 7: Configure Nebari for Bare Metal

Edit the generated nebari-config.yaml to configure it for your K3s cluster:

```yaml
project_name: my-nebari
provider: existing
domain: nebari.example.com

certificate:
  type: lets-encrypt
  acme_email: admin@example.com
  acme_server: https://acme-v02.api.letsencrypt.org/directory

security:
  authentication:
    type: GitHub
    config:
      client_id: <github-oauth-app-client-id>
      client_secret: <github-oauth-app-client-secret>
      oauth_callback_url: https://nebari.example.com/hub/oauth_callback

local:
  # Specify the kubectl context name from your kubeconfig
  kube_context: "default"  # Or the context name from your K3s cluster

  # Configure node selectors to match your labeled nodes
  node_selectors:
    general:
      key: node-role.nebari.io/group
      value: general
    user:
      key: node-role.nebari.io/group
      value: user
    worker:
      key: node-role.nebari.io/group
      value: worker

# Configure default profiles
profiles:
  jupyterlab:
    - display_name: Small Instance
      description: 2 CPU / 8 GB RAM
      default: true
      kubespawner_override:
        cpu_limit: 2
        cpu_guarantee: 1.5
        mem_limit: 8G
        mem_guarantee: 5G

    - display_name: Medium Instance
      description: 4 CPU / 16 GB RAM
      kubespawner_override:
        cpu_limit: 4
        cpu_guarantee: 3
        mem_limit: 16G
        mem_guarantee: 10G

  dask_worker:
    Small Worker:
      worker_cores_limit: 2
      worker_cores: 1.5
      worker_memory_limit: 8G
      worker_memory: 5G
      worker_threads: 2

    Medium Worker:
      worker_cores_limit: 4
      worker_cores: 3
      worker_memory_limit: 16G
      worker_memory: 10G
      worker_threads: 4

# Optional: Configure storage class
# default_storage_class: local-path  # K3s default storage class
```

Important Configuration Notes

Kubernetes Context

The kube_context field must match a context name in your kubeconfig. To list the available contexts:

```bash
kubectl config get-contexts
```

Use the name from the NAME column of the output.

Node Selectors

Node selectors tell Nebari where to schedule different types of workloads:

  • general: Core Nebari services (JupyterHub, monitoring, etc.)
  • user: User JupyterLab pods
  • worker: Dask worker pods for distributed computing

Make sure the key and value match the labels you applied to your nodes in Step 5.
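A selector/label mismatch is the most common cause of Pending pods, so it's worth checking mechanically. The sketch below is illustrative (the label data is hard-coded rather than fetched from the cluster): it verifies that every node_selectors entry is satisfied by at least one labeled node.

```python
# node_selectors as configured in nebari-config.yaml (example values)
node_selectors = {
    "general": ("node-role.nebari.io/group", "general"),
    "user": ("node-role.nebari.io/group", "user"),
    "worker": ("node-role.nebari.io/group", "worker"),
}

# Labels per node, as you would see from `kubectl get nodes --show-labels`
node_labels = {
    "node1": {"node-role.nebari.io/group": "general"},
    "node4": {"node-role.nebari.io/group": "user"},
    "node5": {"node-role.nebari.io/group": "worker"},
}

def unmatched_selectors(selectors, labels):
    """Return selector groups that no node currently satisfies."""
    return [
        group
        for group, (key, value) in selectors.items()
        if not any(nl.get(key) == value for nl in labels.values())
    ]

print(unmatched_selectors(node_selectors, node_labels))  # → []
```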

Step 8: Deploy Nebari

Deploy Nebari to your K3s cluster:

```bash
nebari deploy --config nebari-config.yaml
```

During deployment, you'll be prompted to update your DNS records. Add an A record pointing your domain to one of the MetalLB IP addresses.

Step 9: Verify Deployment

Once deployment completes, verify all components are running:

```bash
kubectl get pods -A
kubectl get ingress -A
```

Access Nebari at https://nebari.example.com and log in with your configured authentication provider.

Troubleshooting

Pods Not Scheduling

If pods remain in Pending state:

```bash
kubectl describe pod <pod-name> -n <namespace>
```

Common issues:

  • Node selector mismatch: Verify labels match between nebari-config.yaml and actual node labels
  • Insufficient resources: Ensure nodes have enough CPU/memory available
  • Taints: Check if nodes have taints that prevent scheduling

LoadBalancer Services Pending

If services of type LoadBalancer remain in Pending state:

```bash
kubectl get svc -A | grep LoadBalancer
```

Verify MetalLB is running:

```bash
kubectl get pods -n metallb-system
```

Check MetalLB configuration:

```bash
kubectl get ipaddresspool -n metallb-system
kubectl get l2advertisement -n metallb-system
```

API Server Unreachable

If you cannot connect to the cluster:

  1. Verify KubeVIP is running on control plane nodes:

    ssh ubuntu@192.168.1.101 "sudo k3s kubectl get pods -n kube-system | grep kube-vip"
  2. Check if the virtual IP is responding:

    ping 192.168.1.100
  3. Verify the network interface is correct in your inventory configuration

Storage Considerations

K3s includes a default local-path storage provisioner that works well for development. For production:

  • Local storage: K3s local-path provisioner (default)
  • Network storage: Configure NFS, Ceph, or other storage classes
  • Cloud storage: If running in a hybrid environment, configure cloud CSI drivers

Example NFS storage class configuration:

```yaml
# Add to nebari-config.yaml under theme.jupyterhub
storage_class_name: nfs-client
```


Migrating Existing User Data

If you're migrating from an existing system (e.g., Slurm cluster), you can pre-populate user data:

  1. Copy data to the storage node (typically a control plane node with large storage):

    # From old system to new K3s storage
    rsync -avhP -e ssh /old/home/ user@k3s-node:/mnt/k3s-data/backup/home/
  2. Note about user IDs: User IDs in JupyterHub pods may differ from your existing system. After Nebari deployment:

    • Check the UID used by JupyterHub: kubectl exec -it jupyter-<username> -- id
    • Adjust file ownership if needed:
      # On the storage node
      sudo chown -R <jupyter-uid>:<jupyter-gid> /mnt/k3s-data/backup/home/<username>
  3. Create persistent volume for user data (if using custom storage):

    ```yaml
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: user-data-pv
    spec:
      capacity:
        storage: 1000Gi
      accessModes:
        - ReadWriteMany
      hostPath:
        path: /mnt/k3s-data/users
    ```
:::tip[User Data Best Practices]

  • Test the data migration with a single user first
  • Verify that file permissions match the JupyterHub pod UIDs
  • Consider using NFS or similar for multi-node access to user data
  • Keep backups of the original data during migration

:::
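Before running a recursive chown, a dry run that lists mismatched files is safer. This sketch is illustrative (the path and UID in the commented-out example are assumptions): it reports files under a directory whose owner differs from the expected JupyterHub UID, without changing anything.

```python
import os

def files_with_wrong_owner(root: str, expected_uid: int) -> list[str]:
    """Walk `root` and return paths whose owner UID differs from expected_uid."""
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_uid != expected_uid:
                mismatched.append(path)
    return mismatched

# Example dry run: check migrated home directories against the pod UID
# (assumed path and UID -- substitute the values from your own cluster):
# for path in files_with_wrong_owner("/mnt/k3s-data/backup/home", 1000):
#     print(path)
```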

Scaling Your Cluster

Adding Worker Nodes

  1. Add new nodes to your Ansible inventory
  2. Run the playbook targeting only new nodes:
    ansible-playbook -i inventory.yml playbook.yaml --limit new-node
  3. Label the new nodes for Nebari workloads

Upgrading K3s

To upgrade your K3s cluster:

  1. Update k3s_version in your inventory
  2. Run the playbook:
    ansible-playbook -i inventory.yml playbook.yaml
:::warning

Test upgrades in a non-production environment first, and always back up your data before upgrading.

:::

Next Steps

Additional Resources