
RKE2 Cluster Setup and Rancher Installation

Environment

Hostname   IP Address     Role
Alma1      192.168.2.44   Control plane
Alma2      192.168.2.45   Worker node 1
Alma3      192.168.2.46   Worker node 2
  • OS: Alma Linux (provisioned via Proxmox cloud-init template)
  • Kubernetes Distribution: RKE2
  • Rancher: Installed via Helm with self-signed certificates
  • DNS: Pihole (internal)

RKE2 was installed on all 3 nodes via the RKE2 quickstart installer prior to the steps in this document. The steps below cover all post-install configuration required to get a fully working cluster with Rancher.


Part 1 — Control Plane Configuration

1.1 — Verify Firewall State

The Proxmox cloud-init template used to provision these VMs does not install firewalld. Verify this on each node before proceeding:

systemctl status firewalld

Expected output: Unit firewalld.service could not be found.

No firewall configuration is required. If firewalld is present and active in your environment, refer to the RKE2 networking requirements for required ports.
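For environments where firewalld is active, the commands below are a sketch of the required openings. The port list follows the RKE2 networking requirements; 8472/udp assumes the default Canal (VXLAN) CNI, so adjust if you run a different CNI:

```shell
# Run on every node where firewalld is active (skip entirely if it is absent).
firewall-cmd --permanent --add-port=9345/tcp          # RKE2 supervisor API (node join)
firewall-cmd --permanent --add-port=6443/tcp          # Kubernetes API server
firewall-cmd --permanent --add-port=10250/tcp         # kubelet
firewall-cmd --permanent --add-port=2379-2380/tcp     # etcd client/peer (server nodes only)
firewall-cmd --permanent --add-port=8472/udp          # Canal VXLAN overlay
firewall-cmd --permanent --add-port=30000-32767/tcp   # NodePort range
firewall-cmd --reload
```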


1.2 — Create the Control Plane Config File

Create the RKE2 configuration directory and config file on Alma1:

mkdir -p /etc/rancher/rke2

cat > /etc/rancher/rke2/config.yaml << 'EOF'
tls-san:
  - 192.168.2.44
node-ip: 192.168.2.44
EOF

⚠️ Important: The node-ip directive is required on Alma Linux. Without it, RKE2 may auto-detect the pod network interface (10.42.0.x) instead of your LAN interface, causing the kubelet to become unreachable and resulting in repeated 502 errors in the journal. Always explicitly set node-ip to the node's LAN IP on each node.
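One way to confirm the kubelet registered with the LAN address rather than the pod network is to inspect the node's INTERNAL-IP. The helper below is a sketch that flags addresses falling in 10.42.0.0/16, RKE2's default pod CIDR (the live `kubectl` invocation in the comment assumes a running cluster):

```shell
# Returns 0 if the given address sits in RKE2's default pod CIDR
# (10.42.0.0/16), i.e. the node registered on the wrong interface.
in_pod_cidr() {
  case "$1" in
    10.42.*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Live check on a running cluster:
#   ip=$(kubectl get node alma1 -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}')
#   in_pod_cidr "$ip" && echo "WARNING: node registered on pod network" || echo "OK: $ip"

# Demonstration against sample addresses:
in_pod_cidr 10.42.0.5 && echo "10.42.0.5: pod network (misconfigured)"
in_pod_cidr 192.168.2.44 || echo "192.168.2.44: LAN (correct)"
```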


1.3 — Enable and Start the RKE2 Server Service

systemctl enable rke2-server
systemctl start rke2-server

Monitor startup progress:

journalctl -u rke2-server -f

Wait until you see:

rke2 is up and running

Then press Ctrl+C to exit. Verify the service is running:

systemctl status rke2-server

1.4 — Configure kubectl

RKE2 ships its own kubectl binary and kubeconfig file, but neither is on the PATH by default. Add them permanently:

cat >> /etc/profile.d/rke2.sh << 'EOF'
export PATH=$PATH:/var/lib/rancher/rke2/bin
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
EOF

chmod +x /etc/profile.d/rke2.sh
source /etc/profile.d/rke2.sh

Verify kubectl works:

kubectl get nodes

You should see alma1 with status Ready.


1.5 — Retrieve the Node Join Token

Worker nodes require this token to join the cluster. Run on Alma1 and copy the full output:

cat /var/lib/rancher/rke2/server/node-token

Part 2 — Worker Node Configuration

Repeat the following steps on Alma2 and Alma3, substituting the correct IP for each node.

2.1 — Create the Worker Config File

On Alma2 (192.168.2.45):

mkdir -p /etc/rancher/rke2

cat > /etc/rancher/rke2/config.yaml << 'EOF'
server: https://192.168.2.44:9345
token: <NODE_TOKEN>
node-ip: 192.168.2.45
EOF

On Alma3 (192.168.2.46):

mkdir -p /etc/rancher/rke2

cat > /etc/rancher/rke2/config.yaml << 'EOF'
server: https://192.168.2.44:9345
token: <NODE_TOKEN>
node-ip: 192.168.2.46
EOF

Replace <NODE_TOKEN> with the full token string retrieved in Step 1.5.

⚠️ Important: As with the control plane, node-ip must be explicitly set on each worker. Omitting it will cause the kubelet to bind to the pod network interface, resulting in NodeStatusUnknown and the node never reaching Ready state.
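Since the two worker configs above differ only in node-ip, a small helper can stamp them out consistently. This is a sketch; the function name and the demonstration into a temp directory are illustrative, and on a real worker you would target /etc/rancher/rke2 with the real token:

```shell
# write_agent_config DIR SERVER_IP NODE_TOKEN NODE_IP
# Writes an RKE2 agent config.yaml into DIR.
write_agent_config() {
  dir=$1; server=$2; token=$3; nodeip=$4
  mkdir -p "$dir"
  cat > "$dir/config.yaml" << EOF
server: https://${server}:9345
token: ${token}
node-ip: ${nodeip}
EOF
}

# On a real worker (e.g. Alma2):
#   write_agent_config /etc/rancher/rke2 192.168.2.44 "$NODE_TOKEN" 192.168.2.45

# Demonstration into a temp dir with a placeholder token:
demo=$(mktemp -d)
write_agent_config "$demo" 192.168.2.44 EXAMPLE_TOKEN 192.168.2.45
cat "$demo/config.yaml"
```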


2.2 — Enable and Start the RKE2 Agent Service

Run on each worker node:

systemctl enable rke2-agent
systemctl start rke2-agent

Monitor the join process:

journalctl -u rke2-agent -f

Wait for log activity to stabilize, then press Ctrl+C.


Part 3 — Verify Cluster Health

Run the following on Alma1:

kubectl get nodes -o wide

Expected output — all 3 nodes with status Ready:

NAME    STATUS   ROLES                AGE   VERSION
alma1   Ready    control-plane,etcd   Xm    v1.34.6+rke2r3
alma2   Ready    <none>               Xm    v1.34.6+rke2r3
alma3   Ready    <none>               Xm    v1.34.6+rke2r3

Workers may take 1-2 minutes after the agent starts before transitioning to Ready.
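Rather than re-running kubectl get nodes by hand, the wait can be scripted. Below is a sketch: the parsing helper counts Ready nodes from the plain `kubectl get nodes --no-headers` output, and the commented `kubectl wait` alternative blocks until every node is Ready (the 300s timeout is an arbitrary choice):

```shell
# Count nodes whose STATUS column reads exactly "Ready".
count_ready() {
  awk '$2 == "Ready" { n++ } END { print n+0 }'
}

# Live usage on Alma1:
#   kubectl get nodes --no-headers | count_ready
# or block until all nodes are Ready:
#   kubectl wait --for=condition=Ready node --all --timeout=300s

# Demonstration on canned output (one worker still joining):
sample='alma1   Ready      control-plane,etcd   5m    v1.34.6+rke2r3
alma2   Ready      <none>               2m    v1.34.6+rke2r3
alma3   NotReady   <none>               1m    v1.34.6+rke2r3'
printf '%s\n' "$sample" | count_ready   # → 2
```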


Part 4 — Install Rancher

All commands in this section are run on Alma1 as root.

4.1 — Add Pihole DNS Record

Before installing Rancher, add an A record in Pihole:

rancher.home -> 192.168.2.44

Verify resolution from your workstation:

ping rancher.home
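If ping resolves but you want to confirm the record itself, dig gives a cleaner check. A sketch; the DNS server address below is a placeholder for your own Pihole IP:

```shell
# Query the A record directly against Pihole (substitute your Pihole's IP).
dig +short rancher.home @<PIHOLE_IP>
# Expect the answer: 192.168.2.44
```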

4.2 — Install Helm

curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

Verify:

helm version

4.3 — Add Required Helm Repositories

helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
helm repo add jetstack https://charts.jetstack.io
helm repo update

4.4 — Install cert-manager

Rancher requires cert-manager to manage its self-signed certificate lifecycle.

Install cert-manager CRDs:

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.16.3/cert-manager.crds.yaml

Create namespace and install via Helm:

kubectl create namespace cert-manager

helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v1.16.3

Verify all pods are running before proceeding:

kubectl get pods --namespace cert-manager

Expected output:

NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-xxxx                          1/1     Running   0          Xm
cert-manager-cainjector-xxxx               1/1     Running   0          Xm
cert-manager-webhook-xxxx                  1/1     Running   0          Xm

Do not proceed to the next step until all 3 cert-manager pods are Running. The webhook must be fully ready before Rancher installs.
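The readiness gate can be scripted instead of polling by hand. A sketch using kubectl wait on the cert-manager deployments (the 180s timeout is an arbitrary choice):

```shell
# Block until every cert-manager deployment reports Available, up to 3 minutes.
kubectl wait --for=condition=Available deployment --all \
  --namespace cert-manager --timeout=180s
```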


4.5 — Install Rancher

Create the cattle-system namespace:

kubectl create namespace cattle-system

Install Rancher via Helm:

helm install rancher rancher-stable/rancher \
  --namespace cattle-system \
  --set hostname=rancher.home \
  --set bootstrapPassword=admin \
  --set ingress.tls.source=rancher

Wait for the rollout to complete:

kubectl rollout status deployment/rancher --namespace cattle-system

Expected output:

deployment "rancher" successfully rolled out

Verify all Rancher pods are running:

kubectl get pods --namespace cattle-system

Expected output:

NAME                               READY   STATUS      RESTARTS   AGE
helm-operation-xxxx                2/2     Running     0          Xm
rancher-xxxx                       1/1     Running     0          Xm
rancher-xxxx                       1/1     Running     0          Xm
rancher-xxxx                       1/1     Running     0          Xm
rancher-webhook-xxxx               1/1     Running     0          Xm

helm-operation pods with Completed status are normal — these are post-install jobs run by Rancher and can be ignored.

⚠️ Hostname note: Rancher's nginx ingress only responds to requests matching the configured hostname. If you add a CNAME or alternate DNS entry pointing to the same IP but with a different hostname, nginx will return a default error page rather than routing to Rancher. If you need to change the hostname after installation, run a helm upgrade with the new hostname value (see below).
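The hostname matching described above can be checked from any machine without touching DNS, using curl's --resolve to pin the configured hostname to the node IP. A sketch; -k accepts the self-signed certificate, and /ping is Rancher's health endpoint:

```shell
# Request Rancher's health endpoint via the configured hostname,
# bypassing DNS by mapping rancher.home to the control-plane IP.
curl -k --resolve rancher.home:443:192.168.2.44 https://rancher.home/ping
# A healthy install answers: pong
# Repeating the request with a hostname Rancher was not configured for
# returns the ingress controller's default error page instead.
```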


4.6 — Changing the Rancher Hostname After Install

If you need to update the hostname Rancher responds to (e.g. changing from rancher.local to rancher.home), run a Helm upgrade:

helm upgrade rancher rancher-stable/rancher \
  --namespace cattle-system \
  --set hostname=rancher.home \
  --set bootstrapPassword=admin \
  --set ingress.tls.source=rancher

Then wait for the rollout:

kubectl rollout status deployment/rancher --namespace cattle-system

Verify the ingress picked up the new hostname:

kubectl get ingress --namespace cattle-system

Part 5 — First Login

Navigate to the Rancher UI in a browser:

https://rancher.home

You will see a self-signed certificate warning — this is expected. Accept/proceed through it.

On first login:

  1. Enter the bootstrap password: admin
  2. Set a new permanent password when prompted
  3. Confirm the server URL as https://rancher.home when prompted

Troubleshooting Reference

  • Symptom: Repeated 502 errors in the rke2-server journal, targeting a 10.42.x.x IP
    Likely cause: node-ip not set; kubelet binding to the pod network interface
    Resolution: Add node-ip to /etc/rancher/rke2/config.yaml and restart the service

  • Symptom: Worker node stuck in NodeStatusUnknown / "Kubelet stopped posting node status"
    Likely cause: Agent joined during a broken control-plane run, or node-ip missing on the worker
    Resolution: Stop rke2-agent, fix config.yaml with the correct node-ip, restart the agent

  • Symptom: Rancher UI returns an nginx error page when accessed via an alternate hostname
    Likely cause: The ingress only matches the configured hostname value
    Resolution: Run helm upgrade with the correct hostname, or add the alternate hostname via upgrade

  • Symptom: cert-manager webhook errors during the Rancher install
    Likely cause: cert-manager not fully ready before the Rancher install started
    Resolution: Wait for all 3 cert-manager pods to be Running before installing Rancher

  • Symptom: Worker node never appears in kubectl get nodes
    Likely cause: Wrong token or wrong control-plane IP in the worker's config.yaml
    Resolution: Verify the token and server IP, then restart rke2-agent