Self-hosted Kubernetes load balancing with Cilium

I have switched from the Calico CNI to Cilium, mostly because Cilium comes with the ability to create a LoadBalancer type of Service without installing any third-party application, which is not possible with Calico alone. With Calico I was using MetalLB to get that ability.

To test this I created a Kubernetes cluster using kind, with the following config:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4                               
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
networking:
  disableDefaultCNI: true

That creates a 4-node cluster, one control plane and three worker nodes, without a CNI installed.
To create it, I ran the following command:

#  kind create cluster --name test --config=kind-config.yaml
Creating cluster "test" ...
 ✓ Ensuring node image (kindest/node:v1.27.3) 🖼
 ✓ Preparing nodes 📦 📦 📦 📦  
 ✓ Writing configuration 📜 
 ✓ Starting control-plane 🕹️ 
 ✓ Installing StorageClass 💾 
 ✓ Joining worker nodes 🚜 
Set kubectl context to "kind-test"
You can now use your cluster with:

kubectl cluster-info --context kind-test
#  kubectl get nodes -o wide               
NAME                 STATUS     ROLES           AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION    CONTAINER-RUNTIME
test-control-plane   NotReady   control-plane   106s   v1.27.3   172.18.0.5    <none>        Debian GNU/Linux 11 (bullseye)   6.6.26-linuxkit   containerd://1.7.1
test-worker          NotReady   <none>          83s    v1.27.3   172.18.0.2    <none>        Debian GNU/Linux 11 (bullseye)   6.6.26-linuxkit   containerd://1.7.1
test-worker2         NotReady   <none>          83s    v1.27.3   172.18.0.4    <none>        Debian GNU/Linux 11 (bullseye)   6.6.26-linuxkit   containerd://1.7.1
test-worker3         NotReady   <none>          84s    v1.27.3   172.18.0.3    <none>        Debian GNU/Linux 11 (bullseye)   6.6.26-linuxkit   containerd://1.7.1

As can be seen, the nodes are in the NotReady state because they don't have a CNI deployed. To deploy the Cilium CNI, we need to download the cilium CLI tool and run cilium install:
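
If the CLI is not installed yet, a minimal way to fetch it on Linux, roughly following the upstream install snippet (adjust CLI_ARCH for your machine, e.g. arm64), is:

# pick the latest stable cilium-cli release and unpack it into /usr/local/bin
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin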

# cilium install --version 1.15.4
🔮 Auto-detected Kubernetes kind: kind
✨ Running "kind" validation checks
✅ Detected kind version "0.20.0"
ℹ️  Using Cilium version 1.15.4
🔮 Auto-detected cluster name: kind-test
🔮 Auto-detected kube-proxy has been installed

To check the status of the CNI, we can run cilium status:

# cilium status --wait
    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Envoy DaemonSet:    disabled (using embedded mode)
 \__/¯¯\__/    Hubble Relay:       disabled
    \__/       ClusterMesh:        disabled

Deployment             cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
DaemonSet              cilium             Desired: 4, Ready: 4/4, Available: 4/4
Containers:            cilium             Running: 4
                       cilium-operator    Running: 1
Cluster Pods:          3/3 managed by Cilium
Helm chart version:    
Image versions         cilium             quay.io/cilium/cilium:v1.15.4@sha256:b760a4831f5aab71c711f7537a107b751d0d0ce90dd32d8b358df3c5da385426: 4
                       cilium-operator    quay.io/cilium/operator-generic:v1.15.4@sha256:404890a83cca3f28829eb7e54c1564bb6904708cdb7be04ebe69c2b60f164e9a: 1

Optionally, if we want to make absolutely sure everything is working, we can run a connectivity test, which takes quite a lot of time:

# cilium connectivity test

To save time I will skip this step. Let's check the node status again:

# kubectl get nodes -o wide
NAME                 STATUS   ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION    CONTAINER-RUNTIME
test-control-plane   Ready    control-plane   9m35s   v1.27.3   172.18.0.5    <none>        Debian GNU/Linux 11 (bullseye)   6.6.26-linuxkit   containerd://1.7.1
test-worker          Ready    <none>          9m12s   v1.27.3   172.18.0.2    <none>        Debian GNU/Linux 11 (bullseye)   6.6.26-linuxkit   containerd://1.7.1
test-worker2         Ready    <none>          9m12s   v1.27.3   172.18.0.4    <none>        Debian GNU/Linux 11 (bullseye)   6.6.26-linuxkit   containerd://1.7.1
test-worker3         Ready    <none>          9m13s   v1.27.3   172.18.0.3    <none>        Debian GNU/Linux 11 (bullseye)   6.6.26-linuxkit   containerd://1.7.1

As can be seen, the nodes are in the Ready state and everything looks right.

The next step is to configure the load balancer IP pool that we want to use with this cluster. Since this cluster is running in Docker, I will use a range from the Docker network, which is 172.18.0.0/24.
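
If you are not sure which subnet that Docker network uses, one way to check it (assuming kind's default network name, kind) is:

# docker network inspect kind -f '{{range .IPAM.Config}}{{.Subnet}} {{end}}'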

# cat << EOF | kubectl apply -f -                   
apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: "test-pool"
spec:
  cidrs:
  - cidr: "172.18.0.0/24"
EOF
#  kubectl get CiliumLoadBalancerIPPool
NAME        DISABLED   CONFLICTING   IPS AVAILABLE   AGE
test-pool   false      False         254             5s

I have created an IP pool for the load balancer to use.
Note: By default the load balancer will be assigned the first available IP address in the pool, and if you are not careful that may conflict with a gateway or with a Docker container that already has the same IP address assigned. We can, however, manually choose the IP address we want to assign to the load balancer by using annotations.
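
For example, a pool restricted to the upper part of that subnet, staying clear of the gateway and the node addresses, could look like this (the /26 range is only an illustration; pick one that is free in your Docker network):

apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: "test-pool"
spec:
  cidrs:
  - cidr: "172.18.0.192/26"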

For testing purposes I will deploy an nginx application with a LoadBalancer Service and check whether I can connect to nginx through the load balancer.

# cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:     
    role: nginx                            
spec:
  replicas: 3
  selector:    
    matchLabels:
      role: nginx  
  template:   
    metadata:       
      labels:       
        role: nginx
    spec:           
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
EOF

#  kubectl get pods                    
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-7479b9f975-49hfk   1/1     Running   0          95s
nginx-deployment-7479b9f975-5th4d   1/1     Running   0          95s
nginx-deployment-7479b9f975-8xg9m   1/1     Running   0          95s

The nginx application is up and running; now let's create the LoadBalancer type of Service.

# cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    "io.cilium/lb-ipam-ips": "172.18.0.200"
spec:
  selector:
    role: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: LoadBalancer
EOF
#  kubectl get svc                   
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)        AGE
kubernetes   ClusterIP      10.96.0.1      <none>         443/TCP        26m
my-service   LoadBalancer   10.96.225.56   172.18.0.200   80:31457/TCP   4s

As can be seen, the LoadBalancer type of Service was created and got assigned the EXTERNAL-IP that I chose and added via the annotation in the my-service manifest. It is time to test the application.
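
If you want to grab just the assigned address for scripting, a standard kubectl jsonpath query (nothing Cilium-specific) does the job:

# kubectl get svc my-service -o jsonpath='{.status.loadBalancer.ingress[0].ip}'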

Since this is running in Docker, there is no direct access to the cluster without modifying the cluster config and redeploying it. For that reason I will be using the node-shell plugin with kubectl to connect to one of the nodes and run curl against the load balancer IP address.
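
The node-shell plugin is not part of kubectl itself; if it is missing, it can be installed via krew (assuming krew is already set up):

# kubectl krew install node-shell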

# kubectl node-shell test-worker -- curl http://172.18.0.200
spawning "nsenter-dfpjch" on "test-worker"
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
pod "nsenter-dfpjch" deleted

As a result, I have successfully accessed the load balancer and displayed the content of the web page served by the nginx application running in the Kubernetes cluster, via the LoadBalancer type of Service called my-service.