cluster

Deploys a Kubernetes cluster with Cluster API and KubeVirt.

Includes the following addons:

Prerequisites

The chart requires the following addons to be installed on the host cluster:

This chart is built specifically for the Perseus Cloud project. Unless you are running a cluster that is part of the project, it will likely not work for you.

The chart requires a virtual machine image built with the Kubernetes image builder project and available for download over HTTP(S). Make sure to set config.image to the URL of the virtual machine image.
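Before installing, it can help to confirm that the image URL is actually downloadable. The following is a minimal sketch, not part of the chart: image_reachable is a hypothetical helper name, and it assumes the server hosting the image answers HTTP HEAD requests.

```shell
# Hypothetical helper (not provided by the chart): succeeds when the
# image URL answers a HEAD request with a non-error status.
image_reachable() {
  # -f: fail on HTTP errors, -s: silent, -I: HEAD request only,
  # -L: follow redirects, --max-time: give up after 30 seconds
  curl -fsIL --max-time 30 "$1" > /dev/null
}

# Usage, with the documented default of config.image:
# image_reachable "https://objects.infra.f6n.io/images/rocky-10.1-k8s-1.34.2-20251221.qcow2" \
#   && echo "image is reachable"
```

If the server rejects HEAD requests, the check fails even though a normal download would work, so treat a failure as a hint rather than proof.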

Secrets containing a Cloudflare token and CoreWarden API credentials are required on the host cluster. Make sure to create the secrets, specify their namespace in config.externalSecrets.remoteNamespace, and authorize access to them via config.rbac.

Tutorial

This tutorial will take you through creating a cluster on the Perseus Cloud platform.

The cluster chart needs a domain name for Gateway, Cert Manager and External DNS functionality. This tutorial will use example.com.

Create a Cloudflare DNS token with edit access to the DNS zone of the domain you will be using for the cluster. Insert it into a secret in the tenant-secrets namespace:

apiVersion: v1
kind: Secret
metadata:
  name: cert-manager
  namespace: tenant-secrets
type: Opaque
data:
  cloudflareToken: REDACTED

Create a CoreWarden DNS service account with edit access to the DNS zone of the domain you will be using for the cluster. Insert it into a secret in the tenant-secrets namespace:

apiVersion: v1
kind: Secret
metadata:
  name: external-dns
  namespace: tenant-secrets
type: Opaque
data:
  id: REDACTED
  secret: REDACTED
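Values under the data field of a Secret must be base64-encoded. The sketch below shows one way to produce them; encode_secret_value is a hypothetical helper name, not something the chart provides.

```shell
# Hypothetical helper: base64-encode a literal for a Secret's data field.
# printf avoids the trailing newline that echo would add to the value,
# and tr removes line wrapping that some base64 implementations insert.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Usage (CLOUDFLARE_TOKEN is an assumed environment variable):
# encode_secret_value "$CLOUDFLARE_TOKEN"
```

Alternatively, kubectl create secret generic with --from-literal encodes the values for you, which avoids handling base64 by hand.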

Create a values file with the following content. Replace example.com with your domain name.

# Inside values.yaml
config:
  externalDNSWebhook:
    zones:
      - example.com.
  gateway:
    hostname: "*.example.com"

Install the chart:

helm upgrade --install cluster oci://ghcr.io/sneakybugs/cluster --version 6.0.2 --values values.yaml

Wait for all Argo Applications to become ready.
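One way to check readiness from the command line is to list the Applications that are not yet synced and healthy. This is a sketch under assumptions: unready_applications is a hypothetical helper, and it assumes the Applications live in the argocd namespace (the documented default of argocdNamespace) and expose the usual status.sync.status and status.health.status fields.

```shell
# Hypothetical helper: print the names of Argo CD Applications that are
# not yet both Synced and Healthy. Prints nothing when all are ready.
unready_applications() {
  namespace="${1:-argocd}"
  kubectl get applications.argoproj.io -n "$namespace" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.sync.status}{" "}{.status.health.status}{"\n"}{end}' |
    awk '$2 != "Synced" || $3 != "Healthy" { print $1 }'
}

# Usage: poll until the list is empty.
# while [ -n "$(unready_applications argocd)" ]; do sleep 10; done
```

An Application whose status has not been populated yet is reported as unready, which is the safe default while the install is still in progress.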

Get the kubeconfig and use it:

kubectl get secret cluster-kubeconfig -o json | jq -r .data.value | base64 -d > tenant.conf
export KUBECONFIG="$(pwd)/tenant.conf"

Now you can access the tenant cluster API server:

kubectl get pods -A

You now have a fully functional cluster.

How-to guides

How to upgrade a cluster’s Kubernetes version

This guide assumes you have a cluster running Kubernetes v1.34.2, installed with the release name my-cluster.

Edit the config.version and config.image fields of your values to contain a new version and new VM image for that version:

# Inside values.yaml
# ...
config:
  version: v1.35.0
  image: https://objects.infra.f6n.io/images/rocky-10.1-k8s-1.35.0-5986a60-20260102.qcow2

Applying the new values will cause a rollout of both control plane and worker nodes. Upgrade the chart to apply the new values:

helm upgrade my-cluster oci://ghcr.io/sneakybugs/cluster --version 6.0.2 --values values.yaml

Use the following command on the tenant cluster to see how many nodes have been rolled to the new version:

kubectl get nodes

Once all nodes of the tenant cluster report the new version, the upgrade is complete.
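To track the rollout as a single number instead of reading the node list, the kubelet version of each node can be counted. This is a minimal sketch; nodes_at_version is a hypothetical helper and it assumes KUBECONFIG points at the tenant cluster.

```shell
# Hypothetical helper: print "<updated>/<total>" for nodes whose kubelet
# reports the target Kubernetes version.
nodes_at_version() {
  target="$1"
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' |
    awk -v target="$target" '
      NF { total++; if ($1 == target) updated++ }
      END { printf "%d/%d\n", updated, total }'
}

# Usage:
# nodes_at_version v1.35.0
```

During a rollout the total can temporarily exceed the configured replica count, because replacement nodes join before old ones are removed.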

How to restore a cluster from a backup

This guide assumes you want to restore from a cluster called old-cluster in the example namespace.

Copy the values of the old-cluster Helm release to a file named values.yaml. Override the following values:

# Inside values.yaml
# ...
config:
  cephCSIRBD:
    existingRadosNamespace: example-old-cluster
  velero:
    storage:
      prefix: example-old-cluster
      existingObjectBucketClaim: example-old-cluster-velero
      existingObjectBucketUser: example-old-cluster-velero-backup
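The values of the old release can be exported with helm get values rather than copied by hand. The sketch below wraps that command; export_release_values is a hypothetical helper name, and the release and namespace are the ones assumed by this guide.

```shell
# Hypothetical helper: write the user-supplied values of a Helm release
# to a file. --output yaml omits the "USER-SUPPLIED VALUES:" header that
# the default table output prints.
export_release_values() {
  release="$1"
  namespace="$2"
  outfile="${3:-values.yaml}"
  helm get values "$release" --namespace "$namespace" --output yaml > "$outfile"
}

# Usage:
# export_release_values old-cluster example values.yaml
```

After exporting, add the overrides shown above to the resulting file before installing the new release.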

Install the chart with a new name. We will use new-cluster for this guide.

helm upgrade --install new-cluster oci://ghcr.io/sneakybugs/cluster --version 6.0.2 --values values.yaml

Wait for all Argo Applications to become synced and healthy. Get the new cluster’s kubeconfig:

kubectl get secret new-cluster-kubeconfig -o json | jq -r .data.value | base64 -d > tenant.conf
export KUBECONFIG="$(pwd)/tenant.conf"

Create a restore.yaml file with the following content. For the purpose of this guide we will restore from a backup called my-backup.

# Inside restore.yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: old-cluster-restore
spec:
  backupName: my-backup

Create the restore resource:

kubectl apply -f restore.yaml

Wait until the restore is completed. You now have a restored replica of old-cluster.
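Waiting for the restore can be scripted by polling the phase of the Restore resource. This is a sketch under assumptions: wait_for_restore is a hypothetical helper, and the namespace argument must be whichever namespace the Restore was created in, which this guide does not specify.

```shell
# Hypothetical helper: poll a Velero Restore until it reaches a terminal
# phase. Returns 0 on Completed, 1 on a failed phase.
wait_for_restore() {
  name="$1"
  namespace="$2"
  interval="${3:-10}"   # seconds between polls
  while :; do
    phase="$(kubectl get restores.velero.io "$name" -n "$namespace" \
      -o jsonpath='{.status.phase}')"
    case "$phase" in
      Completed)
        echo "restore $name completed"
        return 0 ;;
      Failed|PartiallyFailed|FailedValidation)
        echo "restore $name ended in phase $phase" >&2
        return 1 ;;
      *)
        sleep "$interval" ;;
    esac
  done
}

# Usage (the velero namespace here is an assumption):
# wait_for_restore old-cluster-restore velero
```

A PartiallyFailed restore may still have recovered most resources, so inspect the Restore's status before retrying.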

How to destroy a cluster

To destroy a cluster uninstall the chart:

helm uninstall my-cluster

When features.backups is enabled, the chart keeps a CephBlockPoolRadosNamespace, an ObjectBucketClaim and a CephObjectStoreUser after it is uninstalled. Use the following commands to identify resources that were kept:

kubectl get -n rook-ceph cephblockpoolradosnamespace
kubectl get -n rook-ceph objectbucketclaim
kubectl get -n rook-ceph cephobjectstoreuser

Only delete the CephBlockPoolRadosNamespace and ObjectBucketClaim IF YOU WANT TO LOSE BACKUP DATA.

Configuration reference

| Parameter | Description | Default |
|---|---|---|
| nameOverride | Override chart name. | "" |
| fullnameOverride | Override full release name. | "" |
| argocdNamespace | Namespace to deploy Argo CD resources to. | "argocd" |
| versions.calico | Calico version to deploy. | "v3.30.3" |
| versions.certManager | Cert Manager version to deploy. | "v1.19.1" |
| versions.components | cluster-components chart version to deploy. | "9.0.0" |
| versions.telemetryExporterComponents | telemetry-exporter-components chart version to deploy. | "4.0.0" |
| versions.externalDNS | ExternalDNS version to deploy. | "1.19.0" |
| versions.externalSecrets | External Secrets version to deploy. | "0.20.3" |
| versions.envoyGateway | Envoy Gateway version to deploy. | "1.5.4" |
| versions.kubeStateMetrics | Kube State Metrics chart version to deploy. | "6.3.0" |
| versions.kubePrometheusStack | kube-prometheus-stack chart version to deploy. | "78.2.1" |
| versions.prometheusNodeExporter | Node exporter chart version to deploy. | "4.48.0" |
| versions.prometheusOperatorCRDs | Prometheus Operator CRDs chart version to deploy. | "24.0.1" |
| versions.openTelemetryOperator | OpenTelemetry Operator chart version to deploy. | "0.97.1" |
| versions.velero | Velero chart version to deploy. | "11.1.1" |
| versions.veleroPluginForAWS | Velero AWS plugin version to deploy. | "1.12.2" |
| versions.kro | Kro version to deploy. | "0.7.1" |
| versions.cephCSIRBD | ceph-csi-rbd version to deploy. | "3.15.0" |
| versions.metricsServer | Metrics server version to deploy. | "3.13.0" |
| versions.externalSnapshotter | external-snapshotter version to deploy. | "v8.4.0" |
| features.backups | Enable Velero backups when true. When enabled CephBlockPoolRadosNamespace, ObjectBucketClaim and CephObjectStoreUser are kept when deleting the chart for use when restoring backups. | true |
| features.telemetryExporter | Enable OpenTelemetry exporter when true. | true |
| features.kro | Enable Kro when true. | true |
| features.components | Disable cluster-components chart when false. | true |
| features.exporterComponents | Disable telemetry-exporter-components chart when false. | true |
| config.podSubnet | Pod subnet to use. | "10.243.0.0/16" |
| config.serviceSubnet | Service subnet to use. | "10.95.0.0/16" |
| config.version | Kubernetes version of the cluster. Triggers control plane rollout. | "v1.34.2" |
| config.image | Node image to use. Triggers node rollout when changed. | https://objects.infra.f6n.io/images/rocky-10.1-k8s-1.34.2-20251221.qcow2 |
| config.controlPlane.replicas | Control plane node count. | 1 |
| config.controlPlane.resources.storage | Control plane node disk size. Triggers node rollout when changed. | "16Gi" |
| config.controlPlane.resources.cores | Control plane node core count. Triggers node rollout when changed. | 2 |
| config.controlPlane.resources.memory | Control plane node RAM size. Triggers node rollout when changed. | "4Gi" |
| config.workers.replicas | Worker node count. | 1 |
| config.workers.resources.storage | Worker node disk size. Triggers node rollout when changed. | "32Gi" |
| config.workers.resources.cores | Worker node core count. Triggers node rollout when changed. | 4 |
| config.workers.resources.memory | Worker node RAM size. Triggers node rollout when changed. | "8Gi" |
| config.cephCSIRBD.existingRadosNamespace | When specified the cluster will use an existing Rados namespace instead of creating one. | null |
| config.cephCSIRBD.rookNamespace | Namespace Rook is installed in on the management cluster. | "rook-ceph" |
| config.cephCSIRBD.blockPoolName | CephBlockPool name to use for volumes. | "ceph-blockpool" |
| config.cephCSIRBD.cephMonitors | Ceph mon endpoints for connecting to the Ceph cluster. | ["10.1.0.10:6789"] |
| config.imageRegistries | Image registry overrides. Require node rollout to take effect. | [{"prefix": "docker.io", "location": "oci.infra.f6n.io/docker"}, {"prefix": "quay.io", "location": "oci.infra.f6n.io/quay"}, {"prefix": "ghcr.io", "location": "oci.infra.f6n.io/ghcr"}, {"prefix": "registry.k8s.io", "location": "oci.infra.f6n.io/k8s"}, {"prefix": "oci.external-secrets.io", "location": "oci.infra.f6n.io/external-secrets"}] |
| config.gateway.listeners | List of Gateway listeners. Must specify hostname, can optionally specify protocol, port and tls. | [{"hostname": "*.example.com"}] |
| config.otlpExporter.endpoint | Centralized OpenTelemetry Collector endpoint to export telemetry to. | "otel.ops.f6n.io:4317" |
| config.externalSecrets.remoteNamespace | | "tenant-secrets" |
| config.externalSecrets.url | | https://10.1.0.10:6443 |
| config.externalDNSWebhook.repository | | "ghcr.io/sneakybugs/corewarden-externaldns-provider" |
| config.externalDNSWebhook.apiEndpoint | DNS API server endpoint. | https://dns.infra.f6n.io/v1 |
| config.externalDNSWebhook.tag | | "4.1.3" |
| config.certManager | Values for cluster-components chart certManager field. | {} |
| config.velero.storage.rookNamespace | Rook namespace to use for backup ObjectBucketClaim in the management cluster. | "rook-ceph" |
| config.velero.storage.cephObjectStoreName | Rook CephObjectStore to use for the bucket in the management cluster. | "ceph-objectstore" |
| config.velero.storage.cephObjectBucketStorageClassName | Rook StorageClass name to use for the bucket in the management cluster. | "ceph-bucket" |
| config.velero.storage.s3Url | S3 endpoint URL for backup storage. | https://objects.infra.f6n.io |
| config.velero.storage.prefix | Prefix to store backups in the S3 bucket, defaults to : if unspecified. | "" |
| config.velero.storage.accessMode | Backup location access mode, ReadWrite or ReadOnly. | "ReadWrite" |
| config.velero.storage.existingObjectBucketClaim | | null |
| config.velero.storage.existingObjectBucketUser | | null |
| config.velero.backup.schedule | Cron schedule to perform backups at. | "0 4 * * *" |
| config.velero.backup.ttl | Time to keep backups. | "720h" |
| config.rbac.namespace | Namespace for role in the management cluster. | "tenant-secrets" |
| config.rbac.rules | Role rules for the namespace in the management cluster. | [{"apiGroups": [""], "resources": ["secrets"], "verbs": ["get", "list", "watch"], "resourceNames": ["cert-manager", "external-dns", "backup"]}, {"apiGroups": ["authorization.k8s.io"], "resources": ["selfsubjectrulesreviews"], "verbs": ["create"]}] |
