My homelab k8s cluster πŸš€

... automated via Flux, Renovate, and GitHub Actions πŸ€–

Talos · Kubernetes · Flux · Renovate

Home-Internet · Status-Page · Alertmanager

Age-Days · Uptime-Days · Node-Count · Pod-Count · CPU-Usage · Memory-Usage · Power-Usage


✏ Overview

This is a repository for my home infrastructure and Kubernetes cluster. I try to adhere to Infrastructure as Code (IaC) and GitOps practices using tools like Kubernetes, Flux, Renovate, and GitHub Actions.


🌱 Kubernetes

This hyper-converged cluster runs Talos Linux, an immutable and ephemeral Linux distribution tailored for Kubernetes, and is deployed on bare-metal Minisforum MS-01 mini-PCs. Persistent storage is currently provided via Rook, which enables resilient block, file, and object storage within the cluster. A Synology NAS handles media file storage and backups, and is also available as an alternate storage location (with the help of a custom fork of the official Synology CSI) for workloads that should not be hyper-converged. The cluster is designed to enable a full teardown without any data loss.
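
For illustration, a minimal sketch of how a workload might claim Rook-backed block storage – the ceph-block StorageClass name here is an assumption, not necessarily what this cluster uses:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                  # hypothetical claim name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ceph-block    # assumed Rook-Ceph RBD StorageClass
  resources:
    requests:
      storage: 10Gi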

πŸ”Έ Click here to see my Talos configuration.

There is a template at onedr0p/cluster-template if you want to follow along with many of the practices I use here.

Core Components

  • cert-manager: Manage SSL certificates for services in my cluster.
  • cilium: eBPF-based networking for my workloads.
  • cloudflared: Enables secure access to my services via Cloudflare Tunnel.
  • external-dns: Automatically syncs ingress DNS records to a DNS provider.
  • external-secrets: Manages Kubernetes secrets using 1Password Connect.
  • ingress-nginx: Kubernetes ingress controller using NGINX as a reverse proxy and load balancer.
  • rook: Distributed block, file, and object storage for stateful workloads.
  • spegel: Stateless cluster-local OCI registry mirror.
  • volsync: Backup and recovery of persistent volume claims.

GitOps

Flux monitors my kubernetes folder (see Directories below) and applies changes to my cluster based on those YAML manifests.

Flux operates by recursively searching the kubernetes/apps folder until it locates the top-level kustomization.yaml in each directory, then applies all of the resources listed in that file. This kustomization.yaml typically contains a namespace resource and one or more Flux kustomizations, which in turn usually include a HelmRelease or other application-related resources.
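
As a rough sketch (the app name and paths are illustrative, not taken verbatim from this repo), such a top-level kustomization.yaml might look like:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./namespace.yaml   # the Namespace for this group of apps
  - ./atuin/ks.yaml    # a Flux Kustomization for one app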

Renovate monitors my entire repository for dependency updates, automatically creating a PR when updates are found. When the relevant PRs are merged, Flux then applies the changes to my cluster.

Directories

This Git repository contains the following directories under kubernetes/.

πŸ“ kubernetes
β”œβ”€β”€ πŸ“ apps           # applications
β”œβ”€β”€ πŸ“ bootstrap      # bootstrap procedures
β”œβ”€β”€ πŸ“ components     # reusable kustomize components
└── πŸ“ flux           # core flux configuration

Cluster layout

This is a high-level look at how Flux deploys my applications and handles dependencies. Below are three Flux kustomizations: cloudnative-pg, postgres-cluster, and atuin. cloudnative-pg must be running and healthy before postgres-cluster is deployed; once postgres-cluster is healthy, atuin is deployed.

graph TD;
  id1>Kustomization: cluster] -->|Creates| id2>Kustomization: cluster-apps];
  id2>Kustomization: cluster-apps] -->|Creates| id3>Kustomization: cloudnative-pg];
  id2>Kustomization: cluster-apps] -->|Creates| id5>Kustomization: postgres-cluster];
  id2>Kustomization: cluster-apps] -->|Creates| id8>Kustomization: atuin];
  id3>Kustomization: cloudnative-pg] -->|Creates| id4[HelmRelease: cloudnative-pg];
  id5>Kustomization: postgres-cluster] -->|Depends on| id3>Kustomization: cloudnative-pg];
  id5>Kustomization: postgres-cluster] -->|Creates| id10[Postgres Cluster];
  id8>Kustomization: atuin] -->|Creates| id9[HelmRelease: atuin];
  id8>Kustomization: atuin] -->|Depends on| id5>Kustomization: postgres-cluster];
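
That ordering is expressed through Flux's dependsOn field together with health checks. A minimal sketch, with names mirroring the graph above (the path and source name are assumptions):

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: atuin
  namespace: flux-system
spec:
  dependsOn:
    - name: postgres-cluster   # reconcile only after postgres-cluster is ready
  interval: 30m
  path: ./kubernetes/apps/default/atuin/app   # assumed path
  prune: true
  wait: true                   # report Ready only once all applied resources are healthy
  sourceRef:
    kind: GitRepository
    name: homelab              # assumed GitRepository name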

🌎 Networking & DNS

Click to see a high-level physical network diagram.

Apps hosted on my cluster are exposed using any combination of three different methods, depending on their use case, security requirements, and intended audience. All three methods use fully encrypted HTTPS connections – TLS certificates are automatically provisioned and renewed by cert-manager for each application.
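
As a sketch of what that certificate automation could look like – the issuer name, email, and the choice of a Cloudflare DNS-01 solver are assumptions consistent with the Cloudflare usage described below, not confirmed details of this cluster:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production     # hypothetical issuer name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com       # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-production
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:     # token stored as a Kubernetes Secret
              name: cloudflare-api-token
              key: api-token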

Local Network

The first and easiest way an app can be exposed is strictly on my local network. This is most often used for apps and services that have to do with home automation – given that every smart home device is on my local network, there is no need to expose a supporting service like MQTT any further than that.

Local-only exposure is accomplished by creating an Ingress with the internal class, which registers a virtual IP for the service in a designated subnet (advertised via BGP) and provisions a DNS record on the router via the ExternalDNS webhook provider for UniFi.
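
A minimal sketch of such an internal Ingress (the app and hostname are hypothetical):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: emqx                    # hypothetical MQTT-adjacent service
spec:
  ingressClassName: internal    # LAN-only ingress class
  rules:
    - host: mqtt.example.com    # placeholder domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: emqx
                port:
                  number: 80
  tls:
    - hosts: ["mqtt.example.com"]
      secretName: emqx-tls      # certificate provisioned by cert-manager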

Privately Exposed (Tailscale)

The second and most common way an app can be exposed is via Tailscale. Creating an Ingress with the tailscale class exposes the application to my Tailnet and automagically configures DNS records. Most self-hosted apps and dashboards are exposed using this Ingress class so that they are accessible on my personal devices at a consistent URL, whether I'm at home or abroad.
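
With the Tailscale Kubernetes operator, such an Ingress can be as small as the sketch below – the service name is hypothetical, and the host under tls becomes the MagicDNS name on the tailnet:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana                 # hypothetical dashboard
spec:
  ingressClassName: tailscale
  defaultBackend:
    service:
      name: grafana
      port:
        number: 80
  tls:
    - hosts:
        - grafana               # exposed as grafana.<tailnet>.ts.net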

Tailscale also serves as a Kubernetes auth proxy, which I use in conjunction with the Nautik iOS app to monitor and administer my Kubernetes cluster on-the-go.

Publicly Exposed

The final and least common way to expose an app is via cloudflared, the Cloudflare Tunnel daemon. By routing all external traffic through Cloudflare's infrastructure, I gain the benefits of their global security infrastructure (notably DDoS protection). This is generally used for webhook endpoints which require access from the wider Internet, though I do expose a select few apps for friends and family.

Creating an Ingress with the external class triggers ExternalDNS to provision a CNAME record on Cloudflare that points at the Cloudflare Tunnel endpoint. The tunnel routes traffic securely into my cluster, where the ingress controller routes it onward to the destination Service.
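
Sketched out, the notable addition is the ExternalDNS target annotation – the tunnel ID, app, and hostname are placeholders:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: webhook-receiver        # hypothetical public endpoint
  annotations:
    # ExternalDNS creates a CNAME at this host pointing to the tunnel
    external-dns.alpha.kubernetes.io/target: <tunnel-id>.cfargotunnel.com
spec:
  ingressClassName: external
  rules:
    - host: hooks.example.com   # placeholder domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: webhook-receiver
                port:
                  number: 8080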


☁️ Cloud Dependencies

While most of my infrastructure and workloads are self-hosted, I do rely on the cloud for certain key parts of my setup. This saves me from having to worry about three things:

  1. Chicken-and-egg bootstrapping scenarios.
  2. Keeping critical services accessible whether or not my cluster is online.
  3. The "hit by a bus" scenario – what happens to the critical apps (e.g. email, password manager, photos) that my friends and family rely on when I'm no longer around.

An alternative solution to the first two of these problems would be to host a Kubernetes cluster in the cloud and deploy applications like Vault, Vaultwarden, ntfy, and Gatus; however, maintaining another cluster and monitoring another group of workloads would frankly be more time and effort than I am willing to put in, and would probably cost about as much as the services described below.

| Service | Use | Cost |
|---|---|---|
| 1Password | Secrets with External Secrets | ~$36/yr |
| Cloudflare | Domain/DNS | ~$24/yr |
| Backblaze | S3-compatible object storage | ~$36/yr |
| GitHub | Hosting this repository and continuous integration/deployments | Free |
| Pushover | Kubernetes alerts and application notifications | $5 OTP |
| UptimeRobot | Monitoring internet connectivity and external-facing applications | Free |
| Healthchecks.io | Dead man's switch for monitoring cron jobs | Free |
| | Total | ~$10/mo |

βš™ Hardware

Click to see my rack


| Device | Count | OS Disk | Data Disk | RAM | OS | Purpose |
|---|---|---|---|---|---|---|
| MS-01 (i9-12900H) | 3 | 1TB M.2 SSD | 2TB M.2 SSD (Rook) | 96GB | Talos Linux | Kubernetes |
| Synology DS918+ | 1 | - | 2x14TB HDD + 2x18TB HDD + 2x1TB SSD R/W cache | 16GB | DSM 7 | NAS/NFS/Backup |
| JetKVM | 2 | - | - | - | - | KVM |
| Home Assistant Yellow | 1 | 8GB eMMC | 1TB M.2 SSD | 4GB | HAOS | Home Automation |
| UniFi UDM Pro | 1 | - | - | - | UniFi OS | Router |
| UniFi USW Pro 24 PoE | 1 | - | - | - | UniFi OS | Core Switch |
| UniFi USP PDU Pro | 1 | - | - | - | UniFi OS | PDU |
| CyberPower OR500LCDRM1U | 1 | - | - | - | - | UPS |

πŸ™ Gratitude and Thanks

Huge thank-you to the folks over at the Home Operations community, especially @onedr0p, @bjw-s, and @buroa – their home-ops repos have been an amazing resource to draw upon.

Be sure to check out kubesearch.dev for further ideas and reference for deploying applications on Kubernetes.


🚧 Changelog

See the latest release notes.


βš– License

See LICENSE.
