Docs update #266

Merged
merged 13 commits into from
Jan 9, 2025
38 changes: 12 additions & 26 deletions README.md
@@ -31,7 +31,7 @@ Ansible RKE2 (RKE Government) Playbook
---------
[![LINT](https://github.com/rancherfederal/rke2-ansible/actions/workflows/ci.yml/badge.svg)](https://github.com/rancherfederal/rke2-ansible/actions/workflows/ci.yml)

RKE2, also known as RKE Government, is Rancher's next-generation Kubernetes distribution. This Ansible playbook installs RKE2 for both the control plane and workers.

See the [docs](https://docs.rke2.io/) for more information about [RKE Government](https://docs.rke2.io/).

@@ -49,20 +49,10 @@ Supported Operating Systems:

System requirements
-------------------

Deployment environment must have Ansible 2.9.0+

Server and agent nodes must have passwordless SSH access

Usage
-----

This playbook requires ansible.utils to run properly. Please see https://docs.ansible.com/ansible/latest/galaxy/user_guide.html#installing-a-collection-from-galaxy for more information about how to install this.

```
ansible-galaxy collection install -r requirements.yml
```

Create a new directory based on the `sample` directory within the `inventory` directory:

```bash
@@ -94,32 +84,28 @@ Start provisioning of the cluster using the following command:

```bash
ansible-playbook site.yml -i inventory/my-cluster/hosts.yml
```

More detailed information can be found [here](./docs/README.md)

Tarball Install/Air-Gap Install
-------------------------------
Added the needed files to the [tarball_install](tarball_install/) directory.

Further info can be found [here](tarball_install/README.md)
Tarball Install/Air-Gap Install
-------------------------------
Air-Gap/Tarball install information can be found [here](./docs/tarball_install.md)


Kubeconfig
----------
The root user will have the `kubeconfig` and `kubectl` made available. To access your cluster, log in to any server node; `kubectl` will be available for use immediately.

To access your **Kubernetes** cluster, run:

```bash
ssh ec2-user@rke2_kubernetes_api_server_host "sudo /var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml get nodes"
```

Available configurations
------------------------

Variables should be set in `inventory/cluster/group_vars/rke2_agents.yml` and `inventory/cluster/group_vars/rke2_servers.yml`. See sample variables in `inventory/sample/group_vars` for reference.


Uninstall RKE2
---------------
Note: Uninstalling RKE2 deletes the cluster data and all of the scripts.
The official documentation for fully uninstalling the RKE2 cluster can be found in the [RKE2 Documentation](https://docs.rke2.io/install/uninstall/).

2 changes: 2 additions & 0 deletions changelogs/changelog.yml
@@ -0,0 +1,2 @@
---
releases: {}
252 changes: 252 additions & 0 deletions docs/README.md
@@ -0,0 +1,252 @@
# Table of Contents
- [Table of Contents](#table-of-contents)
- [Basic Usage](#basic-usage)
- [Cloning](#cloning)
- [Importing](#importing)
- [Defining Your Cluster](#defining-your-cluster)
- [Minimal Cluster Inventory](#minimal-cluster-inventory)
- [Structuring Your Variable Files](#structuring-your-variable-files)
- [Enabling SELinux](#enabling-selinux)
- [Enabling CIS Modes](#enabling-cis-modes)
- [Special Variables](#special-variables)
- [RKE2 Config Variables](#rke2-config-variables)
- [Defining a PSA Config](#defining-a-psa-config)
- [Example](#example)
- [Defining an Audit Policy](#defining-an-audit-policy)
- [Example](#example-1)
- [Adding Additional Cluster Manifests](#adding-additional-cluster-manifests)
- [Pre-Deploy Example](#pre-deploy-example)
- [Post-Deploy Example](#post-deploy-example)
- [rke2\_install\_version](#rke2_install_version)
- [Examples](#examples)

# Basic Usage
There are two methods for consuming this repository: clone the repository and edit it as necessary, or import it as a collection. Both options are detailed below.

> [!NOTE]
> If you are looking for airgap or tarball installation instructions, please go [here](./tarball_install.md)

## Cloning
The simplest method for using this repository (as detailed in the main README.md) is to simply clone the repository and copy the sample inventory.


## Importing
The second method for using this project is to import it as a collection in your own `requirements.yaml`, as this repository contains a `galaxy.yaml`. To import it, add the following to your `requirements.yaml`:
```yaml
collections:
  - name: rancherfederal.rke2-ansible
    source: git@github.com:rancherfederal/rke2-ansible.git
    type: git
    version: main
```
Then you can call the RKE2 role in a play like so:
```yaml
---
- name: RKE2 play
  hosts: all
  any_errors_fatal: True
  roles:
    - role: rancherfederal.rke2_ansible.rke2
```


# Defining Your Cluster
This repository is not intended to be opinionated. As a result, it is important that you have read and understood the [RKE2 docs](https://docs.rke2.io/) before moving forward. This documentation is not intended to be an exhaustive explanation of all possible RKE2 configuration options; it is up to the end user to ensure their options are valid.


## Minimal Cluster Inventory
The most basic inventory file contains nothing more than your hosts, see below:
```yaml
---
rke2_cluster:
  children:
    rke2_servers:
      hosts:
        server0.example.com:
    rke2_agents:
      hosts:
        agent0.example.com:
```
This is the simplest possible inventory file and will deploy the latest available version of RKE2 with only default settings.


## Structuring Your Variable Files
Configurations and variables can become lengthy and unwieldy. As a general note of advice, it is best to move variables into a `group_vars` folder.
```
./inventory
├── Cluster_A
│   ├── group_vars
│   │   ├── all.yml
│   │   ├── rke2_agents.yml
│   │   └── rke2_servers.yml
│   └── hosts.yml
└── Cluster_B
    ├── group_vars
    │   ├── all.yml
    │   ├── rke2_agents.yml
    │   └── rke2_servers.yml
    └── hosts.yml

5 directories, 8 files
```


## Enabling SELinux
Enabling SELinux in the playbook requires `selinux: true` to be set in the cluster, group, or host level config profile (please see [Special Variables](#special-variables) for more info). Generally this should be set at the cluster level, which can be done like so:


Should the sample inventory contain an "all" example? If I wanted to disable selinux on all hosts, what file can be used to disable it and where should it be (if I'm unfamiliar with cluster, group, or host level config profiles).

Contributor Author

@Daemonslayer2048 Daemonslayer2048 Jan 8, 2025


> Should the sample inventory contain an "all" example?

I am not sure what you mean by this; there is an "advanced_sample_inventory" folder which contains all the configuration values used in this README file.

> If I wanted to disable selinux on all hosts, what file can be used to disable it and where should it be (if I'm unfamiliar with cluster, group, or host level config profiles).

If you don't want SELinux enabled on RKE2, simply do not include it as a variable; it is off by default. If you are unfamiliar with cluster, group, or host level config profiles, you should see the "Special Variables" section that is linked right after they are referenced.

```yaml
---
all:
  vars:
    cluster_rke2_config:
      selinux: true
```
For more information please see the RKE2 documentation, [here](https://docs.rke2.io/security/selinux).
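
The same cluster-level setting can presumably live in a group vars file instead of inline in the inventory. The sketch below mirrors the content of `docs/advanced_sample_inventory/group_vars/all.yaml` from this change; the `inventory/my-cluster/` location in the comment is only an illustrative assumption.

```yaml
---
# inventory/my-cluster/group_vars/all.yaml (illustrative path, adjust to your inventory)
cluster_rke2_config:
  # Applies to every node in the cluster
  selinux: true
```

Leaving the `selinux` key out entirely keeps SELinux support off, since it is off by default.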


## Enabling CIS Modes
Enabling the CIS tasks in the playbook requires a CIS profile to be added to the Ansible variables file. This can be placed in either the cluster or group level config profile (please see [Special Variables](#special-variables) for more info). In the example below, the CIS profile is set at the group level, which ensures all server nodes run the CIS hardening profile tasks.
```yaml
rke2_cluster:
  children:
    rke2_servers:
      vars:
        group_rke2_config:
          profile: cis
```
For more information please see the RKE2 documentation, [here](https://docs.rke2.io/security/hardening_guide).


## Special Variables
In general this repository has attempted to move away from special or "magic" variables; however, some are unavoidable. The (non-exhaustive) list of variables is below:
- `all.vars.rke2_install_version`: This defines what version of RKE2 to install
- `rke2_cluster.children.rke2_servers.vars.hosts.<host>.node_labels`: Defines a list of node labels for a specific server node
- `rke2_cluster.children.rke2_agents.vars.hosts.<host>.node_labels`: Defines a list of node labels for a specific agent node
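
As a rough sketch of where these variables might sit in an inventory, the example below sets the install version cluster-wide and attaches labels to one agent. The placement of `node_labels` directly under the host entry and the `key=value` string format are assumptions, and the label itself is hypothetical; check the sample inventories before relying on it.

```yaml
---
all:
  vars:
    # Version string taken from the advanced sample inventory in this change
    rke2_install_version: v1.29.12+rke2r1
rke2_cluster:
  children:
    rke2_servers:
      hosts:
        server0.example.com:
    rke2_agents:
      hosts:
        agent0.example.com:
          # Assumed format: a list of key=value strings (hypothetical label)
          node_labels:
            - agent0Key=agent0Value
```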


### RKE2 Config Variables
RKE2 config variables can be placed at three levels: `cluster_rke2_config`, `group_rke2_config`, and `host_rke2_config`.
- `all.vars.cluster_rke2_config`: Defines common RKE2 config options for the whole cluster
- `rke2_cluster.children.rke2_servers.vars.group_rke2_config`: Defines common RKE2 config options for the `rke2_servers` group
- `rke2_cluster.children.rke2_agents.vars.group_rke2_config`: Defines common RKE2 config options for the `rke2_agents` group
- `rke2_cluster.children.rke2_servers.vars.hosts.<host>.host_rke2_config`: Defines RKE2 config options for a specific server node
- `rke2_cluster.children.rke2_agents.vars.hosts.<host>.host_rke2_config`: Defines RKE2 config options for a specific agent node

> [!NOTE]
> Throughout the rest of these docs you may see references to `rke2_servers.yaml`; this is the group vars file for `rke2_servers` and is functionally equivalent to `rke2_cluster.children.rke2_servers.vars`. Likewise, references to `rke2_agents.yaml` are functionally equivalent to `rke2_cluster.children.rke2_agents.vars`.

It is important to understand these variables here are not special in the sense that they enable or disable certain functions in the RKE2 role, with one notable exception being the `profile` key. These variables are special in the sense that they will be condensed into a single config file on each node. Each node will end up with a merged config file comprised of `cluster_rke2_config`, `group_rke2_config`, and `host_rke2_config`.
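
To make the three levels concrete, here is a minimal sketch of a single inventory that sets one option at each level. The placement of `host_rke2_config` directly under the host entry is an assumption, and the label value is hypothetical; `selinux`, `profile`, and `node-label` are ordinary RKE2 config options.

```yaml
---
all:
  vars:
    # Cluster level: applies to every node
    cluster_rke2_config:
      selinux: true
rke2_cluster:
  children:
    rke2_servers:
      vars:
        # Group level: applies to all server nodes
        group_rke2_config:
          profile: cis
      hosts:
        server0.example.com:
          # Host level: applies to this node only (assumed placement)
          host_rke2_config:
            node-label:
              - "example.com/role=control-plane"
    rke2_agents:
      hosts:
        agent0.example.com:
```

Under this sketch, `server0.example.com` would be expected to end up with a merged config containing `selinux`, `profile`, and `node-label`, while `agent0.example.com` would only receive `selinux`. Which level wins when the same key is set more than once is not stated here, so avoid relying on a particular precedence without checking the role.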

### Defining a PSA Config
In order to define a PSA config, server nodes need the `rke2_pod_security_admission_config_file_path` variable defined. The `pod-security-admission-config-file` option then needs to be defined in the rke2_config variable at the relevant level (please see [RKE2 Config Variables](#rke2-config-variables)).

#### Example
Below is an example of how this can be defined at the server group level (`rke2_cluster.children.rke2_servers.vars`):

__rke2_servers.yaml:__
```yaml
---
rke2_pod_security_admission_config_file_path: "{{ playbook_dir }}/docs/advanced_sample_inventory/files/pod-security-admission-config.yaml"
group_rke2_config:
  pod-security-admission-config-file: /etc/rancher/rke2/pod-security-admission-config.yaml
```


### Defining an Audit Policy
In order to define an audit policy config, server nodes need the `rke2_audit_policy_config_file_path` variable defined. The `audit-policy-file` option then needs to be defined in the rke2_config variable at the relevant level (please see [RKE2 Config Variables](#rke2-config-variables)).

#### Example
Below is an example of how this can be defined at the server group level (`rke2_cluster.children.rke2_servers.vars`):

__rke2_servers.yaml:__
```yaml
rke2_audit_policy_config_file_path: "{{ playbook_dir }}/docs/advanced_sample_inventory/files/audit-policy.yaml"
group_rke2_config:
  audit-policy-file: /etc/rancher/rke2/audit-policy.yaml
  kube-apiserver-arg:
    - audit-policy-file=/etc/rancher/rke2/audit-policy.yaml
    - audit-log-path=/var/lib/rancher/rke2/server/logs/audit.log
```


### Adding Additional Cluster Manifests
If you have a cluster that needs extra manifests to be deployed, or a critical component to be configured, RKE2's "HelmChartConfig" is an available option (among others). The Ansible repository supports the use of these configuration files: simply place them in a folder and give Ansible the path to the folder. Ansible will enumerate the files and place them on the first server node.

There are two variables that control the deployment of manifests to the server nodes:
- `rke2_manifest_config_directory`
- `rke2_manifest_config_post_run_directory`

The first variable is used to deploy manifests to the server nodes before starting the RKE2 server process; this ensures critical components (like the CNI) can be configured when the RKE2 server process starts. The second ensures applications are deployed after the RKE2 server process starts. There are examples of both below.

#### Pre-Deploy Example
The example below configures Cilium with the kube-proxy replacement enabled, a fairly common use case:

> [!WARNING]
> If this option is used you must provide a `become` password, and it must be the password for the local host running the Ansible playbook. The playbook looks for this directory on localhost and will run as root. This imposes some limitations: if you are using an SSH password to log in to remote systems (typical for STIG'd clusters), the `become` password must be the same for the cluster nodes AND localhost.

__rke2_servers.yaml:__
For this example to work, kube-proxy needs to be disabled and the Cilium CNI needs to be enabled.
```yaml
rke2_manifest_config_directory: "{{ playbook_dir }}/docs/advanced_sample_inventory/pre-deploy-manifests/"
group_rke2_config:
  # Use Cilium as the CNI
  cni:
    - cilium
  # Cilium will replace this
  disable-kube-proxy: true
```

__cilium.yaml:__
This file should be placed in the directory you intend to upload to the server node; in the example above that is `{{ playbook_dir }}/docs/advanced_sample_inventory/pre-deploy-manifests/`.
```yaml
---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-cilium
  namespace: kube-system
spec:
  valuesContent: |-
    kubeProxyReplacement: true
    k8sServiceHost: 127.0.0.1
    k8sServicePort: 6443
    bpf:
      masquerade: true
      preallocateMaps: true
      tproxy: true
    bpfClockProbe: true
```

#### Post-Deploy Example
In the example below cert-manager is auto deployed after the RKE2 server process is started.
__rke2_servers.yaml:__
```yaml
rke2_manifest_config_post_run_directory: "{{ playbook_dir }}/docs/advanced_sample_inventory/post-deploy-manifests/"
```

This file should be placed in the directory you intend to upload to the server node; in the example above that is `{{ playbook_dir }}/docs/advanced_sample_inventory/post-deploy-manifests/`.
__cert-manager.yaml__
```yaml
---
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: jetstack
  namespace: kube-system
spec:
  repo: https://charts.jetstack.io
  chart: cert-manager
  version: v1.16.2
  targetNamespace: cert-manager
  createNamespace: true
  valuesContent: |-
    crds:
      enabled: true
```


### rke2_install_version
A version of RKE2 can be selected for installation via the `all.vars.rke2_install_version` variable.
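
For instance, a version could be pinned in the cluster-wide group vars file. The value below is the one used in `docs/advanced_sample_inventory/group_vars/all.yaml` from this change; substitute whichever RKE2 release you need.

```yaml
---
# group_vars/all.yaml
# Pin the RKE2 release installed on every node
rke2_install_version: v1.29.12+rke2r1
```

If the variable is left unset, the latest available version of RKE2 is deployed, as noted for the minimal inventory.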


# Examples
There are two examples provided in this folder: "basic_sample_inventory" and "advanced_sample_inventory". The basic example is the simplest possible configuration; the advanced example combines all of the options explained above.
@@ -1,8 +1,3 @@
# This sample list was generated from:
# https://ranchermanager.docs.rancher.com/how-to-guides/new-user-guides/authentication-permissions-and-global-configuration/psa-config-templates#exempting-required-rancher-namespaces
# For security reasons, this list should be as concise as possible
# only include active namespaces that need to be exempt from a restricted profile.

---
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
4 changes: 4 additions & 0 deletions docs/advanced_sample_inventory/group_vars/all.yaml
@@ -0,0 +1,4 @@
---
rke2_install_version: v1.29.12+rke2r1
cluster_rke2_config:
  selinux: true
18 changes: 18 additions & 0 deletions docs/advanced_sample_inventory/group_vars/rke2_servers.yaml
@@ -0,0 +1,18 @@
---
rke2_pod_security_admission_config_file_path: "{{ playbook_dir }}/docs/advanced_sample_inventory/files/pod-security-admission-config.yaml"

Check warning on line 2 in docs/advanced_sample_inventory/group_vars/rke2_servers.yaml

View workflow job for this annotation

GitHub Actions / Lint for PR

2:121 [line-length] line too long (138 > 120 characters)

Check warning on line 2 in docs/advanced_sample_inventory/group_vars/rke2_servers.yaml

View workflow job for this annotation

GitHub Actions / Lint for push

2:121 [line-length] line too long (138 > 120 characters)

rke2_audit_policy_config_file_path: "{{ playbook_dir }}/docs/advanced_sample_inventory/files/audit-policy.yaml"
rke2_manifest_config_directory: "{{ playbook_dir }}/docs/advanced_sample_inventory/pre-deploy-manifests/"
rke2_manifest_config_post_run_directory: "{{ playbook_dir }}/docs/advanced_sample_inventory/post-deploy-manifests/"

group_rke2_config:
  # Use Cilium as the CNI
  cni:
    - cilium
  # Cilium will replace this
  disable-kube-proxy: true
  profile: cis
  pod-security-admission-config-file: /etc/rancher/rke2/pod-security-admission-config.yaml
  audit-policy-file: /etc/rancher/rke2/audit-policy.yaml
  kube-apiserver-arg:
    - audit-policy-file=/etc/rancher/rke2/audit-policy.yaml
    - audit-log-path=/var/lib/rancher/rke2/server/logs/audit.log
9 changes: 9 additions & 0 deletions docs/advanced_sample_inventory/hosts.yml
@@ -0,0 +1,9 @@
---
rke2_cluster:
  children:
    rke2_servers:
      hosts:
        server0.example.com:
    rke2_agents:
      hosts:
        agent0.example.com:
@@ -0,0 +1,15 @@
---
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: jetstack
  namespace: kube-system
spec:
  repo: https://charts.jetstack.io
  chart: cert-manager
  version: v1.16.2
  targetNamespace: cert-manager
  createNamespace: true
  valuesContent: |-
    crds:
      enabled: true