From 6bda8e3a8b7a2d817e3ecee9e275cbafca2bd88d Mon Sep 17 00:00:00 2001
From: Brice Fernandes
Date: Thu, 21 Jan 2021 11:12:29 +0000
Subject: [PATCH 01/19] Define the GitOps Principles

This commit proposes a succinct definition of the GitOps principles with additional clarification.

Co-authored-by: Scott Rigby
Signed-off-by: Brice Fernandes
---
 GLOSSARY.md   |  21 +++++
 PRINCIPLES.md | 254 ++++++++++++++++++++++++++++++++++++++++++++++++++
 RATIONALE.md  |  49 ++++++++++
 3 files changed, 324 insertions(+)
 create mode 100644 GLOSSARY.md
 create mode 100644 PRINCIPLES.md
 create mode 100644 RATIONALE.md

diff --git a/GLOSSARY.md b/GLOSSARY.md
new file mode 100644
index 0000000..8384b84
--- /dev/null
+++ b/GLOSSARY.md
@@ -0,0 +1,21 @@
+# Glossary
+
+- **Desired state**
+
+  The aggregate of all configuration data for a system form its "Desired State". The "Desired State" of a system is defined as data sufficient to recreate the system from nothing so that instances of the system are behaviorally indinstinguishable.
+
+- **State Store**
+
+  A system for storing versioned, immutable Desired States that provides access control and auditing on the chnages to the Desired State. Git may be configured as a State Store, but special precautions must be taken.
+
+- **Reconciliation**
+
+  A continuously running attempt to reconcile _Definition_ of desired state with the current state.
+
+- **Software System**
+
+  One or more Runtime environments consisting of resources under management
+  In each Runtime, management Agents to act on resources according to security policies.
+  One or more software Repositories for storing deployable artifacts that may be loaded into the runtime environments, eg. configuration files, code, binaries and packages.
+  One or more Administrators who are responsible for operating the runtime environments ie. installing, starting, stopping and updating software, code, configuration, etc.
+  A set of policies controlling access and management of repos, deployment, runtimes.
diff --git a/PRINCIPLES.md b/PRINCIPLES.md
new file mode 100644
index 0000000..98646cf
--- /dev/null
+++ b/PRINCIPLES.md
@@ -0,0 +1,254 @@
+
+# GitOps Principles v0.1.0
+
+## ⚠️ THIS DOCUMENT IS A WORK IN PROGRESS AND SUBJECT TO PUBLIC REVIEW BEFORE PUBLICATION ⚠️
+
+This document provides a concrete definition of GitOps principles.
+
+The GitOps Principles are vendor and implementation neutral, and aim to provide a common understanding of GitOps systems, to enable a common framework of understanding beyond individual opinion. A second aim is to encourage innovation by clarifying the technical outcomes rather than the code, tests, or organizational elements needed to achieve them.
+
+## Table of Content
+
+
+- [Summary](#summary)
+- [Introduction](#introduction)
+- [Scope](#scope)
+- [The GitOps Principles](#the-gitops-principles)
+  - [1. Declarative configuration](#1-declarative-configuration)
+    - [What is a system's Desired State?](#what-is-a-systems-desired-state)
+    - [Why must the Desired State be declarative data?](#why-must-the-desired-state-be-declarative-data)
+    - [Why is human readability required?](#why-is-human-readability-required)
+    - [How much of a system must be declared?](#how-much-of-a-system-must-be-declared)
+  - [2. Immutable configuration versions](#2-immutable-configuration-versions)
+    - [What forms a version?](#what-forms-a-version)
+  - [3. Continuous state reconciliation](#3-continuous-state-reconciliation)
+  - [4. Operations through declaration](#4-operations-through-declaration)
+- [See Also](#see-also)
+
+## Summary
+
+GitOps is a set of principles for operating and managing software systems.
+
+When using GitOps, the desired state of a system or subsystem is defined declaratively as versioned, immutable data, and the running system's configuration is continuously derived from this data.
+
+GitOps principles were derived from modern software operations but are rooted in pre-existing and widely adopted best practices. These principles are:
+
+1. [**The principle of declarative configuration**](#1-declarative-configuration)
+
+   A system managed by GitOps must have its _Desired State_ expressed declaritively as data.
+   This declarative data must be in a format writable and readable by both humans and software.
+
+2. [**The principle of immutable configuration versions**](#2-immutable-configuration-versions)
+
+   _Desired State_ is stored in a way that supports versioning, immutability of versions, and retains a complete version history.
+   We call systems that store desired state in this way _State Stores_.
+
+3. [**The principle of continuous state reconciliation**](#3-continuous-state-reconciliation)
+
+   Software agents continuously, and automatically, compare a system's _Actual State_ to its _Desired State_.
+   If the actual and desired states differ, automated actions are immediately attempted to reconcile them.
+   These differences could be due to the actual state drifting from the desired state, or the desired state changing intentionally.
+
+4. [**The principle of operations through declaration**](#4-operations-through-declaration)
+
+   When wishing to operate on a software system, a human or software agent will not interact with the running system and modify it directly.
+   Instead, the agent will create a new declarative version of the desired state in the state store.
+
+## Introduction
+
+The software systems that we manage vary widely; from battery powered embedded systems driven by microcontrollers, to globally distributed applications with millions of users.
+Most systems are composed of subsystems themselves made of software.
+It is impossible to comprehensively outline all the specific practices for managing such a variety of systems.
+
+However, despite the myriad differences, several important principles emerge that simplify the task of reliably managing and operating _all_ software systems at scale. GitOps attempts to capture some of these principles in a coherent framework.
+The GitOps principles can be applied to managing entire software systems or applied only to parts of larger systems.
+
+See also the [Rationale for GitOps](RATIONALE.md).
+
+## Scope
+
+GitOps concerns the verifiable behaviour of computer systems and their interfaces.
+Specifically, GitOps is _not_ about human processes, and is not intended as a model for judging human organisational designs and operational practices.
+
+The GitOps principles are to be used as guiding principles in the development of modern software and system operations. They do not form a concrete specification.
+
+The GitOps principles are a direction, not a destination. They should be applied pragmatically. For example, whilst desirable to apply them strictly to an entire software systems, they can also be applied loosely to selected parts of larger systems as part of a progressive adoption.
+
+## The GitOps Principles
+
+### 1. Declarative configuration
+
+
+A system managed by GitOps must have its Desired State expressed declaritively as data.
+This declarative data must be in a format writable and readable by both humans and software.
+
+
+#### What is a system's Desired State?
+
+_Configuration_ is a common feature of most software systems.
+By "Configuration", we mean _data that defines how a system or subsystem will behave and operate_.
+This configuration data is distinct and separate from the data a system will process.
+
+For example, the same web server code may be running on thousands of different servers managed by hundreds of different companies.
+The behaviour of an individual webserver will differ based on how it is configured.
+Configuration data is typically in the form of files or arguments to a computer program, but some systems may also currently use configuration databases or remote configuration services. Configuration also includes data about what version of code a software system should run, so software version information is also considered configuration.
+
+Together, the aggregate of all configuration data for a system form its "Desired State". The "Desired State" of a system is defined as data sufficient to recreate the system from nothing so that instances of the system are behaviorally indinstinguishable.
+
+#### Why must the Desired State be declarative data?
+
+This is a subtle but important point.
+In deference to the work done by the Infrastructure as Code community (IaC), we believe that this was the intent of that movement to begin with.
+However, we have in practice seen a misunderstanding in this area, and many implementations have considered imperative scripts or programatic definitions for provisioning infrastructure to be a sufficient implementation of Infrastructure as Code. We disagree.
+
+We make the distinction explicitly for two important reasons:
+
+Firstly, it forces a separation of concerns between the Desired State (_what_ a system is) and _how_ the system is made to reach that state.
+This modular approach enables the implementation details of operations (the _how_) to be separated and iterated on independently from the system configuration (the _what_).
+It also enables different tools and implementations to use the same Desired State declaration and interoperate against a common data language.
+These modular systems are more flexible.
+For example, a Python program could verify a configuration file, while a C++ executable actually implements the declaration into a running system. Encoding the Desired State with a programming language ties the implementation to it, or forces other components to create a fully featured interpreter.
+
+Secondly, verifying the correctness and self-consistency of data is significantly less complex than verifying the correctness of a program's behaviour, which is fundamentally undecidable.
+Verifying that a set of declarations is correct, however, _is_ decidable, even if sometimes computationally expensive.
+As a general rule, this implies that the data language used to defined the Desired State should have no control flow structure, and consist exclusively of referentially transparent expressions without side effects.
+In other words, a human-readable data-serialization format.
+
+It is also preferrable to use a widely supported language to define the Desired State for the sake of interoperability. For example, YAML, XML, JSON and TOML all have broad support and well defined specifications, although there are many other suitable candidates.
+
+#### Why is human readability required?
+
+Operationally, it has been proven time and time again that the canonical Desired State of a system should be human-readable and writable.
+
+The Desired State encodes the _intent_ of a system. For example, where internet traffic should be routed. The collection of decisions about a system captured in its Desired State are of particular interest to its human operators. They must be able to directly describe their intent about the state of a system; and similarly, must be able to decipher the intended state of a system from its configuration.
+
+This principles not only capture a requirement about the format of the configuration, but also the qualitative readability of configuration _in practice_. For example, a simple YAML or XML file can be easily read, understood and modified by human operators, but experience has taught us that these formats also allow the creation of complex self-referential documents beyond the ability of most humans to interpret. Such complex document would violate this principles, even though the formats are, on the surface, human-readable.
+
+This relative readability also implies that the users are relevant when evaluating whether a system follows this principle. For example, a Desired State defined with a rich grammar of S-Expressions would suite a team of developers with a background in functional programming, but would violate this principle if the majority of the team found the format incomprehensible.
+
+Having a human-readable Desired State does not in any way preclude the use of rich tooling or graphical interfaces that facilitate Desired State generation and interpretation. It only require that the canonical source of truth be human readable and writable plain text.
+
+#### How much of a system must be declared?
+
+Ideally, all of it; and the entire system can be recreated exclusively from its Desired State.
+
+That said, the definition of a "computer systems" can be quite broad and strays into human processes and systems as well.
+For example, is a company's sales process a "computer system"?
+Likely in parts.
+Should GitOps be applied to the entire sales process including the human processes (without prejudice as to what those processes are)?
+Whilst such a vision is compelling, a more pragmatic approach is preferrable, otherwise we risk trying to boil the ocean.
+
+Instead, we should focus on subsystems where the Desired State is well defined, implement the GitOps principles there, and grow out to capture more systems from that initial subsystem.
+
+### 2. Immutable configuration versions
+
+
+Desired State is stored in a way that supports versioning, immutability of versions, and retains a complete version history.
+We call systems that store Desired State in this way State Stores.
+
+
+#### What forms a version?
+
+A version is the Desired State for a system as a whole. It is the canonical form of what we desire the system to be at a point in time.
+
+It is insufficient to version part of the Desired State or to version these parts in separate State Stores. Real software systems often have overarching behaviour that is the result of coupling between components. If the Desired State of these components were to change independently, it would be difficult to map a change in observed behaviour of our system to a single change in Desired State. Being able to make this 1:1 mapping is operationally benefitial, as we can then map behavioural issues of our system directly to the changes that occured. The utility of having the entire system defined in a single canonical location grows in proportion to the complexity and internal coupling of the system. A web of references to configuration data located in different locations is undesirable, as it makes understanding the desired state particularly difficult.
+
+Versions should be uniquely named. This need not be a semantically meanigful name. It is sufficient that each new version is attributed a name that identifies it uniquely. Once a new version has been created, it should be immutable. By this we mena that it should be impossible to modify the relationship between a version's unique name and its value of the Desired State.
+
+All versions, except to very first, should also have reference a predecessor or parent, which is another uniquely named version. This enables us to retain a history of the changes.
+
+
+
+### 3. Continuous state reconciliation
+
+
+Software agents continuously, and automatically, compare a system's Actual State to its Desired State.
+If the actual and desired states differ, automated actions are immediately attempted to reconcile them.
+These differences could be due to the actual state drifting from the desired state, or the desired state changing intentionally.
+
+
+
+
+
+### 4. Operations through declaration
+
+
+When wishing to operate on a software system, a human or software agent will not interact with the running system and modify it directly.
+Instead, the agent will create a new declarative version of the desired state in the state store.
+
+
+
+
+
+## See Also
+
+- [The GitOps Glossary](GLOSSARY.md)
\ No newline at end of file
diff --git a/RATIONALE.md b/RATIONALE.md
new file mode 100644
index 0000000..d1bc5a0
--- /dev/null
+++ b/RATIONALE.md
@@ -0,0 +1,49 @@
+# Rationale for GitOps
+
+Currently, many software system's desired state is not defined separately from the running system.
+When we desire the behaviour of a software system to change, we modify the system's configuration directly, either through human action, or by running scripts that take a set of predetermined action on the system.
+
+This leads to several serious issues:
+
+- **Detecting drift from desired state**
+
+  If the desired state of a system is not explicitly defined, it is impossible to verify if the system in a correct state. The state of a running system itself does not provide sufficent information to determine its correctness.
+
+  Consider logging into an administration console and seeing that 28 machines are healthily running.
+  Is this good? Is this bad? That very much depends on what the desired number of machines is.
+  For example, these could be test machines that should have been deleted and are now incurring a significant cost for no reasons, or all that remains of a 100 machine datacenter.
+  We could consult the documentation, or expect the human operator to know, but by the time this occurs, our system has been in an incorrect state for a significant period of time. _The validity of the state of our system is not automatically verifiable_.
+
+  This problem leads to the conclusion that the canonical source of truth for desired state cannot be the actual state.
+  Making the desired state of a system explicit prevents this problem.
+
+- **Recovering from transitional states**
+
+  If we have no record of the desired state of our system, how can we recover from failures that occur in the transition between states?
+  Such transitions are very common lifecycle events, such as upgrades, new features being released or scaling our resources. These are where the majority of hard software defects and transient errors occur.
+  Not only are such failures extremely common, but their likelihood grows algebraically with the number of components in our system.
+
+  When these failures occur, they leave our system in an indeterminate transitional state. Going back to a known good state or forward to a new desired states is difficult, as we don't have a record of what these should be, only lists of instructions
+
+  We are left with the option of backing up our system state before a change so that we can restore a known good state if something goes wrong.
+  This doesn't solve the problem of how to change our system to move to a new desired state. The best we can do is restore and retry.
+  This becomes extremely common in complex distributed systems, where transient failures are common.
+  Such an approach is painful in practice and leads to an aversion to changing the system's state.
+
+  This problem can be solved by having software agents that continuously converge the system towards a well-defined state.
+
+- **Controlling and auditing actors, access and actions**
+
+  In most cases, we not only require the ability to change a running system safely, but we must also record what was changed, when, by whom, and why and enforce rules about which changes we allow.
+  If our systems are changed through direct access, the surface area of the interface to control and monitor can quickly become overwhelmingly large.
+
+  Access control at different levels, from the network to the application layer must be controlled and audited. A coherent set of access control rules must be applied across varied systems configured in completely different, and sometimes incompatible, ways.
+  An audit trail may or may not be required for regulatory or governance reasons, but it is such a common requirement of managing software systems that it must be also be addressed.
+
+  Having a single source of truth regarding the desired state of our system becomes a ledger of transactions between states and a single point of operation. This leads to a natural place to enforce rules regarding access and to audit changes.
+
+  Furthermore, by removing direct access, the credentials used to manage a system can be constrained to exist only within the security boundary of the system itself, which can poll its desired state. This greatly reduces the security surface area.
+
+These are only a small sample of the issues that arise if we do not define the desired state of our software systems explicitly and declaratively. We believe GitOps is a solution to these many issues that plague software operations at scale.
+
+Since these issues occur so universally when operating software systems, we also believe that GitOps is fundamentally agnostic about specific tools. that is, the GitOps principles are universally applicable, and independent of any particular tool, solution or practice, including Git itself, after which they are named.

From b696c3e63eafc7322f81d48e698ad5d38f8888fe Mon Sep 17 00:00:00 2001
From: Brice Fernandes
Date: Wed, 17 Feb 2021 14:01:52 +0000
Subject: [PATCH 02/19] Apply suggestions from code review

Thanks @tonit! Good catches all.

Co-authored-by: Toni Menzel
Co-authored-by: Leonardo Murillo
Signed-off-by: Scott Rigby
---
 GLOSSARY.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/GLOSSARY.md b/GLOSSARY.md
index 8384b84..a261e18 100644
--- a/GLOSSARY.md
+++ b/GLOSSARY.md
@@ -1,21 +1,21 @@
 # Glossary
 
-- **Desired state**
+- **Desired State**
 
-  The aggregate of all configuration data for a system form its "Desired State". The "Desired State" of a system is defined as data sufficient to recreate the system from nothing so that instances of the system are behaviorally indinstinguishable.
+  The aggregate of all configuration data for a system form its "Desired State". The "Desired State" of a system is defined as data sufficient to recreate the system so that instances of the system are behaviourally indistinguishable.
 
 - **State Store**
 
-  A system for storing versioned, immutable Desired States that provides access control and auditing on the chnages to the Desired State. Git may be configured as a State Store, but special precautions must be taken.
+  A system for storing versioned, immutable Desired States that provides access control and auditing on the changes to the Desired State. Git may be configured as a State Store, but special precautions must be taken.
 
 - **Reconciliation**
 
-  A continuously running attempt to reconcile _Definition_ of desired state with the current state.
+  The process by which the current state of a system is compared against and made consistent with the system's desired state as declared in the state store
 
 - **Software System**
 
-  One or more Runtime environments consisting of resources under management
+  One or more Runtime environments consisting of resources under management.
   In each Runtime, management Agents to act on resources according to security policies.
   One or more software Repositories for storing deployable artifacts that may be loaded into the runtime environments, eg. configuration files, code, binaries and packages.
   One or more Administrators who are responsible for operating the runtime environments ie. installing, starting, stopping and updating software, code, configuration, etc.
-  A set of policies controlling access and management of repos, deployment, runtimes.
+  A set of policies controlling access and management of repositories, deployments, runtimes.

From 7ecd4b9fe6db0a08e383bbd05c29fe787fad98f6 Mon Sep 17 00:00:00 2001
From: Brice Fernandes
Date: Wed, 17 Feb 2021 14:22:19 +0000
Subject: [PATCH 03/19] Apply suggestions from code review

Thanks @murillodigital!

Co-authored-by: Leonardo Murillo
Signed-off-by: Scott Rigby
---
 RATIONALE.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/RATIONALE.md b/RATIONALE.md
index d1bc5a0..ca091e2 100644
--- a/RATIONALE.md
+++ b/RATIONALE.md
@@ -7,7 +7,7 @@ This leads to several serious issues:
 
 - **Detecting drift from desired state**
 
-  If the desired state of a system is not explicitly defined, it is impossible to verify if the system in a correct state. The state of a running system itself does not provide sufficent information to determine its correctness.
+  If the desired state of a system is not explicitly defined, it is impossible to verify if the system in a correct state. The state of a running system itself does not provide sufficient information to determine its correctness.
 
   Consider logging into an administration console and seeing that 28 machines are healthily running.
   Is this good? Is this bad? That very much depends on what the desired number of machines is.
@@ -27,7 +27,7 @@ This leads to several serious issues:
 
   We are left with the option of backing up our system state before a change so that we can restore a known good state if something goes wrong.
   This doesn't solve the problem of how to change our system to move to a new desired state. The best we can do is restore and retry.
-  This becomes extremely common in complex distributed systems, where transient failures are common.
+  This becomes extremely common in complex distributed systems, where transient failures are usual.
   Such an approach is painful in practice and leads to an aversion to changing the system's state.
 
   This problem can be solved by having software agents that continuously converge the system towards a well-defined state.
@@ -37,7 +37,7 @@ This leads to several serious issues:
   In most cases, we not only require the ability to change a running system safely, but we must also record what was changed, when, by whom, and why and enforce rules about which changes we allow.
   If our systems are changed through direct access, the surface area of the interface to control and monitor can quickly become overwhelmingly large.
 
-  Access control at different levels, from the network to the application layer must be controlled and audited. A coherent set of access control rules must be applied across varied systems configured in completely different, and sometimes incompatible, ways.
+  Access control at different levels, from the network to the application layer must be controlled and audited. A coherent set of access control rules must be applied across varied systems configured in completely different, and sometimes incompatible ways.
   An audit trail may or may not be required for regulatory or governance reasons, but it is such a common requirement of managing software systems that it must be also be addressed.
 
   Having a single source of truth regarding the desired state of our system becomes a ledger of transactions between states and a single point of operation. This leads to a natural place to enforce rules regarding access and to audit changes.

From 7d375706d49ea1c1ad2127523ac8b59e512a61d8 Mon Sep 17 00:00:00 2001
From: Brice Fernandes
Date: Wed, 17 Feb 2021 15:42:03 +0000
Subject: [PATCH 04/19] Apply suggestions from code review

Thank you!

Co-authored-by: Moshe Immerman
Co-authored-by: Brian Fox
Co-authored-by: Leonardo Murillo
Signed-off-by: Scott Rigby
---
 PRINCIPLES.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/PRINCIPLES.md b/PRINCIPLES.md
index 98646cf..4a00cfd 100644
--- a/PRINCIPLES.md
+++ b/PRINCIPLES.md
@@ -86,7 +86,7 @@ This declarative data must be in a format writable and readable by both humans a
 #### What is a system's Desired State?
 
 _Configuration_ is a common feature of most software systems.
-By "Configuration", we mean _data that defines how a system or subsystem will behave and operate_.
+By "Configuration", we mean _data that defines how a system or subsystem should behave_.
 This configuration data is distinct and separate from the data a system will process.
 
 For example, the same web server code may be running on thousands of different servers managed by hundreds of different companies.
@@ -124,7 +124,7 @@ The Desired State encodes the _intent_ of a system. For example, where internet
 
 This principles not only capture a requirement about the format of the configuration, but also the qualitative readability of configuration _in practice_. For example, a simple YAML or XML file can be easily read, understood and modified by human operators, but experience has taught us that these formats also allow the creation of complex self-referential documents beyond the ability of most humans to interpret. Such complex document would violate this principles, even though the formats are, on the surface, human-readable.
 
-This relative readability also implies that the users are relevant when evaluating whether a system follows this principle. For example, a Desired State defined with a rich grammar of S-Expressions would suite a team of developers with a background in functional programming, but would violate this principle if the majority of the team found the format incomprehensible.
+This relative readability also implies that the users are relevant when evaluating whether a system follows this principle. For example, the Desired State defined with a rich grammar of S-Expressions would suit a team of developers with a background in functional programming but would violate this principle if the majority of the team found the format incomprehensible.
 
 Having a human-readable Desired State does not in any way preclude the use of rich tooling or graphical interfaces that facilitate Desired State generation and interpretation. It only require that the canonical source of truth be human readable and writable plain text.
 
@@ -151,11 +151,11 @@ We call systems that store Desired State in this way State Stores.
 
 A version is the Desired State for a system as a whole. It is the canonical form of what we desire the system to be at a point in time.
 
-It is insufficient to version part of the Desired State or to version these parts in separate State Stores. Real software systems often have overarching behaviour that is the result of coupling between components. If the Desired State of these components were to change independently, it would be difficult to map a change in observed behaviour of our system to a single change in Desired State. Being able to make this 1:1 mapping is operationally benefitial, as we can then map behavioural issues of our system directly to the changes that occured. The utility of having the entire system defined in a single canonical location grows in proportion to the complexity and internal coupling of the system. A web of references to configuration data located in different locations is undesirable, as it makes understanding the desired state particularly difficult.
+It is insufficient to version part of the Desired State or to version these parts in separate State Stores. Real software systems often have overarching behaviour that is the result of coupling between components. If the Desired State of these components were to change independently, it would be difficult to map a change in observed behaviour of our system to a single change in Desired State. Being able to make this 1:1 mapping is operationally beneficial, as we can then map behavioural issues of our system directly to the changes that occured. The utility of having the entire system defined in a single canonical location grows in proportion to the complexity and internal coupling of the system. A web of references to configuration data located in different locations is undesirable, as it makes understanding the desired state particularly difficult.
 
-Versions should be uniquely named. This need not be a semantically meanigful name. It is sufficient that each new version is attributed a name that identifies it uniquely. Once a new version has been created, it should be immutable. By this we mena that it should be impossible to modify the relationship between a version's unique name and its value of the Desired State.
+Versions should be uniquely named. This need not be a semantically meaningful name. It is sufficient that each new version is attributed a name that identifies it uniquely. Once a new version has been created, it should be immutable. By this we mean that it should be impossible to modify the relationship between a version's unique name and its value of the Desired State.
 
-All versions, except to very first, should also have reference a predecessor or parent, which is another uniquely named version. This enables us to retain a history of the changes.
+All but the very first version should reference a predecessor or parent, which is another uniquely named version. This enables us to retain a history of the changes.
@@ -133,11 +135,10 @@ Having a human-readable Desired State does not in any way preclude the use of ri
 
 Ideally, all of it; and the entire system can be recreated exclusively from its Desired State.
 
-That said, the definition of a "computer systems" can be quite broad and strays into human processes and systems as well.
-For example, is a company's sales process a "computer system"?
-Likely in parts.
-Should GitOps be applied to the entire sales process including the human processes (without prejudice as to what those processes are)?
-Whilst such a vision is compelling, a more pragmatic approach is preferrable, otherwise we risk trying to boil the ocean. +The definition of a _system_ can be quite broad, and may incorporate human as well as programmatic processes. +For example, is a company's sales process a system? +Should GitOps be applied to it? +Although a vision in which the GitOps practices are applied generally to all processes, human or otherwise is compelling, a more pragmatic approach is preferable, to avoid the risk of attempting to "boil the ocean". Instead, we should focus on subsystems where the Desired State is well defined, implement the GitOps principles there, and grow out to capture more systems from that initial subsystem. @@ -152,7 +153,12 @@ We call systems that store Desired State in this way State Stores. A version is the Desired State for a system as a whole. It is the canonical form of what we desire the system to be at a point in time. -It is insufficient to version part of the Desired State or to version these parts in separate State Stores. Real software systems often have overarching behaviour that is the result of coupling between components. If the Desired State of these components were to change independently, it would be difficult to map a change in observed behaviour of our system to a single change in Desired State. Being able to make this 1:1 mapping is operationally beneficial, as we can then map behavioural issues of our system directly to the changes that occured. The utility of having the entire system defined in a single canonical location grows in proportion to the complexity and internal coupling of the system. A web of references to configuration data located in different locations is undesirable, as it makes understanding the desired state particularly difficult. +It is insufficient to version part of the Desired State or to version these parts in separate State Stores. 
+In practice, software systems often have overarching behaviour that is the result of coupling between components. +If the Desired State of these components were to change independently, it would be difficult to map a change in observed behaviour of our system to a single change in Desired State. +Being able to make this 1:1 mapping is operationally beneficial, as we can then map behavioural issues of our system directly to the changes that occured. +The utility of having the entire system defined in a single canonical location grows in proportion to the complexity and internal coupling of the system. +A web of references to configuration data located in different locations is undesirable, as it makes understanding the desired state particularly difficult. Versions should be uniquely named. This need not be a semantically meaningful name. It is sufficient that each new version is attributed a name that identifies it uniquely. Once a new version has been created, it should be immutable. By this we mean that it should be impossible to modify the relationship between a version's unique name and its value of the Desired State. From 47a1167595617d3bd497b177d82e19ffe72516b5 Mon Sep 17 00:00:00 2001 From: Brice Fernandes Date: Wed, 17 Feb 2021 15:58:00 +0000 Subject: [PATCH 10/19] Fix grammar and spelling Signed-off-by: Brice Fernandes --- PRINCIPLES.md | 2 +- RATIONALE.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/PRINCIPLES.md b/PRINCIPLES.md index 8a23c7e..2b8d141 100644 --- a/PRINCIPLES.md +++ b/PRINCIPLES.md @@ -7,7 +7,7 @@ This document provides a concrete definition of GitOps principles. The GitOps principles governs how humans and technical systems should interact in order to achieve the desired operational outcomes of repeatability, auditability, and visibility. -The GitOps Principles are vendor and implementation neutral, they aim to provide a common framework of understanding regarding software operating softwrae systems. 
+The GitOps Principles are vendor and implementation neutral, they aim to provide a common framework of understanding regarding software operations. ## Table of Content diff --git a/RATIONALE.md b/RATIONALE.md index a71028b..92a8395 100644 --- a/RATIONALE.md +++ b/RATIONALE.md @@ -21,7 +21,7 @@ This leads to several serious issues: If we have no record of the desired state of our system, how can we recover from failures that occur in the transition between states? Such transitions are very common lifecycle events, such as upgrades, new features being released or scaling our resources. These are where the majority of hard software defects and transient errors occur. - Not only are such failures extremely common, but their likelyhood grows quadratically with the number of components in our system, as they scale not with the number of components, but the number of connections between components, which can also fail independently of components. + Not only are such failures extremely common, but their likelihood grows quadratically with the number of components in our system, as failure may not only occur in each component, but also in the connections between components, which can fail independently. When these failures occur, they leave our system in an indeterminate transitional state. Going back to a known good state or forward to a new desired states is difficult, as we don't have a record of what these should be, only lists of instructions From 1a221534c8dc46d71a69347451b2a5f61043828c Mon Sep 17 00:00:00 2001 From: Brice Fernandes Date: Wed, 17 Feb 2021 16:06:01 +0000 Subject: [PATCH 11/19] Improve readability of principle 1 Signed-off-by: Brice Fernandes --- PRINCIPLES.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/PRINCIPLES.md b/PRINCIPLES.md index 2b8d141..5a7227a 100644 --- a/PRINCIPLES.md +++ b/PRINCIPLES.md @@ -37,8 +37,7 @@ GitOps principles were derived from modern software operations but are rooted in 1. 
[**The principle of declarative configuration**](#1-declarative-configuration) - A system managed by GitOps must have its _Desired State_ expressed declaritively as data. - This declarative data must be in a format writable and readable by both humans and software. + A system managed by GitOps must have its _Desired State_ expressed declaritively as data in a format writable and readable by both humans and machines. 2. [**The principle of immutable configuration versions**](#2-immutable-configuration-versions) @@ -82,8 +81,7 @@ The GitOps principles are a direction, not a destination. They should be applied ### 1. Declarative configuration
-A system managed by GitOps must have its Desired State expressed declaritively as data.
-This declarative data must be in a format writable and readable by both humans and software.
+A system managed by GitOps must have its _Desired State_ expressed declaritively as data in a format writable and readable by both humans and machines.
 #### What is a system's Desired State?

From 2bd8e0ab010fc3a27bdaea5a902b42be6429dbca Mon Sep 17 00:00:00 2001
From: Brice Fernandes
Date: Wed, 17 Feb 2021 16:35:16 +0000
Subject: [PATCH 12/19] Add references to Unix and Pragmatic Programmer for plain text

Signed-off-by: Brice Fernandes
---
 PRINCIPLES.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PRINCIPLES.md b/PRINCIPLES.md
index 5a7227a..b8cf946 100644
--- a/PRINCIPLES.md
+++ b/PRINCIPLES.md
@@ -119,7 +119,7 @@ It is also preferrable to use a widely supported language to define the Desired

 #### Why is human readability required?

-Operationally, it has been proven time and time again that the canonical Desired State of a system should be human-readable and writable.
+Operationally, it has been proven time and time again that the canonical Desired State of a system should be human-readable and writable. (See _The UNIX Philosophy (1995) by Mike Gancarz_ and _The Pragmatic Programmer (2000) Section 14 "The Power of Plain Text" by Andrew Hunt and David Thomas_)

 The Desired State encodes the _intent_ of a system. For example, where internet traffic should be routed. The collection of decisions about a system captured in its Desired State are of particular interest to its human operators. They must be able to directly describe their intent about the state of a system; and similarly, must be able to decipher the intended state of a system from its configuration.

From 0c9c513b489ad6c034a9ce6540897ef0b1d4b539 Mon Sep 17 00:00:00 2001 From: Brice Fernandes Date: Wed, 17 Feb 2021 17:02:16 +0000 Subject: [PATCH 13/19] Improve scope definition based on feedback Signed-off-by: Brice Fernandes --- PRINCIPLES.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/PRINCIPLES.md b/PRINCIPLES.md index b8cf946..1390f71 100644 --- a/PRINCIPLES.md +++ b/PRINCIPLES.md @@ -56,25 +56,27 @@ GitOps principles were derived from modern software operations but are rooted in ## Introduction -The systems that we manage vary widely; from battery-powered devices driven by microcontrollers to globally distributed systems with millions and sometimes billions of users. +The systems that we manage vary widely; from battery-powered devices driven by microcontrollers to globally distributed systems with millions and sometimes billions of users. -What is common across all systems is the need for human operators to make changes to the running state, it is during change that systems are most at risk of failure. +What is common across all systems is the need for human operators to make changes to the running state, it is during change that systems are most at risk of failure. GitOps is a framework in which all change is applied to a system in a consistent mechanism that both reduces risk and improves the situational awareness of human operators working inside the system. - GitOps can be applied to managing large systems or applied only to the smaller sub-system. See also the [Rationale for GitOps](RATIONALE.md). ## Scope -GitOps concerns the verifiable behaviour of computer systems and their interfaces. -Specifically, GitOps is _not_ about human processes, and is not intended as a model for judging human organisational designs and operational practices. +GitOps concerns the interaction between humans and technical systems, and between technical systems. 
+GitOps is not concerned with processes of human decision making or organisation, only how human decisions about technical systems are recorded and applied. +It is a structured process through which technical systems can be modified reliably. + +GitOps is _not_ intended as a model for judging human organisational designs. The GitOps principles are to be used as guiding principles in the development of modern software and system operations. They do not form a concrete specification. -The GitOps principles are a direction, not a destination. They should be applied pragmatically. For example, whilst desirable to apply them strictly to an entire software systems, they can also be applied loosely to selected parts of larger systems as part of a progressive adoption. +The GitOps principles are a _direction_, **not** a _destination_. They should be applied pragmatically. For example, whilst desirable to apply them strictly to an entire systems, they can also be applied selectively to sub-systems as part of a progressive adoption. ## The GitOps Principles From 8140f028c323596d8eba098c384329e6a6b228de Mon Sep 17 00:00:00 2001 From: Brice Fernandes Date: Wed, 17 Feb 2021 17:09:35 +0000 Subject: [PATCH 14/19] Remove ambiguous statement Signed-off-by: Brice Fernandes --- PRINCIPLES.md | 1 - 1 file changed, 1 deletion(-) diff --git a/PRINCIPLES.md b/PRINCIPLES.md index 1390f71..d8e1f8e 100644 --- a/PRINCIPLES.md +++ b/PRINCIPLES.md @@ -90,7 +90,6 @@ A system managed by GitOps must have its _Desired State_ expressed declaritively _Configuration_ is a common feature of most software systems. By "Configuration", we mean _data that defines how a system or subsystem should behave_. -This configuration data is distinct and separate from the data a system will process. For example, the same web server code may be running on thousands of different servers managed by hundreds of different companies. The behaviour of an individual webserver will differ based on how it is configured. 
From 092bb99786b5d7ef65f2b6b1630657a0ba954392 Mon Sep 17 00:00:00 2001 From: Scott Rigby Date: Wed, 12 May 2021 14:23:13 -0400 Subject: [PATCH 15/19] Edits with broad consensus by GitOps WG Principles Comittee Mar 29-May 05, 2021 From https://hackmd.io/arwvV8NUQX683uBM3HzyNQ?both. Minus the notes that were not intended to be added to the PR. Signed-off-by: Scott Rigby Co-authored-by: Andrew Block Co-authored-by: Shoubhik Bose Co-authored-by: Kevin Bowersox Co-authored-by: Jesse Butler Co-authored-by: Lloyd Chang Co-authored-by: William Chia Co-authored-by: Cornelia Davis Co-authored-by: Brice Fernandes Co-authored-by: Robert A Ficcaglia Co-authored-by: Brian Fox Co-authored-by: Christian Hernandez Co-authored-by: Moshe Immerman Co-authored-by: Timothy Lin Co-authored-by: Toni Menzel Co-authored-by: Leonardo Murillo Co-authored-by: John Pitman Co-authored-by: Scott Rigby Co-authored-by: Chris Sanders Co-authored-by: Carlos Santana Co-authored-by: Schlomo Schapiro Co-authored-by: Lothar Schulz Co-authored-by: Regina Scott Co-authored-by: Ishita Sequeira Co-authored-by: Roberth Strand Co-authored-by: Sean Sundberg --- GLOSSARY.md | 21 ---- PRINCIPLES.md | 259 ++++++-------------------------------------------- RATIONALE.md | 49 ---------- 3 files changed, 31 insertions(+), 298 deletions(-) delete mode 100644 GLOSSARY.md delete mode 100644 RATIONALE.md diff --git a/GLOSSARY.md b/GLOSSARY.md deleted file mode 100644 index 10db1bb..0000000 --- a/GLOSSARY.md +++ /dev/null @@ -1,21 +0,0 @@ -# Glossary - -- **Desired State** - - The aggregate of all configuration data for a system form its "Desired State". The "Desired State" of a system is defined as data sufficient to recreate the system so that instances of the system are behaviourally indistinguishable. - -- **State Store** - - A system for storing versioned, immutable Desired States that provides access control and auditing on the changes to the Desired State. 
Git may be configured as a State Store, but [special precautions must be taken](recipes/SETTING_UP_GIT.md). - -- **Reconciliation** - - The process by which the current state of a system is compared against and made consistent with the system's desired state as declared in the state store - -- **Software System** - - One or more Runtime environments consisting of resources under management. - In each Runtime, management Agents to act on resources according to security policies. - One or more software Repositories for storing deployable artifacts that may be loaded into the runtime environments, eg. configuration files, code, binaries and packages. - One or more Administrators who are responsible for operating the runtime environments ie. installing, starting, stopping and updating software, code, configuration, etc. - A set of policies controlling access and management of repositories, deployments, runtimes. diff --git a/PRINCIPLES.md b/PRINCIPLES.md index d8e1f8e..83e0051 100644 --- a/PRINCIPLES.md +++ b/PRINCIPLES.md @@ -1,259 +1,62 @@ - -# GitOps Principles v0.1.0 - -## ⚠️ THIS DOCUMENT IS A WORK IN PROGRESS AND SUBJECT TO PUBLIC REVIEW BEFORE PUBLICATION ⚠️ - -This document provides a concrete definition of GitOps principles. - -The GitOps principles governs how humans and technical systems should interact in order to achieve the desired operational outcomes of repeatability, auditability, and visibility. - -The GitOps Principles are vendor and implementation neutral, they aim to provide a common framework of understanding regarding software operations. - -## Table of Content - - -- [Summary](#summary) -- [Introduction](#introduction) -- [Scope](#scope) -- [The GitOps Principles](#the-gitops-principles) - - [1. 
Declarative configuration](#1-declarative-configuration) - - [What is a system's Desired State?](#what-is-a-systems-desired-state) - - [Why must the Desired State be declarative data?](#why-must-the-desired-state-be-declarative-data) - - [Why is human readability required?](#why-is-human-readability-required) - - [How much of a system must be declared?](#how-much-of-a-system-must-be-declared) - - [2. Immutable configuration versions](#2-immutable-configuration-versions) - - [What forms a version?](#what-forms-a-version) - - [3. Continuous state reconciliation](#3-continuous-state-reconciliation) - - [4. Operations through declaration](#4-operations-through-declaration) -- [See Also](#see-also) +# GitOps Principles v0.1.0 ## Summary GitOps is a set of principles for operating and managing software systems. -When using GitOps, the desired state of a system or subsystem is defined declaratively as versioned, immutable data, and the running system's configuration is continuously derived from this data. +When using GitOps, the _Desired State_ of a system or subsystem is defined declaratively as versioned, immutable data, and the running system's configuration is continuously derived from this data. -GitOps principles were derived from modern software operations but are rooted in pre-existing and widely adopted best practices. These principles are: +These principles were derived from modern software operations but are rooted in pre-existing and widely adopted best practices. -1. [**The principle of declarative configuration**](#1-declarative-configuration) +## Principles + +1. **The principle of declarative desired state** A system managed by GitOps must have its _Desired State_ expressed declaritively as data in a format writable and readable by both humans and machines. -2. [**The principle of immutable configuration versions**](#2-immutable-configuration-versions) +2. 
**The principle of immutable desired state versions** _Desired State_ is stored in a way that supports versioning, immutability of versions, and retains a complete version history. - We call systems that store desired state in this way _State Stores_. -3. [**The principle of continuous state reconciliation**](#3-continuous-state-reconciliation) +3. **The principle of continuous state reconciliation** Software agents continuously, and automatically, compare a system's _Actual State_ to its _Desired State_. - If the actual and desired states differ, automated actions are immediately attempted to reconcile them. - These differences could be due to the actual state drifting from the desired state, or the desired state changing intentionally. - -4. [**The principle of operations through declaration**](#4-operations-through-declaration) - - The mechanism through which change is applied to a system by either a human operator or another system is through the creation of a new declarative version of the desired state in the state store, not through directly interacting with the running system. - -## Introduction - -The systems that we manage vary widely; from battery-powered devices driven by microcontrollers to globally distributed systems with millions and sometimes billions of users. - -What is common across all systems is the need for human operators to make changes to the running state, it is during change that systems are most at risk of failure. - -GitOps is a framework in which all change is applied to a system in a consistent mechanism that both reduces risk and improves the situational awareness of human operators working inside the system. - -GitOps can be applied to managing large systems or applied only to the smaller sub-system. - -See also the [Rationale for GitOps](RATIONALE.md). - -## Scope - -GitOps concerns the interaction between humans and technical systems, and between technical systems. 
-GitOps is not concerned with processes of human decision making or organisation, only how human decisions about technical systems are recorded and applied. -It is a structured process through which technical systems can be modified reliably. - -GitOps is _not_ intended as a model for judging human organisational designs. - -The GitOps principles are to be used as guiding principles in the development of modern software and system operations. They do not form a concrete specification. - -The GitOps principles are a _direction_, **not** a _destination_. They should be applied pragmatically. For example, whilst desirable to apply them strictly to an entire systems, they can also be applied selectively to sub-systems as part of a progressive adoption. - -## The GitOps Principles - -### 1. Declarative configuration - -
-A system managed by GitOps must have its _Desired State_ expressed declaritively as data in a format writable and readable by both humans and machines. -
- -#### What is a system's Desired State? - -_Configuration_ is a common feature of most software systems. -By "Configuration", we mean _data that defines how a system or subsystem should behave_. - -For example, the same web server code may be running on thousands of different servers managed by hundreds of different companies. -The behaviour of an individual webserver will differ based on how it is configured. -Configuration data is typically in the form of files or arguments to a computer program, but some systems may also currently use configuration databases or remote configuration services. Configuration also includes data about what version of code a software system should run, so software version information is also considered configuration. - -Together, the aggregate of all configuration data for a system form its "Desired State". The "Desired State" of a system is defined as data sufficient to recreate the system from nothing so that instances of the system are behaviorally indinstinguishable. - -#### Why must the Desired State be declarative data? - -This is a subtle but important point. -In deference to the work done by the Infrastructure as Code community (IaC), we believe that this was the intent of that movement to begin with. -However, we have in practice seen a misunderstanding in this area, and many implementations have considered imperative scripts or programatic definitions for provisioning infrastructure to be a sufficient implementation of Infrastructure as Code. We disagree. - -We make the distinction explicitly for two important reasons: - -Firstly, it forces a separation of concerns between the Desired State (_what_ a system is) and _how_ the system is made to reach that state. -This modular approach enables the implementation details of operations (the _how_) to be separated and iterated on independently from the system configuration (the _what_). 
-It also enables different tools and implementations to use the same Desired State declaration and interoperate against a common data language. -These modular systems are more flexible. -For example, a Python program could verify a configuration file, while a C++ executable actually implements the declaration into a running system. Encoding the Desired State with a programming language ties the implementation to it, or forces other components to create a fully featured interpreter. - -Secondly, verifying the correctness and self-consistency of data is significantly less complex than verifying the correctness of a program's behaviour, which is fundamentally undecidable. -Verifying that a set of declarations is correct, however, _is_ decidable, even if sometimes computationally expensive. -As a general rule, this implies that the data language used to defined the Desired State should have no control flow structure, and consist exclusively of referentially transparent expressions without side effects. -In other words, a human-readable data-serialization format. - -It is also preferrable to use a widely supported language to define the Desired State for the sake of interoperability. For example, YAML, XML, JSON and TOML all have broad support and well defined specifications, although there are many other suitable candidates. - -#### Why is human readability required? - -Operationally, it has been proven time and time again that the canonical Desired State of a system should be human-readable and writable. (See _The UNIX Philosophy (1995) by Mike Gancarz_ and _The Pragmatic Programmer (2000) Section 14 "The Power of Plain Text" by Andrew Hunt and David Thomas_) - -The Desired State encodes the _intent_ of a system. For example, where internet traffic should be routed. The collection of decisions about a system captured in its Desired State are of particular interest to its human operators. 
They must be able to directly describe their intent about the state of a system; and similarly, must be able to decipher the intended state of a system from its configuration. - -This principles not only capture a requirement about the format of the configuration, but also the qualitative readability of configuration _in practice_. For example, a simple YAML or XML file can be easily read, understood and modified by human operators, but experience has taught us that these formats also allow the creation of complex self-referential documents beyond the ability of most humans to interpret. Such complex document would violate this principles, even though the formats are, on the surface, human-readable. - -This relative readability also implies that the users are relevant when evaluating whether a system follows this principle. For example, the Desired State defined with a rich grammar of S-Expressions would suit a team of developers with a background in functional programming but would violate this principle if the majority of the team found the format incomprehensible. - -Having a human-readable Desired State does not in any way preclude the use of rich tooling or graphical interfaces that facilitate Desired State generation and interpretation. It only require that the canonical source of truth be human readable and writable plain text. - -#### How much of a system must be declared? - -Ideally, all of it; and the entire system can be recreated exclusively from its Desired State. - -The definition of a _system_ can be quite broad, and may incorporate human as well as programmatic processes. -For example, is a company's sales process a system? -Should GitOps be applied to it? -Although a vision in which the GitOps practices are applied generally to all processes, human or otherwise is compelling, a more pragmatic approach is preferable, to avoid the risk of attempting to "boil the ocean". 
- -Instead, we should focus on subsystems where the Desired State is well defined, implement the GitOps principles there, and grow out to capture more systems from that initial subsystem. - -### 2. Immutable configuration versions - -
-Desired State is stored in a way that supports versioning, immutability of versions, and retains a complete version history.
-We call systems that store Desired State in this way State Stores.
- -#### What forms a version? - -A version is the Desired State for a system as a whole. It is the canonical form of what we desire the system to be at a point in time. - -It is insufficient to version part of the Desired State or to version these parts in separate State Stores. -In practice, software systems often have overarching behaviour that is the result of coupling between components. -If the Desired State of these components were to change independently, it would be difficult to map a change in observed behaviour of our system to a single change in Desired State. -Being able to make this 1:1 mapping is operationally beneficial, as we can then map behavioural issues of our system directly to the changes that occured. -The utility of having the entire system defined in a single canonical location grows in proportion to the complexity and internal coupling of the system. -A web of references to configuration data located in different locations is undesirable, as it makes understanding the desired state particularly difficult. - -Versions should be uniquely named. This need not be a semantically meaningful name. It is sufficient that each new version is attributed a name that identifies it uniquely. Once a new version has been created, it should be immutable. By this we mean that it should be impossible to modify the relationship between a version's unique name and its value of the Desired State. - -All but the very first version should reference a predecessor or parent, which is another uniquely named version. This enables us to retain a history of the changes. - - - -### 3. Continuous state reconciliation - -
-Software agents continuously, and automatically, compare a system's Actual State to its Desired State.
-If the actual and desired states differ, automated actions are immediately attempted to reconcile them.
-These differences could be due to the actual state drifting from the desired state, or the desired state changing intentionally.
+ If the actual and desired states differ for any reason, automated actions to reconcile them are initiated. - +4. **The principle of operations through declaration** + The only mechanism through which the system is intentionally operated on is through these principles. -### 4. Operations through declaration +## Notes -
-The mechanism through which change is applied to a system by either a human operator or another system is through the creation of a new declarative version of the desired state in the state store, not through directly interacting with the running system. -
- +### Principle 3 Notes - + One or more Runtime environments consisting of resources under management. + In each Runtime, management Agents to act on resources according to security policies. + One or more software Repositories for storing deployable artifacts that may be loaded into the runtime environments, eg. configuration files, code, binaries and packages. + One or more Administrators who are responsible for operating the runtime environments ie. installing, starting, stopping and updating software, code, configuration, etc. + A set of policies controlling access and management of repositories, deployments, runtimes. -## See Also +- **Declarative Description** -- [The GitOps Glossary](GLOSSARY.md) + Describing the desired state or behavior of a system without specifying how that state will be achieved, thereby separating between configuration - the desired state - and implementation - the commands, API calls, scripts ... that actually achieve the desired state described in the declarative description. diff --git a/RATIONALE.md b/RATIONALE.md deleted file mode 100644 index 92a8395..0000000 --- a/RATIONALE.md +++ /dev/null @@ -1,49 +0,0 @@ -# Rationale for GitOps - -Currently, many software system's desired state is not defined separately from the running system. -When we desire the behaviour of a software system to change, we modify the system's configuration directly, either through human action, or by running scripts that take a set of predetermined action on the system. - -This leads to several serious issues: - -- **Detecting drift from desired state** - - If the desired state of a system is not explicitly defined, it is impossible to verify if the system in a correct state. The state of a running system itself does not provide sufficient information to determine its correctness. - - Consider logging into an administration console and seeing that 28 machines are healthily running. - Is this good? Is this bad? 
That very much depends on what the desired number of machines is. - For example, these could be test machines that should have been deleted and are now incurring a significant cost for no reasons, or all that remains of a 100 machine datacenter. - We could consult the documentation, or expect the human operator to know, but by the time this occurs, our system has been in an incorrect state for a significant period of time. _The validity of the state of our system is not automatically verifiable_. - - This problem leads to the conclusion that the canonical source of truth for desired state cannot be the actual state. - Making the desired state of a system explicit prevents this problem. - -- **Recovering from transitional states** - - If we have no record of the desired state of our system, how can we recover from failures that occur in the transition between states? - Such transitions are very common lifecycle events, such as upgrades, new features being released or scaling our resources. These are where the majority of hard software defects and transient errors occur. - Not only are such failures extremely common, but their likelihood grows quadratically with the number of components in our system, as failure may not only occur in each component, but also in the connections between components, which can fail independently. - - When these failures occur, they leave our system in an indeterminate transitional state. Going back to a known good state or forward to a new desired states is difficult, as we don't have a record of what these should be, only lists of instructions - - We are left with the option of backing up our system state before a change so that we can restore a known good state if something goes wrong. - This doesn't solve the problem of how to change our system to move to a new desired state. The best we can do is restore and retry. - This becomes extremely common in complex distributed systems, where transient failures are usual. 
- Such an approach is painful in practice and leads to an aversion to changing the system's state. - - This problem can be solved by having software agents that continuously converge the system towards a well-defined state. - -- **Controlling and auditing actors, access and actions** - - In most cases, we not only require the ability to change a running system safely, but we must also record what was changed, when, by whom, and why and enforce rules about which changes we allow. - If our systems are changed through direct access, the surface area of the interface to control and monitor can quickly become overwhelmingly large. - - Access control at different levels, from the network to the application layer must be controlled and audited. A coherent set of access control rules must be applied across varied systems configured in completely different, and sometimes incompatible ways. - An audit trail may or may not be required for regulatory or governance reasons, but it is such a common requirement of managing software systems that it must be also be addressed. - - Having a single source of truth regarding the desired state of our system becomes a ledger of transactions between states and a single point of operation. This leads to a natural place to enforce rules regarding access and to audit changes. - - Furthermore, by removing direct access, the credentials used to manage a system can be constrained to exist only within the security boundary of the system itself, which can poll its desired state. This greatly reduces the security surface area. - -These are only a small sample of the issues that arise if we do not define the desired state of our software systems explicitly and declaratively. We believe GitOps is a solution to these many issues that plague software operations at scale. - -Since these issues occur so universally when operating software systems, we also believe that GitOps is fundamentally agnostic about specific tools. 
that is, the GitOps principles are universally applicable, and independent of any particular tool, solution or practice, including Git itself, after which they are named. From 2b066acf1ea06ee8eb8e16aaa4042de7528ec4e8 Mon Sep 17 00:00:00 2001 From: Scott Rigby Date: Fri, 14 May 2021 13:43:15 -0400 Subject: [PATCH 16/19] Move continuous caveat from notes to glossary Per last Principles Committee meeting. Also anchor glossary items Signed-off-by: Scott Rigby --- PRINCIPLES.md | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/PRINCIPLES.md b/PRINCIPLES.md index 83e0051..a383245 100644 --- a/PRINCIPLES.md +++ b/PRINCIPLES.md @@ -33,7 +33,6 @@ These principles were derived from modern software operations but are rooted in - These differences could be due to the actual state drifting from the desired state, or the desired state changing intentionally. - The source of drift doesn't matter. Contrary to CIops, _any_ drift will trigger a reconciliation -- By "continuous" we adopt the industry standard term to mean reconcilation continues to happen, not that it must be instantaneous ### Principle 4 Notes @@ -41,15 +40,19 @@ These principles were derived from modern software operations but are rooted in ## Glossary -- **State Store** +- ### Continuous - A system for storing versioned, immutable Desired States that provides access control and auditing on the changes to the Desired State. Git may be configured as a State Store, but [special precautions must be taken](recipes/SETTING_UP_GIT.md). + By "continuous" we adopt the industry standard term to mean reconcilation continues to happen, not that it must be instantaneous. + +- ### Declarative Description + + Describing the desired state or behavior of a system without specifying how that state will be achieved, thereby separating between configuration - the desired state - and implementation - the commands, API calls, scripts ... 
that actually achieve the desired state described in the declarative description. -- **Desired State** +- ### Desired State The aggregate of all configuration data for a system form its _Desired State_ which is defined as data sufficient to recreate the system so that instances of the system are behaviourally indistinguishable. -- **Software System** +- ### Software System One or more Runtime environments consisting of resources under management. In each Runtime, management Agents to act on resources according to security policies. @@ -57,6 +60,6 @@ These principles were derived from modern software operations but are rooted in One or more Administrators who are responsible for operating the runtime environments ie. installing, starting, stopping and updating software, code, configuration, etc. A set of policies controlling access and management of repositories, deployments, runtimes. -- **Declarative Description** +- #### State Store - Describing the desired state or behavior of a system without specifying how that state will be achieved, thereby separating between configuration - the desired state - and implementation - the commands, API calls, scripts ... that actually achieve the desired state described in the declarative description. + A system for storing versioned, immutable Desired States that provides access control and auditing on the changes to the Desired State. Git may be configured as a State Store, but [special precautions must be taken](recipes/SETTING_UP_GIT.md). From b11220f0f97ae54f1865871bffe3951067699de7 Mon Sep 17 00:00:00 2001 From: Scott Rigby Date: Sat, 15 May 2021 21:26:46 -0400 Subject: [PATCH 17/19] typos Signed-off-by: Scott Rigby Co-authored-by: lloydchang --- PRINCIPLES.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/PRINCIPLES.md b/PRINCIPLES.md index a383245..2e97d3c 100644 --- a/PRINCIPLES.md +++ b/PRINCIPLES.md @@ -12,7 +12,7 @@ These principles were derived from modern software operations but are rooted in 1. 
**The principle of declarative desired state**

- A system managed by GitOps must have its _Desired State_ expressed declaritively as data in a format writable and readable by both humans and machines.
+ A system managed by GitOps must have its _Desired State_ expressed declaratively as data in a format writable and readable by both humans and machines.

 2. **The principle of immutable desired state versions**

From 567e4b6ae47822718b90f074631a6339f7424036 Mon Sep 17 00:00:00 2001
From: Scott Rigby
Date: Sat, 15 May 2021 21:27:16 -0400
Subject: [PATCH 18/19] misspellings and grammar

Signed-off-by: Scott Rigby
Co-authored-by: lloydchang
---
 PRINCIPLES.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PRINCIPLES.md b/PRINCIPLES.md
index 2e97d3c..613c3d4 100644
--- a/PRINCIPLES.md
+++ b/PRINCIPLES.md
@@ -36,7 +36,7 @@ These principles were derived from modern software operations but are rooted in

 ### Principle 4 Notes

-- We talk here about "regular operations". In an emergency also other mode of operation, e.g. manual intervention, should be considered - followed by a reconsiliation of the "tainted" system with the declared state. → resolve the conflict between "GitOps principle" and "I need to deal with problems that GitOps doesn't cover"
+- We talk here about "regular operations." In an emergency, other modes of operations, e.g. manual intervention, should be considered - followed by a reconciliation of the "tainted" system with the declared state.
→ resolve the conflict between "GitOps principle" and "I need to deal with problems that GitOps doesn't cover"

 ## Glossary

From efff2de06f5920a9272fd8dde4d64204b7f21102 Mon Sep 17 00:00:00 2001
From: Scott Rigby
Date: Sat, 15 May 2021 21:27:46 -0400
Subject: [PATCH 19/19] typos

Signed-off-by: Scott Rigby
Co-authored-by: lloydchang
---
 PRINCIPLES.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PRINCIPLES.md b/PRINCIPLES.md
index 613c3d4..bca5371 100644
--- a/PRINCIPLES.md
+++ b/PRINCIPLES.md
@@ -42,7 +42,7 @@ These principles were derived from modern software operations but are rooted in

 - ### Continuous

- By "continuous" we adopt the industry standard term to mean reconcilation continues to happen, not that it must be instantaneous.
+ By "continuous" we adopt the industry standard term to mean reconciliation continues to happen, not that it must be instantaneous.

 - ### Declarative Description