diff --git a/content/docs/05.concepts/02.namespace-files.md b/content/docs/05.concepts/02.namespace-files.md index 3a252d129d..0bfed7b149 100644 --- a/content/docs/05.concepts/02.namespace-files.md +++ b/content/docs/05.concepts/02.namespace-files.md @@ -11,9 +11,9 @@ Manage Namespace Files and how to use them in your flows. ## What are Namespace Files -Namespace Files are files tied to a given namespace. You can think of Namespace Files as equivalent of a project in your local IDE or a copy of your Git repository. +Namespace Files are files tied to a given namespace. You can think of Namespace Files as the equivalent of a project in your local IDE or a copy of your Git repository. -Namespace Files can hold Python files, R or Node.js scripts, SQL queries, dbt or Terraform projects, and many more. +Namespace Files can hold Python files, R or Node.js scripts, SQL queries, dbt or Terraform projects, and much more. You can synchronize your Git repository with a specific namespace to orchestrate dbt, Terraform or Ansible, or any other project that contains code and configuration files. @@ -48,7 +48,7 @@ triggers: ``` ::alert{type="info"} -Note: we didn't have to use the `namespaceFiles.enabled: true` property — that property is only required to inject the entire directory of files from the namespace into the working directory of a script (e.g. a Python task). More on that in the subsequent sections of this page. +Note: we didn't have to use the `namespaceFiles.enabled: true` property — that property is only required to inject the entire directory of files from the namespace into the working directory of a script (e.g., a Python task). There are more details in the subsequent sections of this page. :: ## Why use Namespace Files @@ -56,20 +56,20 @@ Note: we didn't have to use the `namespaceFiles.enabled: true` property — that Namespace Files offer a simple way to organize your code and configuration files. Before Namespace Files, you had to store your code and configuration files in a Git repository and then clone that repository at runtime using the `git.Clone` task. With Namespace Files, you can store your code and configuration files directly in the Kestra's internal storage backend. That storage backend can be your local directory or an S3 bucket to ensure maximum security and privacy. Namespace Files make it easy to: -- orchestrate Python, R, Node.js, SQL, and more, without having to worry about code dependencies, packaging and deployments — simply add your code in the embedded Code Editor or sync your Git repository with a given namespace -- manage your code for a given project or team in one place, even if those files are stored in different Git repositories, or even different Git providers +- orchestrate Python, R, Node.js, SQL, and more without having to worry about code dependencies, packaging, and deployments — simply add your code in the embedded Code Editor or sync your Git repository with a given namespace +- manage your code for a given project or team in one place, even if those files are stored in different Git repositories or even different Git providers - share your code and configuration files between workflows and team members in your organization -- orchestrate complex projects that require the code to be separated into multiple scripts, queries or modules. +- orchestrate complex projects that require the code to be separated into multiple scripts, queries, or modules. 
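To make the earlier note about `namespaceFiles.enabled` concrete, below is a minimal sketch of a flow that injects Namespace Files into a script task's working directory so a file can be called by its path. The flow id and namespace are illustrative, and it assumes a `scripts/hello.py` Namespace File already exists in the same namespace:

```yaml
id: namespace_files_sketch
namespace: company.team

tasks:
  - id: hello
    type: io.kestra.plugin.scripts.python.Commands
    namespaceFiles:
      enabled: true # copy all Namespace Files into the task's working directory
    commands:
      - python scripts/hello.py # reference the injected file by its path
```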
## How to add Namespace Files ### Embedded Code Editor -The easiest way to get started with Namespace Files is to use the embedded Code Editor. This allows you to easily add custom scripts, queries and configuration files along with your flow YAML configuration files. +The easiest way to get started with Namespace Files is to use the embedded Code Editor. This allows you to easily add custom scripts, queries, and configuration files along with your flow YAML configuration files. -Get started by selecting a namespace from the dropdown menu. If you type a name of a namespace that doesn't exist yet, Kestra will create it for you. +Get started by selecting a namespace from the dropdown menu. If you type a name of a namespace that doesn't exist yet, Kestra creates it for you. -Then, add a new file, e.g., a Python script. Add a folder named `scripts` and a file called `hello.py` with the following content: +Next, add a new file, (e.g., a Python script). Add a folder named `scripts` and a file called `hello.py` with the following content: ```python print("Hello from the Editor!") @@ -90,13 +90,13 @@ tasks: - python scripts/hello.py ``` -The `Execute` button allows you to run your flow directly from the Code Editor. Click on the `Execute` button to run your flow. You should then see the Execution being created in a new browser tab and once you navigate to the `Logs` tab, you should see a friendly message ``Hello from the Editor!`` in the logs. +The **Execute** button allows you to run your flow directly from the Code Editor. Click on the **Execute** button to run your flow. You then see the Execution being created in a new browser tab, and once you navigate to the **Logs** tab, you should see a friendly message ``Hello from the Editor!`` in the logs. -### PushNamespaceFiles and SyncNamespaceFiles Tasks +### PushNamespaceFiles and SyncNamespaceFiles tasks -There's 2 tasks to help you automatically manage your namespace files with Git. This allows you to sync the latest changes from a Git repository. +There are two tasks to help you automatically manage your namespace files with Git. This allows you to sync the latest changes from a Git repository. -This example will push Namespace Files you already have in Kestra to a Git repository for you: +This example pushes Namespace Files you already have in Kestra to a Git repository for you: ```yaml id: push_to_git @@ -117,7 +117,7 @@ tasks: dryRun: true ``` -This example will sync Namespace Files inside of a Git repository to your Kestra instance: +This example syncs Namespace Files inside of a Git repository to your Kestra instance: ```yaml id: sync_files_from_git @@ -143,7 +143,7 @@ Check out the dedicated guides for more information: You can leverage our official GitHub Action called [deploy-action](https://github.com/kestra-io/deploy-action) to synchronize your Git repository with a given namespace. This is useful if you want to orchestrate complex Python modules, dbt projects, Terraform or Ansible infrastructure, or any other project that contains code and configuration files with potentially multiple nested directories and files. 
-Here is a simple example showing how you can deploy all scripts from the `scripts` directory in your Git branch to the `prod` namespace: +Below is a simple example showing how you can deploy all scripts from the `scripts` directory in your Git branch to the `prod` namespace: ```yaml name: Kestra CI/CD @@ -176,7 +176,7 @@ When creating a service account role for the GitHub Action in the [Enterprise Ed You can use the `kestra_namespace_file` resource from the official [Kestra Terraform Provider](https://registry.terraform.io/providers/kestra-io/kestra/latest/docs) to deploy all your custom script files from a specific directory to a given Kestra namespace. -Here is a simple example showing how you can synchronize an entire directory of scripts from the directory `src` with the `company.team` namespace using Terraform: +Below is a simple example showing how you can synchronize an entire directory of scripts from the directory `src` with the `company.team` namespace using Terraform: ```hcl resource "kestra_namespace_file" "prod_scripts" { @@ -189,15 +189,15 @@ resource "kestra_namespace_file" "prod_scripts" { ### Deploy Namespace Files from Git via CLI -You can also use the Kestra CLI to deploy all your custom script files from a specific directory to a given Kestra namespace. Here is a simple example showing how you can synchronize an entire directory of local scripts with the `prod` namespace using the Kestra CLI: +You can also use the Kestra CLI to deploy all your custom script files from a specific directory to a given Kestra namespace. Below is a simple example showing how you can synchronize an entire directory of local scripts with the `prod` namespace using the Kestra CLI: ```bash ./kestra namespace files update prod /Users/anna/gh/KESTRA_REPOS/scripts --server=http://localhost:8080 --user=rick:password ``` -In fact, you can even use that command directly in a flow. You can attach a schedule or a webhook trigger to automatically execute that flow anytime you push/merge changes to your Git repository, or on a regular schedule. +In fact, you can even use that command directly in a flow. You can attach a schedule or a webhook trigger to automatically execute that flow anytime you push/merge changes to your Git repository or on a regular schedule. -Here is an example of a flow that synchronizes an entire directory of local scripts with the `prod` namespace: +Below is an example of a flow that synchronizes an entire directory of local scripts with the `prod` namespace: ```yaml id: ci @@ -224,7 +224,7 @@ tasks: - /app/kestra namespace files update prod . . --server={{vars.host}} ``` -Note that the two dots in the command `/app/kestra namespace files update prod . .` indicate that we want to sync an entire directory of files cloned from the Git repository to the root directory of the `prod` namespace. If you wanted to e.g. sync that repository to the `scripts` directory, you would use the following command: `/app/kestra namespace files update prod . scripts`. The syntax of that command follows the structure: +Note that the two dots in the command `/app/kestra namespace files update prod . .` indicate that we want to sync an entire directory of files cloned from the Git repository to the root directory of the `prod` namespace. If you wanted to sync that repository to the `scripts` directory, you would use the following command: `/app/kestra namespace files update prod . scripts`. 
The syntax of that command follows the structure: ```bash /app/kestra namespace files update @@ -236,20 +236,20 @@ To reproduce that flow, start Kestra using the following command: docker run --pull=always --rm -it -p 28080:8080 kestra/kestra:latest server local ``` -Then, open the Kestra UI at `http://localhost:28080` and create a new flow with the content above. Once you execute the flow, you should see the entire directory from the `scripts` repository being synchronized with the `prod` namespace. +Next, open the Kestra UI at `http://localhost:28080` and create a new flow with the content above. Once you execute the flow, you then see the entire directory from the `scripts` repository being synchronized with the `prod` namespace. ## How to use Namespace Files in your flows -There are multiple ways to use Namespace Files in your flows. You can use the `read()` function to read the content of a file as a string, point to the file path in the supported tasks or use a dedicated task to retrieve it as an output. +There are multiple ways to use Namespace Files in your flows. You can use the `read()` function to read the content of a file as a string, point to the file path in the supported tasks, or use a dedicated task to retrieve it as an output. -Usually, pointing to a file location, rather than reading the file's content, is required when you want to use a file as an input to a CLI command, e.g. in a `Commands` task such as `io.kestra.plugin.scripts.python.Commands` or `io.kestra.plugin.scripts.node.Commands`. In all other cases, the `read()` function can be used to read the content of a file as a string e.g. in `Query` or `Script` tasks. +Usually, pointing to a file location, rather than reading the file's content, is required when you want to use a file as an input to a CLI command (e.g., in a `Commands` task such as `io.kestra.plugin.scripts.python.Commands` or `io.kestra.plugin.scripts.node.Commands`). In all other cases, the `read()` function can be used to read the content of a file as a string (e.g., in `Query` or `Script` tasks). -You can also use the `io.kestra.plugin.core.flow.WorkingDirectory` task to read namespace files there and then use them in child tasks that require reading the file path in CLI commands e.g. `python scipts/hello.py`. +You can also use the `io.kestra.plugin.core.flow.WorkingDirectory` task to read namespace files there and then use them in child tasks that require reading the file path in CLI commands for example like: `python scipts/hello.py`. ### The `read()` function -Note how the script in the first section used the `read()` function to read the content of the `scripts/hello.py` file as a string using the expression `"{{ read('scripts/hello.py') }}"`. It'a important to remeber that this function reads **the content of the file as a string**. Therefore, you should use that function only in tasks that expect a string as an input, e.g., `io.kestra.plugin.scripts.python.Script` or `io.kestra.plugin.scripts.node.Script`, rather than `io.kestra.plugin.scripts.python.Commands` or `io.kestra.plugin.scripts.node.Commands`. +Note how the script in the first section used the `read()` function to read the content of the `scripts/hello.py` file as a string using the expression `"{{ read('scripts/hello.py') }}"`. It's important to remember that this function reads **the content of the file as a string**. 
Therefore, you should use that function only in tasks that expect a string as an input like `io.kestra.plugin.scripts.python.Script` or `io.kestra.plugin.scripts.node.Script`, rather than `io.kestra.plugin.scripts.python.Commands` or `io.kestra.plugin.scripts.node.Commands`. The `read()` function allows you to read the content of a Namespace File stored in the Kestra's internal storage backend. The `read()` function takes a single argument, which is the absolute path to the file you want to read. The path must point to a file stored in the **same namespace** as the flow you are executing. @@ -269,7 +269,7 @@ tasks: With supported tasks, such as the `io.kestra.plugin.scripts` group, we can access files using their path and enabling the task to read namespace files. -Here is a simple `weather.py` script that reads a secret to talk to a Weather Data API: +Below is a simple `weather.py` script that reads a secret to talk to a Weather Data API: ```python import requests @@ -279,7 +279,7 @@ weather_data = requests.get(url) print(weather_data.json()) ``` -And here is the flow that uses the script: +Next, is a flow that uses the script: ```yaml id: weather_data namespace: company.team @@ -306,14 +306,14 @@ We can control what namespace files are available to our flow with the `namespac `namespaceFiles` has 3 attributes: - `enabled`: when set to true enables all files in that namespace to be visible to the task -- `include`: allows you to specify files you want to be accessible by the task -- `exclude`: allows you to specify files you don't want to be accessible by the task +- `include`: specifies files you want to be accessible by the task +- `exclude`: specifies files you don't want to be accessible by the task ### Namespace Tasks -You can use the Namespace Tasks to upload, download and delete tasks in Kestra. +You can use the Namespace Tasks to upload, download, and delete tasks in Kestra. -In the example below, we have a namespace file called `example.ion` that we want to convert to a csv file. We can use the `DownloadFiles` task to generate an output that contains the file so we can easily pass it dynamically to the `IonToCsv` task. +In the example below, we have a namespace file called `example.ion` that we want to convert to a `.csv` file. We can use the `DownloadFiles` task to generate an output that contains the file so we can easily pass it dynamically to the `IonToCsv` task. ```yaml id: files @@ -339,7 +339,7 @@ Read more about the tasks below: You can selectively include or exclude namespace files. -Say you have multiple namespace files present: file1.txt, file2.txt, file3.json, file4.yml. You can selectively include multiple files using `include` attribute under `namespaceFiles` as shown below: +Let's say that you have multiple namespace files present: file1.txt, file2.txt, file3.json, file4.yml. You can selectively include multiple files using the `include` attribute under `namespaceFiles` as shown below: ```yaml id: include_namespace_files @@ -357,9 +357,9 @@ tasks: - ls ``` -The `include_files` task will list all the included files, i.e. `file1.txt` and `file3.json` as only those got included from the namespace through `include`. +The `include_files` task lists all the included files. In the example above, these are `file1.txt` and `file3.json` as only those were included from the namespace through `include`. -The `exclude`, on the other hand, includes all the namespace files except those specified under `exclude`. 
+The `exclude` attribute, alternatively, includes all the namespace files except those specified under `exclude`. ```yaml id: exclude_namespace_files @@ -377,4 +377,4 @@ tasks: - ls ``` -The `exclude_files` task from the above flow will list `file2.txt` and `file4.yml`, i.e. all the namespace files except those that were excluded using `exclude`. +The `exclude_files` task from the above flow lists `file2.txt` and `file4.yml`, all the namespace files except those that were excluded using `exclude`. diff --git a/content/docs/05.concepts/04.secret.md b/content/docs/05.concepts/04.secret.md index be954fbf0a..ad838cb7e3 100644 --- a/content/docs/05.concepts/04.secret.md +++ b/content/docs/05.concepts/04.secret.md @@ -13,7 +13,7 @@ Secret is a mechanism that allows you to securely store sensitive information, s --- -To retrieve secrets in a flow, use the `secret()` function, e.g. `"{{ secret('API_TOKEN'') }}"`. You can leverage your existing secrets manager as a secrets backend. +To retrieve secrets in a flow, use the `secret()` function, e.g., `"{{ secret('API_TOKEN'') }}"`. You can leverage your existing secrets manager as a secrets backend. Your flows often need to interact with external systems. To do that, they need to programmatically authenticate using passwords or API keys. Secrets help you securely store such variables and avoid hard-coding sensitive information within your workflow code. @@ -24,7 +24,7 @@ You can leverage the `secret()` function to retrieve sensitive variables within ### Adding a new Secret from the UI -If you are using a managed Kestra version, you can add **new Secrets** directly from the UI. In the left navigation menu, go to **Namespaces**, select the namespace to which you want to add a new secret. Then, add a new secret within the Secrets tab. +If you are using a managed Kestra version, you can add **new Secrets** directly from the UI. In the left navigation menu, go to **Namespaces** and select the namespace to which you want to add a new secret. Next, add a new secret within the Secrets tab. 
![Secrets EE](/docs/developer-guide/secrets/secrets-ee-1.png) @@ -56,13 +56,13 @@ Imagine that so far, you were setting the following environment variable: export MYPASSWORD=myPrivateCode ``` -Here is how you can encode the sensitive value of that environment variable: +Below is how you can encode the sensitive value of that environment variable: ```bash echo -n "myPrivateCode" | base64 ``` -This should output the value: `bXlQcml2YXRlQ29kZQ==` +This outputs the value: `bXlQcml2YXRlQ29kZQ==` To use that value as a Secret in your Kestra instance, you would need to add a prefix `SECRET_` to the variable key (here: `SECRET_MYPASSWORD`) and set that key to the encoded value: @@ -70,7 +70,7 @@ To use that value as a Secret in your Kestra instance, you would need to add a p export SECRET_MYPASSWORD=bXlQcml2YXRlQ29kZQ== ``` -If you would add the environment variable to the `kestra` container section in a [Docker Compose file](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml#L22), it would look as follows: +If you want to add the environment variable to the `kestra` container section in a [Docker Compose file](https://github.com/kestra-io/kestra/blob/develop/docker-compose.yml#L22), it would look as follows: ```yaml kestra: @@ -79,17 +79,17 @@ If you would add the environment variable to the `kestra` container section in a SECRET_MYPASSWORD: bXlQcml2YXRlQ29kZQ== ``` -This secret can then be used in a flow using the `{{ secret('MYPASSWORD') }}` syntax, and it will base64-decoded during flow execution. Make sure to not include the prefix `SECRET_` when calling the `secret('MYPASSWORD')` function, as this prefix is only there in the environment variable definition to prevent Kestra from treating other system variables as secrets (for better performance and increased security). +This secret can be used in a flow using the `{{ secret('MYPASSWORD') }}` syntax, and it will be base64-decoded during flow execution. Make sure to not include the prefix `SECRET_` when calling the `secret('MYPASSWORD')` function, as this prefix is only there in the environment variable definition to prevent Kestra from treating other system variables as secrets (for better performance and increased security). -Lastly, shall you wish to reference any non_encoded environment variables in your flows definition, you can always use the syntax `{{envs.lowercase_environment_variable_key}}`. +Lastly, if you want to reference any non_encoded environment variables in your flows definition, you can always use the syntax `{{envs.lowercase_environment_variable_key}}`. ::alert{type="warning"} -Note that Kestra has built-in protection to prevent its logs from revealing any encoded secret you would have defined. +Note that Kestra has built-in protection to prevent its logs from revealing any encoded secret you have defined. :: ### Convert all variables in an `.env` file -The previous section showed the process for one Secret. But what if you have tens or hundreds of them? This is where `.env` file can come in handy. +The previous section showed the process for one Secret, but if you have tens or hundreds of them, then the `.env` is better suited. Let's assume that you have an `.env` file with the following content: @@ -101,7 +101,7 @@ AWS_SECRET_ACCESS_KEY=myawssecretaccesskey ``` -Make sure to keep the last line empty, otherwise the bash script below won't encode the last secret AWS_SECRET_ACCESS_KEY correctly. 
+Make sure to keep the last line empty, otherwise the bash script below won't encode the last secret `AWS_SECRET_ACCESS_KEY` correctly. Using the bash script shown below, you can: 1. Encode all values using base64-encoding @@ -144,7 +144,7 @@ with the encoded version of the file: ### Use a macro within your `.env` file -As an alternative to replacing values in your environment variables by encoded counterparts, you may also leverage the `base64encode` macro and keep the values intact. +As an alternative to replacing values in your environment variables with encoded counterparts, you can also leverage the `base64encode` macro and keep the values intact. The original `.env` file: diff --git a/content/docs/05.concepts/05.kv-store.md b/content/docs/05.concepts/05.kv-store.md index 8247252d68..d1a94c30d1 100644 --- a/content/docs/05.concepts/05.kv-store.md +++ b/content/docs/05.concepts/05.kv-store.md @@ -15,11 +15,11 @@ Build stateful workflows with the KV Store. ## Overview -Kestra's workflows are stateless by design. All workflow executions and task runs are isolated from each other by default to avoid any unintended side effects. When you pass data between tasks, you do so explicitly by passing outputs from one task to another and that data is stored transparently in Kestra's internal storage. This stateless execution model ensures that workflows are idempotent and can be executed anywhere in parallel at scale. +Kestra's workflows are stateless by design. All workflow executions and task runs are isolated from each other by default to avoid any unintended side effects. When you pass data between tasks, you do so explicitly by passing outputs from one task to another, and that data is stored transparently in Kestra's internal storage. This stateless execution model ensures that workflows are idempotent and can be executed anywhere in parallel at scale. However, in certain scenarios, your workflow might need to share data beyond passing outputs from one task to another. For example, you might want to persist data across executions or even across different workflows. This is where the Key Value (KV) store comes into play. -KV Store allows you to store any data in a convenient key-value format. You can create them directly from the UI, via dedicated tasks, Terraform or through the API. +KV Store allows you to store any data in a convenient key-value format. You can create them directly from the UI, via dedicated tasks, Terraform, or through the API. The KV store is a powerful tool that allows you to build stateful workflows and share data across executions and workflows. @@ -35,10 +35,10 @@ In short, the KV Store gives you full control and privacy over your data, and Ke `Keys` are arbitrary strings. Keys can contain: -* characters in uppercase and or lowercase -* standard ASCII characters +- characters in uppercase and or lowercase +- standard ASCII characters -`Values` are stored as ION files in Kestra's internal storage. Values are strongly typed, and can be of one of the following types: +`Values` are stored as ION files in Kestra's internal storage. Values are strongly typed and can be of one of the following types: - string - number @@ -52,13 +52,13 @@ For each KV pair, you can set a `Time to Live` (TTL) to avoid cluttering your st ## Namespace binding -Key value pairs are defined at a namespace level and you can access them from the namespace page in the UI in the KV Store tab. 
+Key value pairs are defined at a namespace level, and you can access them from the namespace page in the UI in the KV Store tab. You can create and read KV pairs across namespaces as long as those namespaces are [allowed](../06.enterprise/allowed-namespaces.md). ## UI: How to Create, Read, Update and Delete KV pairs from the UI -Kestra follows a philosophy of Everything as Code and from the UI. Therefore, you can create, read, update, and delete KV pairs both from the UI and Code. +Kestra follows a philosophy of Everything as Code and also from the UI. Therefore, you can create, read, update, and delete KV pairs both from the UI and Code. Here is a list of the different ways to manage KV pairs: 1. **Kestra UI**: select a Namespace and go to the KV Store tab — from here, you can create, edit, and delete KV pairs. @@ -86,7 +86,7 @@ You can create, read, update, and delete KV pairs from the UI in the following w ### Update and Delete KV pairs from the UI -You can edit or delete any KV pair by clicking on the `Edit` button on the right side of each KV pair. +You can edit or delete any KV pair by clicking on the **Edit** button on the right side of each KV pair. ![edit_delete_kv_pair](/docs/concepts/kv-store/edit_delete_kv_pair.png) @@ -94,7 +94,7 @@ You can edit or delete any KV pair by clicking on the `Edit` button on the right ### Create a new KV pair with the `Set` task in a flow -To create a KV pair from a flow, you can use the `io.kestra.plugin.core.kv.Set` task. Here's an example of how to create a KV pair in a flow: +To create a KV pair from a flow, you can use the `io.kestra.plugin.core.kv.Set` task. Below is an example of how to create a KV pair in a flow: ```yaml id: add_kv_pair @@ -141,7 +141,7 @@ You can use the `io.kestra.plugin.core.kv.Set` task to create or modify any KV p The easiest way to retrieve a value by key is to use the `{{ kv('YOUR_KEY'') }}` Pebble function. -Here is the full syntax of that function: +Below is the full syntax of that function: ``` {{ kv(key='your_key_name', namespace='your_namespace_name', errorOnMissing=false) }} @@ -180,7 +180,7 @@ tasks: format: "{{ kv('non_existing_key', errorOnMissing=true) }}" ``` -The function arguments such as the `errorOnMissing` keyword can be skipped for brevity as long as you fill in all positional arguments i.e. `{{ kv(key='your_key_name', namespace='your_namespace_name', errorOnMissing=false) }}` — the version below will have the same effect: +The function arguments such as the `errorOnMissing` keyword can be skipped for brevity as long as you fill in all positional arguments i.e., `{{ kv(key='your_key_name', namespace='your_namespace_name', errorOnMissing=false) }}` — the version below has the same effect: {{ kv(key='my_key', namespace='company.team') }} ```yaml id: read_non_existing_kv_pair @@ -193,9 +193,9 @@ tasks: ### Read KV pairs with the `Get` task -You can also retrieve the value of any KV pair using the `Get` task. The `Get` task will produce the `value` output, which you can use in subsequent tasks. This option is a little more verbose but it has two benefits: -1. More declarative syntax. -2. Useful when you need to pass the current state of that value to multiple downstream tasks. +You can also retrieve the value of any KV pair using the `Get` task. The `Get` task produces the `value` output, which you can use in subsequent tasks. This option is a little more verbose, but it has two benefits: +1. More declarative syntax +2. 
Useful when you need to pass the current state of that value to multiple downstream tasks ```yaml id: get_kv_pair @@ -215,7 +215,7 @@ tasks: ### Read and parse JSON-type values from KV pairs -To parse JSON values in Kestra's templated expressions, make sure to wrap the `kv()` call in the `json()` function, e.g. `"{{ json(kv('your_json_key')).json_property }}"`. +To parse JSON values in Kestra's templated expressions, make sure to wrap the `kv()` call in the `json()` function like the following: `"{{ json(kv('your_json_key')).json_property }}"`. The following example demonstrates how to parse values from JSON-type KV pairs in a flow: ```yaml @@ -273,11 +273,11 @@ tasks: message: "{{ outputs.get.keys }}" ``` -The output will be a list of keys - if no keys were found, an empty list will be returned. +The output is a list of keys - if no keys were found, an empty list will be returned. ### Delete a KV pair with the `Delete` task -The `io.kestra.plugin.core.kv.Delete` task will produce the boolean output `deleted` to confirm whether a given KV pair was deleted or not. +The `io.kestra.plugin.core.kv.Delete` task produces the boolean output `deleted` to confirm whether a given KV pair was deleted or not. ```yaml id: delete_kv_pair @@ -315,7 +315,7 @@ For example: curl -X PUT -H "Content-Type: application/json" http://localhost:8080/api/v1/namespaces/company.team/kv/my_key -d '"Hello World"' ``` -The above `curl` command will create the KV pair with key `my_key` and the `Hello World` string value in the `company.team` namespace. The API does not return any response. +The above `curl` command creates the KV pair with key `my_key` and the `Hello World` string value in the `company.team` namespace. The API does not return any response. ### Read the value by key @@ -331,7 +331,7 @@ For example: curl -X GET -H "Content-Type: application/json" http://localhost:8080/api/v1/namespaces/company.team/kv/my_key ``` -This will retrieve a KV pair with the key `my_key` in the `company.team` namespace. The output of the API will contain the data type of the value and the retrieved value of the KV pair: +This `curl` command retrieves a KV pair with the key `my_key` in the `company.team` namespace. The output of the API contains the data type of the value and the retrieved value of the KV pair: ```json {"type": "STRING", "value": "Hello World"} @@ -345,13 +345,13 @@ You can list all keys in the namespace as follows: curl -X GET -H "Content-Type: application/json" http://localhost:8080/api/v1/namespaces/{namespace}/kv ``` -The `curl` command below will return all keys in the `company.team` namespace: +The `curl` command below returns all keys in the `company.team` namespace: ```bash curl -X GET -H "Content-Type: application/json" http://localhost:8080/api/v1/namespaces/company.team/kv ``` -The output will be returned as a JSON array of all keys in the namespace: +The output is returned as a JSON array of all keys in the namespace: ```json [ {"key":"my_key","creationDate":"2024-07-27T06:10:33.422Z","updateDate":"2024-07-27T06:11:08.911Z"}, @@ -369,13 +369,13 @@ curl -X DELETE -H "Content-Type: application/json" http://localhost:8080/api/v1/ This call returns a boolean indicating whether the key was deleted. 
-For example, the following `curl` command will return `false` because the key `non_existing_key` does not exist: +For example, the following `curl` command returns `false` because the key `non_existing_key` does not exist: ```bash curl -X DELETE -H "Content-Type: application/json" http://localhost:8080/api/v1/namespaces/company.team/kv/non_existing_key ``` -However, when we try to delete a key `my_key` which exists in the `company.team` namespace, the same API call will return `true`: +However, when we try to delete a key `my_key` which exists in the `company.team` namespace, the same API call returns `true`: ```bash curl -X DELETE -H "Content-Type: application/json" http://localhost:8080/api/v1/namespaces/company.team/kv/my_key @@ -389,7 +389,7 @@ curl -X DELETE -H "Content-Type: application/json" http://localhost:8080/api/v1/ You can create a KV pair via Terraform by using the `kestra_kv` resource. -Here is an example of how to create a KV pair: +Below is an example of how to create a KV pair: ```hcl resource "kestra_kv" "my_key" { @@ -404,7 +404,7 @@ resource "kestra_kv" "my_key" { You can read a KV pair via Terraform by using the `kestra_kv` data source. -Here is an example of how to read a KV pair: +Below is an example of how to read a KV pair: ```hcl data "kestra_kv" "new" { diff --git a/content/docs/05.concepts/06.pebble.md b/content/docs/05.concepts/06.pebble.md index d10c59819a..f6b537d027 100644 --- a/content/docs/05.concepts/06.pebble.md +++ b/content/docs/05.concepts/06.pebble.md @@ -5,7 +5,7 @@ icon: /docs/icons/concepts.svg Dynamically render variables, inputs and outputs. -Pebble is a Java templating engine inspired by [Twig](https://twig.symfony.com/) and similar to the [Python Jinja Template Engine](https://palletsprojects.com/p/jinja/) syntax. Kestra uses it to dynamically render variables, inputs and outputs within the execution context. +Pebble is a Java templating engine inspired by [Twig](https://twig.symfony.com/) and similar to the [Python Jinja Template Engine](https://palletsprojects.com/p/jinja/) syntax. Kestra uses it to dynamically render variables, inputs, and outputs within the execution context.
@@ -13,7 +13,7 @@ Pebble is a Java templating engine inspired by [Twig](https://twig.symfony.com/) ## Reading inputs -When using `inputs` property in a Flow, you can access the corresponding values just by using `inputs` variable in your tasks. +When using `inputs` property in a Flow, you can access the corresponding values by using `inputs` variable in your tasks. ```yaml id: input_string @@ -56,9 +56,9 @@ tasks: ## Dynamically render a task with `TemplatedTask` -Since Kestra 0.16.0, you can use the `TemplatedTask` task which allows you to fully template all task properties using Pebble. This way, all task properties and their values can be dynamically rendered based on your custom inputs, variables, and outputs from other tasks. +Since Kestra 0.16.0, you can use the `TemplatedTask` task to fully template all task properties using Pebble. This way, all task properties and their values can be dynamically rendered based on your custom inputs, variables, and outputs from other tasks. -Here is an example of how to use the [TemplatedTask](/plugins/tasks/templating/io.kestra.plugin.core.templating.TemplatedTask) to create a Databricks job using dynamic properties: +Below is an example of how to use the [TemplatedTask](/plugins/tasks/templating/io.kestra.plugin.core.templating.TemplatedTask) to create a Databricks job using dynamic properties: ```yaml id: templated_databricks_job @@ -114,8 +114,8 @@ For instance, we can use the `date` filter to format date values: `'{{ inputs.my Most of the time, a flow will be triggered automatically. Either on schedule or based on external events. It’s common to use the date of the execution to process the corresponding data and make the flow dependent on time. -With Pebble you can use the `trigger.date` to get the date of the executed trigger. -Still, sometimes you want to manually execute a flow. Then the `trigger.date` variable won’t work anymore. For this you can use the `execution.startDate` variable that returns the execution start date. +With Pebble, you can use the `trigger.date` to get the date of the executed trigger. +Still, sometimes you may want to manually execute a flow. In this case, the `trigger.date` variable won’t be suitable. In this scenario, you can use the `execution.startDate` variable that returns the execution start date. To support both use cases, use the coalesce operator `??`. The example below shows how to apply it in a flow. @@ -155,9 +155,9 @@ tasks: message: "{{ inputs.data }}" ``` -The expression `{{ inputs.data.value }}` will return the list `[1, 2, 3]` +The expression `{{ inputs.data.value }}` returns the list `[1, 2, 3]` -The expression `{{ inputs.data.value | jq(".[1]") | first }}` will return `2`. +The expression `{{ inputs.data.value | jq(".[1]") | first }}` returns `2`. `jq(".[1]")` accesses the second value of the list and returns an array with one element. We then use `first` to access the value itself. @@ -168,7 +168,7 @@ You can troubleshoot complex Pebble expressions using the Debug Outputs button i ## Using conditions in Pebble -In some tasks, such as the `If` or `Switch` tasks, you will need to provide some conditions. You can use the Pebble syntax to use previous task outputs within those conditions: +In some tasks, such as the `If` or `Switch` tasks, you need to provide some conditions. 
You can use the Pebble syntax to use previous task outputs within those conditions: ```yaml id: test-object diff --git a/content/docs/05.concepts/07.blueprints.md b/content/docs/05.concepts/07.blueprints.md index 8bc6d4b722..67368ac760 100644 --- a/content/docs/05.concepts/07.blueprints.md +++ b/content/docs/05.concepts/07.blueprints.md @@ -19,7 +19,7 @@ Each Blueprint combines code and documentation and can be assigned several tags All Blueprints are validated and documented. You can easily customize and integrate them into your new or existing flows with a single click on the "Use" button. -View the Blueprints library [here](/blueprints). +To see more, check out the [Blueprints library](/blueprints). ![Blueprint](/docs/user-interface-guide/blueprints.png) diff --git a/content/docs/05.concepts/08.backfill.md b/content/docs/05.concepts/08.backfill.md index cf857ee5f9..0985fa2c57 100644 --- a/content/docs/05.concepts/08.backfill.md +++ b/content/docs/05.concepts/08.backfill.md @@ -26,15 +26,15 @@ triggers: cron: "*/30 * * * *" ``` -This flow will run every 30 minutes. However, imagine that your source system had an outage for 5 hours. The flow will miss 10 executions. To replay these missed executions, you can use the backfill feature. +This flow runs every 30 minutes. However, imagine that your source system had an outage for 5 hours. The flow will miss 10 executions. To replay these missed executions, you can use the backfill feature. ::alert{type="info"} **All missed schedules are automatically recovered by default.** -You can use Backfill if it's configured differently, e.g. to not recover missed schedules or only the most recent. Read more in the [dedicated documentation](../04.workflow-components/07.triggers/01.schedule-trigger.md#recover-missed-schedules). +You can use Backfill if it's configured differently, e.g., to not recover missed schedules or only the most recent. Read more in the [dedicated documentation](../04.workflow-components/07.triggers/01.schedule-trigger.md#recover-missed-schedules). :: -To backfill the missed executions, go to the `Triggers` tab on the Flow's detail page and click on the `Backfill executions` button. +To backfill the missed executions, go to the **Triggers** tab on the Flow's detail page and click on the **Backfill executions** button. ![backfill1](/docs/workflow-components/backfill1.png) @@ -44,7 +44,7 @@ You can then select the start and end date for the backfill. Additionally, you c :: -You can pause and resume the backfill process at any time, and by clicking on the `Details` button, you can see more details about that backfill process: +You can pause and resume the backfill process at any time, and by clicking on the **Details** button, you can see more details about that backfill process: ![backfill2](/docs/workflow-components/backfill2.png) @@ -52,7 +52,7 @@ ### Using cURL -You can invoke the backfill exections using the cURL call as follows: +You can invoke the backfill executions using the `cURL` call as follows: ```sh curl -XPUT http://localhost:8080/api/v1/triggers -H 'Content-Type: application/json' -d '{ @@ -73,7 +73,7 @@ curl -XPUT http://localhost:8080/api/v1/triggers -H 'Content-Type: application/j }' ``` -In the `backfill` attribute, you need to provide the start time for the backfill. The end time can be optinally provided. You can provide inputs to the flow, if any.
You can attach labels to the backfill executions by providing key-value pairs in the `labels` section. Other attributes to this PUT call are flowId, namespace and triggerId corresponding to the flow that is to backfilled. +In the `backfill` attribute, you need to provide the start time for the backfill. The end time can be optionally provided. You can provide inputs to the flow, if any. You can attach labels to the backfill executions by providing key-value pairs in the `labels` section. Other attributes to this PUT call are `flowId`, `namespace` and `triggerId` corresponding to the flow that is to be backfilled. ### Using Python requests @@ -112,4 +112,4 @@ print(response.status_code) print(response.text) ``` -With this code, you will be invoking the backfill for `scheduled_flow` flow under `company.team` namespace based on `schedule` trigger ID within the flow. The number of backfills that will be executed will depend on the schedule present in the `schedule` trigger, and the `start` and `end` times mentioned in the backfill. When the `end` time is null, as in this case, the `end` time would be considered as the present time. +With this code, you will be invoking the backfill for the `scheduled_flow` flow under the `company.team` namespace based on the `schedule` trigger ID within the flow. The number of backfills that will be executed will depend on the schedule present in the `schedule` trigger and the `start` and `end` times mentioned in the backfill. When the `end` time is null, as in this case, the `end` time would be considered as the present time. diff --git a/content/docs/05.concepts/10.replay.md b/content/docs/05.concepts/10.replay.md index 4f6baf0496..3996702330 100644 --- a/content/docs/05.concepts/10.replay.md +++ b/content/docs/05.concepts/10.replay.md @@ -77,7 +77,7 @@ tasks: When you run the above workflow, you should see an error in the `to_parquet` task. -From the logs, you will be able to see that the error is due to a misconfigured date format in the `datetimeFormat` field — in fact, the date format should have a full year, not just a two-digit year: `"yyyy-MM-dd' 'HH:mm:ss"`. +From the logs, you are able to see that the error is due to a misconfigured date format in the `datetimeFormat` field — in fact, the date format should have a full year, not just a two-digit year: `"yyyy-MM-dd' 'HH:mm:ss"`. You correct the error in the workflow code and save it. @@ -139,7 +139,7 @@ tasks: ``` :: -Now you can go to the previously failed Execution and click on the `to_parquet` task run to re-run it (either from the Gantt or from the Logs view). +Now, you can go to the previously failed Execution and click on the `to_parquet` task run to re-run it (either from the Gantt or from the Logs view). ![replay1](/docs/concepts/replay1.png) Now select the new revision of the flow code that contains the fix and confirm w ![replay2](/docs/concepts/replay2.png) -This will re-run the task with the new (corrected!) revision of the flow code. +This re-runs the task with the new (corrected!) revision of the flow code. ![replay3](/docs/concepts/replay3.png) -You can inspect the logs and verify that the task now completes successfully. The Attempt number will be incremented to show that this is a new run of the task. +You can inspect the logs and verify that the task now completes successfully. The Attempt number increments to show that this is a new run of the task.
![replay4](/docs/concepts/replay4.png) The Overview tab will additionally show the new Attempt number and the new revis ![replay5](/docs/concepts/replay5.png) -The replay feature allowed us to re-run a failed task with the corrected version of the flow code. You didn't have to rerun tasks that had already completed successfully. This is a huge time-saver when iterating on your workflows! ⚡️ +The replay feature allowed you to re-run a failed task with the corrected version of the flow code. You didn't have to rerun tasks that had already completed successfully. This is a huge time-saver when iterating on your workflows! ⚡️ diff --git a/content/docs/05.concepts/11.storage.md b/content/docs/05.concepts/11.storage.md index b1a0e04096..072ced0251 100644 --- a/content/docs/05.concepts/11.storage.md +++ b/content/docs/05.concepts/11.storage.md @@ -24,12 +24,12 @@ fetchType: FETCH ``` The `fetchType` property can have four values: -- `FETCH_ONE`: will fetch the first row and set it in a task output attribute (the `row` attribute for DynamoDB); the data will be stored inside the execution context. -- `FETCH`: will fetch all rows and set them in a task output attribute (the `rows` attribute for DynamoDB); the data will be stored inside the execution context. -- `STORE`: will store all rows inside Kestra's internal storage. The internal storage will return a URI usually set in the task output attribute `uri` and that can be used to retrieve the file from the internal storage. -- `NONE`: will do nothing. +- `FETCH_ONE`: fetches the first row and sets it in a task output attribute (the `row` attribute for DynamoDB); the data is stored inside the execution context. +- `FETCH`: fetches all rows and sets them in a task output attribute (the `rows` attribute for DynamoDB); the data is stored inside the execution context. +- `STORE`: stores all rows inside Kestra's internal storage. The internal storage returns a URI usually set in the task output attribute `uri` and that can be used to retrieve the file from the internal storage. +- `NONE`: does nothing. -The three `fetch`/`fetchOne`/`store` properties will do the same but using three different task properties instead of a single one. +The three `fetch`/`fetchOne`/`store` properties do the same but use three different task properties instead of a single one. ## Storing data Data can be stored as variables inside the flow execution context. This can be c To do so, tasks store data as [output attributes](../04.workflow-components/06.outputs.md) that are then available inside the flow via Pebble expressions like `{{outputs.taskName.attributeName}}`. -Be careful that when the size of the data is significant, this will increase the size of the flow execution context, which can lead to slow execution and increase the size of the execution storage inside Kestra's repository. +Be careful: when the size of the data is significant, this increases the size of the flow execution context, which can lead to slow execution and increase the size of the execution storage inside Kestra's repository. ::alert{type="warning"} Depending on the Kestra internal queue and repository implementation, there can be a hard limit on the size of the flow execution context as it is stored as a single row/message. Usually, this limit is around 1MB, so this is important to avoid storing large amounts of data inside the flow execution context.
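For small values, the execution-context pattern above is straightforward. Below is a minimal sketch, with illustrative flow and task ids, where one task's output attribute is read by the next task through a Pebble expression; the core `Return` task (used here as a stand-in for any task that produces an output) exposes its rendered `format` string as the output attribute `value`:

```yaml
id: outputs_in_execution_context
namespace: company.team

tasks:
  # produce a small value and store it in the execution context as an output
  - id: generate
    type: io.kestra.plugin.core.debug.Return
    format: "a small string kept in the execution context"

  # read the previous task's output with {{ outputs.<taskId>.<attribute> }}
  - id: consume
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.generate.value }}"
```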
@@ -83,7 +83,7 @@ Dedicated tasks allow managing the files stored inside the internal storage: ::alert{type="warning"} This should be the main method for storing and carrying large data from task to task. -As an example, if you know that a [HTTP Request](/plugins/plugin-fs/tasks/http/io.kestra.plugin.core.http.Request) will return a heavy payload, you should consider using [HTTP Download](/plugins/plugin-fs/tasks/http/io.kestra.plugin.core.http.Download) along with a [Serdes](/plugins/plugin-serdes) instead of carrying raw data in [Flow Execution Context](#storing-data-inside-the-flow-execution-context) +As an example, if you know that a [HTTP Request](/plugins/plugin-fs/tasks/http/io.kestra.plugin.core.http.Request) returns a heavy payload, you should consider using [HTTP Download](/plugins/plugin-fs/tasks/http/io.kestra.plugin.core.http.Download) along with a [Serdes](/plugins/plugin-serdes) instead of carrying raw data in [Flow Execution Context](#storing-data-inside-the-flow-execution-context) :: ### Storing data inside the KV store @@ -111,7 +111,7 @@ tasks: key: name ``` -In the next example, the flow will `Set`, `Get` and `Delete` the data: +In the next example, the flow uses `Set`, `Get` and `Delete` on the data: ::collapse{title="Example Flow"} @@ -156,14 +156,14 @@ tasks: key: user_name ``` -As we can see, when we `Set` a new value for `user_name`, we have to use another `Get` task to get the most up to date value, and then reference the `Get` task `id` in our log underneath to get the latest value. Same applies to the `Delete` task too. In order to show that it has been deleted, we try to get the data from the key deleetd in the `delete_data` task to show that. +When we `Set` a new value for `user_name`, we have to use another `Get` task to get the most up to date value, and then reference the `Get` task `id` in our log underneath to get the latest value. The same applies to the `Delete` task. In order to show that it has been deleted, we try to get the data from the deleted key in the `delete_data` task. :: ## Processing data For basic data processing, you can leverage Kestra's [Pebble templating engine](../expressions/index.md). -For more complex data transformations, Kestra offers various data processing plugins incl. transform tasks or custom scripts. +For more complex data transformations, Kestra offers various data processing plugins including transform tasks or custom scripts. ### Converting files @@ -198,14 +198,14 @@ tasks: Kestra can launch scripts written in Python, R, Node.js, Shell and Powershell. Depending on the `runner`, they can run directly in a local process on the host or inside Docker containers. -Those script tasks are available in the [Scripts Plugin](https://github.com/kestra-io/plugin-scripts). Here is documentation for each of them: -- The [Python](/plugins/plugin-script-python/tasks/io.kestra.plugin.scripts.python.script) task will run a Python script in a Docker container or in a local process. -- The [Node](/plugins/plugin-script-node/tasks/io.kestra.plugin.scripts.node.script) task will run a Node.js script in a Docker container or in a local process. -- The [R](/plugins/plugin-script-r/tasks/io.kestra.plugin.scripts.r.script) task will run an R script in a Docker container or in a local process. -- The [Shell](/plugins/plugin-script-shell/tasks/io.kestra.plugin.scripts.shell.script) task will execute a single Shell command, or a list of commands that you provide.
-- The [PowerShell](/plugins/plugin-script-powershell/tasks/io.kestra.plugin.scripts.powershell.script) task will execute a single PowerShell command, or a list of commands that you provide. +Those script tasks are available in the [Scripts Plugin](https://github.com/kestra-io/plugin-scripts). Below is documentation for each of them: +- The [Python](/plugins/plugin-script-python/tasks/io.kestra.plugin.scripts.python.script) task runs a Python script in a Docker container or in a local process. +- The [Node](/plugins/plugin-script-node/tasks/io.kestra.plugin.scripts.node.script) task runs a Node.js script in a Docker container or in a local process. +- The [R](/plugins/plugin-script-r/tasks/io.kestra.plugin.scripts.r.script) task runs an R script in a Docker container or in a local process. +- The [Shell](/plugins/plugin-script-shell/tasks/io.kestra.plugin.scripts.shell.script) task executes a single Shell command, or a list of commands that you provide. +- The [PowerShell](/plugins/plugin-script-powershell/tasks/io.kestra.plugin.scripts.powershell.script) task executes a single PowerShell command, or a list of commands that you provide. -The following example will query the BigQuery public dataset with Wikipedia page views to find the top 10 pages, convert it to CSV, and use the CSV file inside a Python task for further transformations using Pandas. +The following example queries the BigQuery public dataset with Wikipedia page views to find the top 10 pages, convert it to CSV, and use the CSV file inside a Python task for further transformations using Pandas. ```yaml id: wikipedia-top-ten-python-panda @@ -246,22 +246,22 @@ tasks: Kestra.outputs({'views': int(views)}) ``` -Kestra offers several plugins for ingesting and transforming data — check [the Plugin list](/plugins) for more details. +Kestra offers several plugins for ingesting and transforming data — check [the Plugin list](/plugins) for more details. Make sure to also check: -1. The [Script documentation](../04.workflow-components/01.tasks/02.scripts/index.md) for a detailed overview of how to work with Python, R, Node.js, Shell and Powershell scripts and how to integrate them with Git and Docker. -2. The [Blueprints](/blueprints) catalog — simply search for the relevant language (e.g. Python, R, Rust) or use case (*ETL, Git, dbt, etc.*) to find the relevant examples. +1. The [Script documentation](../04.workflow-components/01.tasks/02.scripts/index.md) for a detailed overview of how to work with Python, R, Node.js, Shell and Powershell scripts, and how to integrate them with Git and Docker. +2. The [Blueprints](/blueprints) catalog — simply search for the relevant language (e.g., Python, R, Rust) or use case (*ETL, Git, dbt, etc.*) to find the relevant examples. ### Processing data using file transform -Kestra can process data **row by row** using file transform tasks. The transformation will be done with a small script written in Python, JavaScript, or Groovy. +Kestra can process data **row by row** using file transform tasks. The transformation is done with a small script written in Python, JavaScript, or Groovy. - The [Jython FileTransform](/plugins/plugin-script-jython/tasks/io.kestra.plugin.scripts.jython.FileTransform) task allows transforming rows with Python. - The [Nashorn FileTransform](/plugins/plugin-script-nashorn/tasks/io.kestra.plugin.scripts.nashorn.FileTransform) task allows transforming rows with JavaScript. 
- The [Groovy FileTransform](/plugins/plugin-script-groovy/tasks/io.kestra.plugin.scripts.groovy.FileTransform) task allows transforming rows with Groovy. -The following example will query the BigQuery public dataset for Wikipedia pages, convert it row by row with the Nashorn FileTransform, and write it in a CSV file. +The following example queries the BigQuery public dataset for Wikipedia pages, convert it row by row with the Nashorn FileTransform, and write it in a CSV file. ```yaml id: wikipedia-top-ten-file-transform @@ -319,10 +319,10 @@ tasks: type: "io.kestra.plugin.core.storage.PurgeExecution" ``` -The execution context itself will not be available after the end of the execution and will be automatically deleted from Kestra's repository after a retention period (by default, seven days) that can be changed; see [configurations](../configuration/index.md). +The execution context itself is not available after the end of the execution and is automatically deleted from Kestra's repository after a retention period (seven days by default) that can be changed; see [configurations](../configuration/index.md). -Also, the [Purge](/plugins/core/tasks/storages/io.kestra.plugin.core.storage.Purge) task can be used to purge storages, logs, executions of previous execution. For example, this flow will purge all of these every day: +Also, the [Purge](/plugins/core/tasks/storages/io.kestra.plugin.core.storage.Purge) task can be used to purge storages, logs, and executions of previous execution. For example, this flow purges all of these every day: ```yaml id: purge namespace: company.team @@ -377,7 +377,7 @@ tasks: When using the `ForEachItem` task, you can use the `read()` function to read the content of a file as a string. This is especially useful when you want to pass the content of a file as a raw string as an input to a subflow. -Here is a simple subflow example that uses a string input: +Below is a simple subflow example that uses a string input: ```yaml id: subflow_raw_string_input @@ -394,7 +394,7 @@ tasks: format: "{{ inputs.string_input }}" ``` -Because the `ForEachItem` task splits the `items` file into batches of smaller files (by default, one file per row), you can use the `read()` function to read the content of that file for a given batch as a string value and pass it as an input to that subflow shown above. +Because the `ForEachItem` task splits the `items` file into batches of smaller files (one file per row by default), you can use the `read()` function to read the content of that file for a given batch as a string value and pass it as an input to that subflow shown above. ```yaml id: parent_flow @@ -428,7 +428,7 @@ So far, you've seen how to read a file from the internal storage as a string. Ho The `read()` function takes the absolute path to the file you want to read. The path must point to a file stored in the **same namespace** as the flow you are executing. -Here is a simple example showing how you can read a file named `hello.py` stored in the `scripts` directory of the `company.team` namespace: +Below is a simple example showing how you can read a file named `hello.py` stored in the `scripts` directory of the `company.team` namespace: ```yaml id: hello @@ -450,13 +450,13 @@ You can use the Pebble function `{{ fromJson(myvar) }}` and a `{{ myvar | toJson ::collapse{title="The fromJson() function"} -The function is used to convert a string to a JSON object. 
+The function converts a string to a JSON object. For example, the following Pebble expression converts the string `{"foo": [42, 43, 44]}` to a JSON object and then returns the first value of the `foo` key, which is `42`:

```yaml
{{ json('{"foo": [42, 43, 44]}').foo[0] }}
```

-You can use the `read()` function to read the content of a file as a string and then apply the `json()` function to convert it to a JSON object. Afterwards, you can read the value of a specific key in that JSON object. For example, the following Pebble expression will read the content of a file named `my.json` and then return the value of the `foo` key, which is `42`:
+You can use the `read()` function to read the content of a file as a string and then apply the `json()` function to convert it to a JSON object. Afterwards, you can read the value of a specific key in that JSON object. For example, the flow below reads the content of a downloaded JSON file and then returns the value of a specific key:

```yaml
id: extract_json

@@ -479,7 +479,7 @@ tasks:
      message: "{{ json(read(outputs.extract.uri)) | jq('map(.detail | fromjson | .message)') | first }}"
```

-The above flow will download a JSON file via an HTTP Request, read its content as a string, convert it to a JSON object, and then in another task, it will parse the JSON object and return the value of a nested key.
+The above flow downloads a JSON file via an HTTP Request, reads its content as a string, converts it to a JSON object, and then in another task, it parses the JSON object and returns the value of a nested key.
::

::collapse{title="The json filter"}

diff --git a/content/docs/05.concepts/12.caching.md b/content/docs/05.concepts/12.caching.md
index 9c5e705d44..15300c90ff 100644
--- a/content/docs/05.concepts/12.caching.md
+++ b/content/docs/05.concepts/12.caching.md
@@ -41,7 +41,7 @@ Kestra packages the files that need to be cached and stores them in the internal

### Node.js example

-Here's an example of a flow that installs the `colors` package before running a Node.js script. The `node_modules` folder is cached for one hour.
+Below is an example of a flow that installs the `colors` package before running a Node.js script. The `node_modules` folder is cached for one hour.

```yaml
id: node_cached_dependencies

@@ -66,7 +66,7 @@ tasks:

### Python example

-Here's an example of a flow that installs the `pandas` package before running a Python script. The `venv` folder is cached for one day.
+Below is an example of a flow that installs the `pandas` package before running a Python script. The `venv` folder is cached for one day.

```yaml
id: python_cached_dependencies

@@ -96,14 +96,14 @@ tasks:

### How to invalidate the cache

-Here's how that works:
+Below is how cache invalidation works:

- After the first run, the files are cached
- The next time the task is executed:
- - if the `ttl` didn't pass, the files are retrieved from cache
- - If the `ttl` passed, the cache is invalidated and no files will be retrieved from cache; because cache is no longer present, the `npm install` command from the `beforeCommands` property will take a bit longer to execute
+ - If the `ttl` didn't pass, then the files are retrieved from cache.
+ - If the `ttl` passed, then the cache is invalidated and no files are retrieved from the cache; because the cache is no longer present, the `npm install` command from the `beforeCommands` property will take a bit longer to execute.
- If you edit the task and change the `ttl` to:
- - a longer duration e.g. `PT5H` — the files will be cached for five hours using the new `ttl` duration
- - a shorter duration e.g. `PT5M` — the cache will be invalidated after five minutes using the new `ttl` duration.
+ - a longer duration, e.g., `PT5H` — the files will be cached for five hours using the new `ttl` duration (see the sketch below)
+ - a shorter duration, e.g., `PT5M` — the cache will be invalidated after five minutes using the new `ttl` duration.

The `ttl` is evaluated at runtime. If the most recently set `ttl` duration has passed as compared to the last task run execution date, the cache is invalidated and the files are no longer retrieved from cache.
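For illustration, here is a minimal sketch of such an edit. It assumes the cache is configured on a `WorkingDirectory` task, mirroring the Node.js example above; the pattern shown is an assumption based on that example, and only the `ttl` value changes:

```yaml
- id: working_dir
  type: io.kestra.plugin.core.flow.WorkingDirectory
  cache:
    patterns:
      - node_modules/**   # cache the installed dependencies between runs
    ttl: PT5H             # extended from PT1H: cached files now stay valid for five hours
```

On the next execution after such a change, the new duration is the one compared against the last task run date.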
diff --git a/content/docs/05.concepts/system-flows.md b/content/docs/05.concepts/system-flows.md
index 7f474a998b..45ed8c2ea4 100644
--- a/content/docs/05.concepts/system-flows.md
+++ b/content/docs/05.concepts/system-flows.md
@@ -13,7 +13,7 @@ Automate maintenance workflows with System Flows.

---

-System Flows periodically execute background operations that keep your platform running, but which you would generally prefer to keep out of sight. These flows automate maintenance workflows, such as:
+System Flows periodically execute background operations that keep your platform running but which you would generally prefer to keep out of sight. These flows automate maintenance workflows, such as:

1. Sending [alert notifications](https://kestra.io/blueprints/failure-alert-slack)
2. Creating automated support tickets when critical workflows fail

@@ -29,11 +29,11 @@ kestra:
    namespace: system
```

-To access System Flows, navigate to the `Namespaces` section in the UI. The `system` namespace is pinned at the top for quick access.
+To access System Flows, navigate to the **Namespaces** section in the UI. The `system` namespace is pinned at the top for quick access.

![system_namespace](/docs/concepts/system-flows/system_namespace.png)

-Here, you’ll find the _System Blueprints_ tab, which provides fully customizable templates which you can modify to suit your organization’s needs.
+In this section, you’ll find the _System Blueprints_ tab, which provides fully customizable templates that you can modify to suit your organization’s needs.

![system_blueprints](/docs/concepts/system-flows/system_blueprints.png)

@@ -41,7 +41,7 @@ Keep in mind that System Flows are not restricted to System Blueprints — any valid Kestra flow can become a System Flow if it's added to the `system` namespace.
::

-System Flows are intentionally hidden from the main UI, appearing only in the `system` namespace. The Dashboard, Flows, and Executions pages offer a multi-select filter with options for `User` (default) and `System` (visible by default only within the `system` namespace). This makes it easy to toggle between user-facing workflows and background system flows and their executions, or view both simultaneously.
+System Flows are intentionally hidden from the main UI, appearing only in the `system` namespace. The Dashboard, Flows, and Executions pages offer a multi-select filter with options for `User` (default) and `System` (visible by default only within the `system` namespace). This makes it easy to toggle between user-facing workflows and background system flows and their executions or view both simultaneously.

![system_filter](/docs/concepts/system-flows/system_filter.png)
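To make the idea concrete, below is a minimal sketch of a System Flow: it becomes one simply because it lives in the `system` namespace. It reuses the core `Purge` task referenced earlier on this page to clean up old execution data on a daily schedule; the `endDate` expression and the trigger values are illustrative assumptions rather than settings taken from these docs.

```yaml
id: daily_cleanup
namespace: system # placing the flow in this namespace is what makes it a System Flow

tasks:
  - id: purge_old_executions
    type: io.kestra.plugin.core.storage.Purge
    # assumption: remove execution data older than one month
    endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"

triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 9 * * *" # run every day at 9 AM
```

Because it runs under `system`, this flow and its executions appear only when the `System` option of the filter described above is selected.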
diff --git a/content/docs/05.concepts/system-labels.md b/content/docs/05.concepts/system-labels.md
index c274836ca4..2c0f1ad0ad 100644
--- a/content/docs/05.concepts/system-labels.md
+++ b/content/docs/05.concepts/system-labels.md
@@ -38,11 +38,11 @@ System Labels are labels prefixed with `system.` that serve specific purposes. B

### `system.correlationId`

-- Automatically set for every execution and propagated to downstream executions created by `Subflow` or `ForEachItem` tasks.
-- Represents the ID of the first execution in a chain of executions, enabling tracking of execution lineage.
+- Automatically set for every execution and propagated to downstream executions created by `Subflow` or `ForEachItem` tasks
+- Represents the ID of the first execution in a chain of executions, enabling tracking of execution lineage
- Use this label to filter all executions originating from a specific parent execution.

-For example, if a parent flow triggers multiple subflows, filtering by the parent's `system.correlationId` will display all related executions.
+For example, if a parent flow triggers multiple subflows, filtering by the parent's `system.correlationId` displays all related executions.

**Note:** The Execution API supports setting this label at execution creation but not modification.

@@ -50,15 +50,15 @@ For example, if a parent flow triggers multiple subfl

### `system.username`

-- Automatically set for every execution and contains the username of the user who triggered the execution.
-- Useful for auditing and identifying who initiated specific executions.
+- Automatically set for every execution and contains the username of the user who triggered the execution
+- Useful for auditing and identifying who initiated specific executions

---

### `system.readOnly`

-- Used to mark a flow as read-only, disabling the flow editor in the UI.
-- Helps prevent modifications to critical workflows, such as production flows managed through CI/CD pipelines.
+- Used to mark a flow as read-only, disabling the flow editor in the UI
+- Helps prevent modifications to critical workflows, such as production flows managed through CI/CD pipelines

**Example:**