Add foreach processor documentation #5981

Merged: 12 commits, May 28, 2024
Add pipeline examples
Signed-off-by: Melissa Vagi <vagimeli@amazon.com>
vagimeli committed May 22, 2024
commit 55fe58f1e093614ad0fd555193479bd64bcf0581
123 changes: 85 additions & 38 deletions _ingest-pipelines/processors/foreach.md
@@ -5,7 +5,7 @@
nav_order: 110
---

# Foreach processor

Check failure on line 8 in _ingest-pipelines/processors/foreach.md

GitHub Actions / vale

[vale] _ingest-pipelines/processors/foreach.md#L8

[OpenSearch.Spelling] Error: Foreach. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.

The `foreach` processor is used to iterate over a list of values in an input document and perform some operation on each value. This can be useful for tasks like extracting information from a nested JSON structure or applying transformations to a collection of fields.

@@ -49,42 +49,22 @@
The following query creates a pipeline named `test-foreach` that uses the `foreach` processor to extract information from a nested JSON structure:

```json
- PUT _ingest/pipeline/example-foreach
+ PUT _ingest/pipeline/test-foreach
  {
- "version": 2,
- "example-foreach": {
- "source": {
- "http": {
- "path": "/data"
- }
- },
- "processors": [
- {
- "foreach": {
- "field": "data.orders",
- "processor": {
- "rename": {
- "field": "_ingest._value.order_id",
- "target_field": "order_id"
- }
- },
- "on_failure": [
- {
- "set": {
- "field": "_index",
- "value": "failed-orders"
- }
- }
- ]
+ "description": "Extracts nested JSON data",
+ "processors": [
+ {
+ "foreach": {
+ "field": "users",
+ "processor": {
+ "json": {
+ "field": "_ingest._value",
+ "target_field": "user_data"
+ }
+ }
  }
- ],
- "sink": {
- "opensearch": {
- "index": "orders"
- }
- }
  }
  ]
  }
```
{% include copy-curl.html %}
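
Reviewer sketch (not part of the commit): the new `test-foreach` pipeline applies a `json` processor to every element of `users`, always writing to the same `target_field`. The Python below is a simplified model of that behavior, assuming each iteration overwrites `user_data`. The function name `foreach_json` is hypothetical, and this is not how OpenSearch implements the processor.

```python
import json


def foreach_json(doc, field, target_field):
    """Model a foreach processor wrapping a json processor (illustrative only)."""
    out = dict(doc)  # shallow copy so the caller's document is not modified
    for value in out[field]:
        # In the pipeline, the current element is exposed as `_ingest._value`.
        # The json processor parses it and writes to the same target field,
        # so each iteration replaces the result of the previous one.
        out[target_field] = json.loads(value)
    return out


doc = {
    "users": [
        '{"name":"John Doe","age":32}',
        '{"name":"Jane Smith","age":28}',
    ]
}

result = foreach_json(doc, "users", "user_data")
print(result["user_data"])  # {'name': 'Jane Smith', 'age': 28}
```

Under this model, only the last array element survives in `user_data`, and the original `users` array is left unchanged.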
@@ -106,33 +86,100 @@
The following example response confirms that the pipeline is working as expected:

```json
- <insert response example>
+ {
+ "docs": [
+ {
+ "doc": {
+ "_index": "_index",
+ "_id": "_id",
+ "_source": {
+ "user_data": {
+ "name": "Jane Smith",
+ "age": 28
+ },
+ "users": [
+ """{"name":"John Doe","age":32}""",
+ """{"name":"Jane Smith","age":28}"""
+ ]
+ },
+ "_ingest": {
+ "_value": null,
+ "timestamp": "2024-05-22T18:27:27.299741001Z"
+ }
+ }
+ }
+ ]
+ }
```
{% include copy-curl.html %}

### Step 3: Ingest a document

The following query ingests a document into an index named `testindex1`:

```json
- <insert code example>
+ PUT testindex1/_doc/1?pipeline=test-foreach
+ {
+ "users": [
+ "{\"name\":\"John Doe\",\"age\":32}",
+ "{\"name\":\"Jane Smith\",\"age\":28}"
+ ]
+ }
```
{% include copy-curl.html %}

#### Response

- The request indexes the document into the index <index name> and will index all documents with <what does this response tell the user?>.
+ The request indexes the document into the index `testindex1` and indexes all documents with the extracted JSON data from the `users` field:

```json
- <insert code example>
+ {
+ "_index": "testindex1",
+ "_id": "1",
+ "_version": 2,
+ "result": "updated",
+ "_shards": {
+ "total": 2,
+ "successful": 1,
+ "failed": 0
+ },
+ "_seq_no": 1,
+ "_primary_term": 1
+ }
```
{% include copy-curl.html %}

### Step 4 (Optional): Retrieve the document

To retrieve the document, run the following query:

```json
- <insert code example>
+ GET testindex1/_doc/1
```
{% include copy-curl.html %}

- <Provide any other information and code examples relevant to the user or use cases.>
+ #### Response
+
+ The response shows the document with the extracted JSON data from the `users` field:
+
+ ```json
+ {
+ "_index": "testindex1",
+ "_id": "1",
+ "_version": 2,
+ "_seq_no": 1,
+ "_primary_term": 1,
+ "found": true,
+ "_source": {
+ "user_data": {
+ "name": "Jane Smith",
+ "age": 28
+ },
+ "users": [
+ """{"name":"John Doe","age":32}""",
+ """{"name":"Jane Smith","age":28}"""
+ ]
+ }
+ }
+ ```
+ {% include copy-curl.html %}
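
Reviewer sketch (not part of the commit): the response examples show the `users` strings wrapped in triple quotes. This looks like a console rendering of strings that contain quotation marks (an assumption about where the examples were copied from), not standard JSON. The Python below shows how a standard JSON encoder writes the same value and confirms that it round-trips to the same data.

```python
import json

# One element of the `users` array, as a plain string value.
user = '{"name":"John Doe","age":32}'

# A standard JSON encoder escapes the inner quotation marks with backslashes,
# matching the form used in the ingest request.
encoded = json.dumps({"users": [user]})
print(encoded)

# Decoding returns the original string, which itself parses as JSON.
decoded = json.loads(encoded)
parsed_user = json.loads(decoded["users"][0])
print(parsed_user["name"])  # John Doe
```

If the triple-quoted form is indeed a display convention, the stored value is the same backslash-escaped string that the ingest request sends.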