[pkg/stanza] Add container operator parser #32594

Merged
merged 1 commit into from
May 14, 2024
27 changes: 27 additions & 0 deletions .chloggen/add_container_parser.yaml
@@ -0,0 +1,27 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: filelogreceiver

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Add container operator parser

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [31959]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
1 change: 1 addition & 0 deletions pkg/stanza/adapter/register.go
@@ -6,6 +6,7 @@ package adapter // import "github.com/open-telemetry/opentelemetry-collector-con
import (
_ "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/output/file" // Register parsers and transformers for stanza-based log receivers
_ "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/output/stdout"
_ "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/parser/container"
_ "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/parser/csv"
_ "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/parser/json"
_ "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/parser/jsonarray"
238 changes: 238 additions & 0 deletions pkg/stanza/docs/operators/container.md
@@ -0,0 +1,238 @@
## `container` operator

The `container` operator parses logs in `docker`, `cri-o` and `containerd` formats.

### Configuration Fields

| Field | Default | Description |
|------------------------------|------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `id` | `container` | A unique identifier for the operator. |
| `format` | `` | The container log format to use if it is known. Users can choose between `docker`, `crio` and `containerd`. If not set, the format will be automatically detected. |
| `add_metadata_from_filepath` | `true` | Whether to add k8s metadata extracted from the file path. Requires the `log.file.path` field to be present. |
| `output` | Next in pipeline | The connected operator(s) that will receive all outbound entries. |
| `parse_from` | `body` | The [field](../types/field.md) from which the value will be parsed. |
| `parse_to` | `attributes` | The [field](../types/field.md) to which the value will be parsed. |
| `on_error` | `send` | The behavior of the operator if it encounters an error. See [on_error](../types/on_error.md). |
| `if` | | An [expression](../types/expression.md) that, when set, will be evaluated to determine whether this operator should be used for the given entry. This allows you to do easy conditional parsing without branching logic with routers. |
| `severity` | `nil` | An optional [severity](../types/severity.md) block which will parse a severity field before passing the entry to the output operator. |


### Embedded Operations

The `container` parser can be configured to embed certain operations, such as severity parsing. For more information, see [complex parsers](../types/parsers.md#complex-parsers).

### Add metadata from file path

Metadata extraction reads the `log.file.path` field, so the receiver must be configured with `include_file_path: true`.
If that is not possible, disable metadata extraction with `add_metadata_from_filepath: false`.
For example, the file path `"/var/log/pods/some-ns_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log"`
produces the following k8s metadata:

```json
{
  "attributes": {
    "k8s": {
      "container": {
        "name": "kube-controller",
        "restart_count": "1"
      },
      "pod": {
        "uid": "49cc7c1fd3702c40b2686ea7486091d6",
        "name": "kube-controller-kind-control-plane"
      },
      "namespace": {
        "name": "some-ns"
      }
    }
  }
}
```
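
The mapping from path segments to metadata can be sketched with a regular expression that uses named capture groups. This is a minimal illustration, not the operator's implementation: the pattern `podPathPattern` and the helper `parsePodPath` are hypothetical names, and the pattern assumes the layout `/var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container-name>/<restart-count>.log`.

```go
package main

import (
	"fmt"
	"regexp"
)

// podPathPattern is an assumed pattern for illustration only. It expects
// .../<namespace>_<pod-name>_<pod-uid>/<container-name>/<restart-count>.log
var podPathPattern = regexp.MustCompile(
	`^.*/(?P<namespace>[^_/]+)_(?P<pod_name>[^_/]+)_(?P<uid>[a-f0-9-]+)/(?P<container_name>[^/]+)/(?P<restart_count>\d+)\.log$`,
)

// parsePodPath returns the named capture groups of podPathPattern found in path.
func parsePodPath(path string) (map[string]string, error) {
	matches := podPathPattern.FindStringSubmatch(path)
	if matches == nil {
		return nil, fmt.Errorf("path %q does not look like a pod log path", path)
	}
	fields := make(map[string]string)
	for i, name := range podPathPattern.SubexpNames() {
		// Index 0 is the whole match; unnamed groups have an empty name.
		if i == 0 || name == "" {
			continue
		}
		fields[name] = matches[i]
	}
	return fields, nil
}

func main() {
	path := "/var/log/pods/some-ns_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log"
	fields, err := parsePodPath(path)
	if err != nil {
		panic(err)
	}
	fmt.Println(fields["namespace"], fields["pod_name"], fields["uid"],
		fields["container_name"], fields["restart_count"])
}
```

Splitting on `_` works here because Kubernetes namespace and pod names cannot contain underscores; the restart count stays a string, as in the metadata shown above.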

### Example Configurations

#### Parse the body as a docker container log

Configuration:
```yaml
- type: container
format: docker
add_metadata_from_filepath: true
```

Note: in this example `format: docker` is optional, since the format is detected automatically when it is not set.
`add_metadata_from_filepath` is also `true` by default.

<table>
<tr><td> Input body </td> <td> Output body</td></tr>
<tr>
<td>

```json
{
"timestamp": "",
"body": "{\"log\":\"INFO: log line here\",\"stream\":\"stdout\",\"time\":\"2024-03-30T08:31:20.545192187Z\"}",
"log.file.path": "/var/log/pods/some_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log"
}
```

</td>
<td>

```json
{
"timestamp": "2024-03-30 08:31:20.545192187 +0000 UTC",
"body": "INFO: log line here",
"attributes": {
"time": "2024-03-30T08:31:20.545192187Z",
"log.iostream": "stdout",
"k8s.pod.name": "kube-controller-kind-control-plane",
"k8s.pod.uid": "49cc7c1fd3702c40b2686ea7486091d6",
"k8s.container.name": "kube-controller",
"k8s.container.restart_count": "1",
"k8s.namespace.name": "some",
"log.file.path": "/var/log/pods/some_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log"
}
}
```

</td>
</tr>
</table>
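
A docker-format record is a JSON object with `log`, `stream` and `time` keys, so its decoding can be sketched with the standard library. This is an illustration under assumptions, not the operator's code: `dockerLine` and `parseDockerLine` are hypothetical names, and the sketch assumes the `time` value is RFC 3339 with nanoseconds.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// dockerLine mirrors the three keys of a docker JSON log record.
type dockerLine struct {
	Log    string `json:"log"`
	Stream string `json:"stream"`
	Time   string `json:"time"`
}

// parseDockerLine decodes one record and parses its time field.
func parseDockerLine(raw string) (dockerLine, time.Time, error) {
	var line dockerLine
	if err := json.Unmarshal([]byte(raw), &line); err != nil {
		return dockerLine{}, time.Time{}, fmt.Errorf("not a docker JSON log line: %w", err)
	}
	// Assumption: the time field is RFC 3339 with optional fractional seconds.
	ts, err := time.Parse(time.RFC3339Nano, line.Time)
	if err != nil {
		return dockerLine{}, time.Time{}, fmt.Errorf("bad time field: %w", err)
	}
	return line, ts, nil
}

func main() {
	raw := `{"log":"INFO: log line here","stream":"stdout","time":"2024-03-30T08:31:20.545192187Z"}`
	line, ts, err := parseDockerLine(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(ts.UTC(), line.Stream, line.Log)
}
```

In terms of the entry shown in this example, `line.Log` corresponds to the output `body`, `line.Stream` to `log.iostream`, and `ts` to the entry timestamp.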

#### Parse the body as a cri-o container log

Configuration:
```yaml
- type: container
```

<table>
<tr><td> Input body </td> <td> Output body</td></tr>
<tr>
<td>

```json
{
"timestamp": "",
"body": "2024-04-13T07:59:37.505201169-05:00 stdout F standalone crio line which is awesome",
"log.file.path": "/var/log/pods/some_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log"
}
```

</td>
<td>

```json
{
"timestamp": "2024-04-13 12:59:37.505201169 +0000 UTC",
"body": "standalone crio line which is awesome",
"attributes": {
"time": "2024-04-13T07:59:37.505201169-05:00",
"logtag": "F",
"log.iostream": "stdout",
"k8s.pod.name": "kube-controller-kind-control-plane",
"k8s.pod.uid": "49cc7c1fd3702c40b2686ea7486091d6",
"k8s.container.name": "kube-controller",
"k8s.container.restart_count": "1",
"k8s.namespace.name": "some",
"log.file.path": "/var/log/pods/some_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log"
}
}
```

</td>
</tr>
</table>
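
The cri-o line layout, `<time> <stream> <logtag> <message>`, can be sketched as a regular expression with named groups. The pattern `criPattern` and the helper `parseCRILine` are hypothetical and only illustrate the idea; the operator's actual pattern may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// criPattern is an assumed pattern for "<time> <stream> <logtag> <message>".
var criPattern = regexp.MustCompile(
	`^(?P<time>\S+) (?P<stream>stdout|stderr) (?P<logtag>\S+) ?(?P<log>.*)$`,
)

// criLine holds the parsed parts of one CRI-style log line.
type criLine struct {
	Time   time.Time
	Stream string
	LogTag string
	Log    string
}

// parseCRILine splits a raw line into its parts and parses the timestamp.
func parseCRILine(raw string) (criLine, error) {
	m := criPattern.FindStringSubmatch(raw)
	if m == nil {
		return criLine{}, fmt.Errorf("line does not match the CRI log layout")
	}
	// Assumption: the timestamp is RFC 3339 with a numeric offset or "Z".
	ts, err := time.Parse(time.RFC3339Nano, m[criPattern.SubexpIndex("time")])
	if err != nil {
		return criLine{}, fmt.Errorf("bad timestamp: %w", err)
	}
	return criLine{
		Time:   ts,
		Stream: m[criPattern.SubexpIndex("stream")],
		LogTag: m[criPattern.SubexpIndex("logtag")],
		Log:    m[criPattern.SubexpIndex("log")],
	}, nil
}

func main() {
	line, err := parseCRILine("2024-04-13T07:59:37.505201169-05:00 stdout F standalone crio line which is awesome")
	if err != nil {
		panic(err)
	}
	fmt.Println(line.Time.UTC(), line.Stream, line.LogTag, line.Log)
}
```

Converting the parsed time to UTC is what turns the `-05:00` offset of the input into the `12:59:37` timestamp of the output entry.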

#### Parse the body as a containerd container log

Configuration:
```yaml
- type: container
```

<table>
<tr><td> Input body </td> <td> Output body</td></tr>
<tr>
<td>

```json
{
"timestamp": "",
"body": "2023-06-22T10:27:25.813799277Z stdout F standalone containerd line that is super awesome",
"log.file.path": "/var/log/pods/some_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log"
}
```

</td>
<td>

```json
{
"timestamp": "2023-06-22 10:27:25.813799277 +0000 UTC",
"body": "standalone containerd line that is super awesome",
"attributes": {
"time": "2023-06-22T10:27:25.813799277Z",
"logtag": "F",
"log.iostream": "stdout",
"k8s.pod.name": "kube-controller-kind-control-plane",
"k8s.pod.uid": "49cc7c1fd3702c40b2686ea7486091d6",
"k8s.container.name": "kube-controller",
"k8s.container.restart_count": "1",
"k8s.namespace.name": "some",
"log.file.path": "/var/log/pods/some_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log"
}
}
```

</td>
</tr>
</table>
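
When `format` is not set, the format is detected from the line itself. One way such detection could work is sketched below. The heuristics are assumptions drawn only from the examples in this document (docker lines are JSON objects, containerd timestamps end in `Z`, cri-o timestamps carry a numeric offset); `detectFormat` is a hypothetical helper and not the operator's detection logic.

```go
package main

import (
	"fmt"
	"strings"
)

// detectFormat guesses the container log format of a single raw line.
// The rules are illustrative assumptions, not the operator's implementation.
func detectFormat(line string) (string, error) {
	// Docker records are JSON objects.
	if strings.HasPrefix(line, "{") {
		return "docker", nil
	}
	// CRI-style lines are "<time> <stream> <logtag> <message>".
	parts := strings.SplitN(line, " ", 4)
	if len(parts) < 3 || (parts[1] != "stdout" && parts[1] != "stderr") {
		return "", fmt.Errorf("unrecognized container log format")
	}
	// Assumption: a UTC "Z" suffix indicates containerd, an offset indicates cri-o.
	if strings.HasSuffix(parts[0], "Z") {
		return "containerd", nil
	}
	return "crio", nil
}

func main() {
	for _, line := range []string{
		`{"log":"INFO: log line here","stream":"stdout","time":"2024-03-30T08:31:20.545192187Z"}`,
		"2024-04-13T07:59:37.505201169-05:00 stdout F standalone crio line which is awesome",
		"2023-06-22T10:27:25.813799277Z stdout F standalone containerd line that is super awesome",
	} {
		format, err := detectFormat(line)
		if err != nil {
			panic(err)
		}
		fmt.Println(format)
	}
}
```

Setting `format` explicitly avoids relying on any such heuristic when the runtime is known in advance.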

#### Parse a multiline containerd container log and recombine it into a single entry

Configuration:
```yaml
- type: container
```

<table>
<tr><td> Input body </td> <td> Output body</td></tr>
<tr>
<td>

```json
{
"timestamp": "",
"body": "2023-06-22T10:27:25.813799277Z stdout P multiline containerd line that i",
"log.file.path": "/var/log/pods/some_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log"
},
{
"timestamp": "",
"body": "2023-06-22T10:27:25.813799277Z stdout F s super awesome",
"log.file.path": "/var/log/pods/some_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log"
}
```

</td>
<td>

```json
{
"timestamp": "2023-06-22 10:27:25.813799277 +0000 UTC",
"body": "multiline containerd line that is super awesome",
"attributes": {
"time": "2023-06-22T10:27:25.813799277Z",
"logtag": "F",
"log.iostream": "stdout",
"k8s.pod.name": "kube-controller-kind-control-plane",
"k8s.pod.uid": "49cc7c1fd3702c40b2686ea7486091d6",
"k8s.container.name": "kube-controller",
"k8s.container.restart_count": "1",
"k8s.namespace.name": "some",
"log.file.path": "/var/log/pods/some_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log"
}
}
```

</td>
</tr>
</table>
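
The recombination in this example follows the `logtag` value: `P` marks a partial line and `F` marks the line that completes an entry. A minimal sketch of that idea is shown below. The type `criEntry` and the function `recombine` are hypothetical, and the sketch assumes the parts are concatenated without a separator, as the example output suggests.

```go
package main

import (
	"fmt"
	"strings"
)

// criEntry is one already-parsed CRI-style line.
type criEntry struct {
	LogTag string // "P" for a partial line, "F" for the final line of an entry
	Log    string
}

// recombine joins consecutive partial lines with the final line that ends them.
// Partial lines that are never followed by an "F" line stay buffered and are
// not returned.
func recombine(entries []criEntry) []string {
	var out []string
	var buf strings.Builder
	for _, e := range entries {
		buf.WriteString(e.Log)
		if e.LogTag == "F" {
			out = append(out, buf.String())
			buf.Reset()
		}
	}
	return out
}

func main() {
	combined := recombine([]criEntry{
		{LogTag: "P", Log: "multiline containerd line that i"},
		{LogTag: "F", Log: "s super awesome"},
	})
	fmt.Println(combined)
}
```

A single buffer is enough for this sketch because it handles one stream of lines; entries from different files or streams would need separate buffers.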
28 changes: 28 additions & 0 deletions pkg/stanza/operator/helper/regexp.go
@@ -0,0 +1,28 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package helper // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/helper"

import (
	"fmt"
	"regexp"
)

// MatchValues matches value against regexp and returns the named capture
// groups as a map from group name to matched text. Unnamed groups are
// skipped. It returns an error if the pattern does not match.
func MatchValues(value string, regexp *regexp.Regexp) (map[string]any, error) {
	matches := regexp.FindStringSubmatch(value)
	if matches == nil {
		return nil, fmt.Errorf("regex pattern does not match")
	}

	parsedValues := map[string]any{}
	for i, subexp := range regexp.SubexpNames() {
		if i == 0 {
			// Skip whole match
			continue
		}
		if subexp != "" {
			parsedValues[subexp] = matches[i]
		}
	}
	return parsedValues, nil
}