regex_parser operator

The regex_parser operator parses the string-type field selected by parse_from with the given regular expression pattern.

Configuration Fields

| Field | Default | Description |
| --- | --- | --- |
| `id` | `regex_parser` | A unique identifier for the operator |
| `output` | Next in pipeline | The connected operator(s) that will receive all outbound entries |
| `regexp` | required | A Go regular expression. The named capture groups will be extracted as fields in the parsed object |
| `parse_from` | `$` | The field that will be parsed |
| `parse_to` | `$` | The field that the parsed values will be written to |
| `preserve` | `false` | Preserve the unparsed value on the record |
| `on_error` | `send` | The behavior of the operator if it encounters an error. See on_error |
| `timestamp` | `nil` | An optional timestamp block which will parse a timestamp field before passing the entry to the output operator |
| `severity` | `nil` | An optional severity block which will parse a severity field before passing the entry to the output operator |

Example Configurations

Parse the field message with a regular expression

Configuration:

- type: regex_parser
  parse_from: message
  regexp: '^Host=(?P<host>[^,]+), Type=(?P<type>.*)$'

Input record:

{
  "timestamp": "",
  "record": {
    "message": "Host=127.0.0.1, Type=HTTP"
  }
}

Output record:

{
  "timestamp": "",
  "record": {
    "host": "127.0.0.1",
    "type": "HTTP"
  }
}

Parse a nested field to a different field, preserving original

Configuration:

- type: regex_parser
  parse_from: message.embedded
  parse_to: parsed
  regexp: '^Host=(?P<host>[^,]+), Type=(?P<type>.*)$'
  preserve: true

Input record:

{
  "timestamp": "",
  "record": {
    "message": {
      "embedded": "Host=127.0.0.1, Type=HTTP"
    }
  }
}

Output record:

{
  "timestamp": "",
  "record": {
    "message": {
      "embedded": "Host=127.0.0.1, Type=HTTP"
    },
    "parsed": {
      "host": "127.0.0.1",
      "type": "HTTP"
    }
  }
}

Parse the field message with a regular expression and also parse the timestamp

Configuration:

- type: regex_parser
  parse_from: message
  regexp: '^Time=(?P<timestamp_field>\d{4}-\d{2}-\d{2}), Host=(?P<host>[^,]+), Type=(?P<type>.*)$'
  timestamp:
    parse_from: timestamp_field
    layout_type: strptime
    layout: '%Y-%m-%d'

Input record:

{
  "timestamp": "",
  "record": {
    "message": "Time=2020-01-31, Host=127.0.0.1, Type=HTTP"
  }
}

Output record:

{
  "timestamp": "2020-01-31T00:00:00-00:00",
  "record": {
    "host": "127.0.0.1",
    "type": "HTTP"
  }
}