regex_parser operator

The regex_parser operator parses the string-type field selected by parse_from with the given regular expression pattern.

Configuration Fields

Field Default Description
id regex_parser A unique identifier for the operator
output Next in pipeline The connected operator(s) that will receive all outbound entries
regex required A Go regular expression. The named capture groups will be extracted as fields in the parsed object
parse_from $ A field that indicates the field to be parsed
parse_to $ A field that indicates where the parsed object will be written
preserve_to Preserves the unparsed value at the specified field
on_error send The behavior of the operator if it encounters an error. See on_error
if An expression that, when set, will be evaluated to determine whether this operator should be used for the given entry. This allows you to do easy conditional parsing without branching logic with routers.
timestamp nil An optional timestamp block which will parse a timestamp field before passing the entry to the output operator
severity nil An optional severity block which will parse a severity field before passing the entry to the output operator

Example Configurations

Parse the field message with a regular expression

Configuration:

- type: regex_parser
  parse_from: message
  regex: '^Host=(?P<host>[^,]+), Type=(?P<type>.*)$'

Input record:

{
  "timestamp": "",
  "record": {
    "message": "Host=127.0.0.1, Type=HTTP"
  }
}

Output record:

{
  "timestamp": "",
  "record": {
    "host": "127.0.0.1",
    "type": "HTTP"
  }
}

Parse a nested field to a different field, preserving original

Configuration:

- type: regex_parser
  parse_from: message.embedded
  parse_to: parsed
  regex: '^Host=(?P<host>[^,]+), Type=(?P<type>.*)$'
  preserve_to: message.embedded

Input record:

{
  "timestamp": "",
  "record": {
    "message": {
      "embedded": "Host=127.0.0.1, Type=HTTP"
    }
  }
}

Output record:

{
  "timestamp": "",
  "record": {
    "message": {
      "embedded": "Host=127.0.0.1, Type=HTTP"
    },
    "parsed": {
      "host": "127.0.0.1",
      "type": "HTTP"
    }
  }
}

Parse the field message with a regular expression and also parse the timestamp

Configuration:

- type: regex_parser
  parse_from: message
  regex: '^Time=(?P<timestamp_field>\d{4}-\d{2}-\d{2}), Host=(?P<host>[^,]+), Type=(?P<type>.*)$'
  timestamp:
    parse_from: timestamp_field
    layout_type: strptime
    layout: '%Y-%m-%d'

Input record:

{
  "timestamp": "",
  "record": {
    "message": "Time=2020-01-31, Host=127.0.0.1, Type=HTTP"
  }
}

Output record:

{
  "timestamp": "2020-01-31T00:00:00-00:00",
  "record": {
    "host": "127.0.0.1",
    "type": "HTTP"
  }
}

Parse the message field only if "type" is "hostname"

Configuration:

- type: regex_parser
  parse_from: message
  regex: '^Host=(?P<host>.*)$'
  if: '$record.type == "hostname"'
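
Input record (type is "hostname", so the entry is parsed):

{
  "record": {
    "message": "Host=testhost",
    "type": "hostname"
  }
}

Output record:

{
  "record": {
    "host": "testhost",
    "type": "hostname"
  }
}

Input record (type is not "hostname", so the entry passes through unchanged):

{
  "record": {
    "message": "Key=value",
    "type": "keypair"
  }
}

Output record:

{
  "record": {
    "message": "Key=value",
    "type": "keypair"
  }
}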
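
The conditional behavior in this example can be approximated in plain Go. This is a simplified sketch under stated assumptions: parseIfHostname and hostPattern are hypothetical names, the pattern is assumed to be `^Host=(?P<host>.*)$`, the `if` expression is replaced by an ordinary Go comparison, and a non-matching message is simply left untouched instead of going through on_error handling.

```go
package main

import (
	"fmt"
	"regexp"
)

// hostPattern is the assumed pattern for this sketch.
var hostPattern = regexp.MustCompile(`^Host=(?P<host>.*)$`)

// parseIfHostname is a hypothetical illustration of conditional parsing.
// Records whose "type" is not "hostname" are returned unchanged.
func parseIfHostname(record map[string]interface{}) map[string]interface{} {
	if record["type"] != "hostname" {
		return record // condition not met: skip the parser
	}
	message, ok := record["message"].(string)
	if !ok {
		return record // nothing to parse
	}
	match := hostPattern.FindStringSubmatch(message)
	if match == nil {
		return record // simplification: no error handling in this sketch
	}
	// Replace the parsed field with the extracted value.
	delete(record, "message")
	record["host"] = match[hostPattern.SubexpIndex("host")]
	return record
}

func main() {
	fmt.Println(parseIfHostname(map[string]interface{}{
		"message": "Host=testhost", "type": "hostname",
	}))
	fmt.Println(parseIfHostname(map[string]interface{}{
		"message": "Key=value", "type": "keypair",
	}))
}
```

The first record ends up with a host field in place of message, while the second is printed exactly as it was passed in.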
