## `csv_parser` operator

The `csv_parser` operator parses the string-type field selected by `parse_from` as delimited values and assigns each value to the field name at the same position in the header.

### Configuration Fields

| Field | Default | Description |
| --- | --- | --- |
| `id` | `csv_parser` | A unique identifier for the operator |
| `output` | Next in pipeline | The connected operator(s) that will receive all outbound entries |
| `header` | required when `header_label` not set | A string of delimited field names |
| `header_label` | required when `header` not set | A label name to read the header field from, to support dynamic field names |
| `header_delimiter` | value of `delimiter` | A character that will be used as a delimiter for the header. Values `\r` and `\n` cannot be used as a delimiter |
| `delimiter` | `,` | A character that will be used as a delimiter. Values `\r` and `\n` cannot be used as a delimiter |
| `lazy_quotes` | `false` | If true, a quote may appear in an unquoted field and a non-doubled quote may appear in a quoted field |
| `parse_from` | `$` | A field that indicates the field to be parsed |
| `parse_to` | `$` | A field that indicates the field to be parsed into |
| `preserve_to` | | Preserves the unparsed value at the specified field |
| `on_error` | `send` | The behavior of the operator if it encounters an error. See on_error |
| `timestamp` | `nil` | An optional timestamp block which will parse a timestamp field before passing the entry to the output operator |
| `severity` | `nil` | An optional severity block which will parse a severity field before passing the entry to the output operator |
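
As a rough sketch of how the optional fields can be combined, the fragment below parses `message` into a nested field, keeps the raw line, and tolerates stray quotes. The field names `parsed` and `original` are hypothetical and chosen only for illustration; the resulting record is not shown because its exact layout may depend on the operator version.

```yaml
- type: csv_parser
  parse_from: message
  # Hypothetical destination field for the parsed key/value pairs
  parse_to: parsed
  # Hypothetical field that keeps the unparsed CSV line
  preserve_to: original
  header: 'id,severity,message'
  # Permit a bare quote inside an unquoted field, e.g.: 1,debug,say "hi"
  lazy_quotes: true
```

With `lazy_quotes` left at its default of `false`, a line such as `1,debug,say "hi"` would be expected to fail parsing, and the entry would then be handled according to `on_error`.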

### Example Configurations

#### Parse the field `message` with a csv parser

Configuration:

```yaml
- type: csv_parser
  parse_from: message
  header: 'id,severity,message'
```

Input record:

```json
{
  "timestamp": "",
  "record": {
    "message": "1,debug,\"\"Debug Message\"\""
  }
}
```

Output record:

```json
{
  "timestamp": "",
  "record": {
    "id": "1",
    "severity": "debug",
    "message": "\"Debug Message\""
  }
}
```

#### Parse the field `message` with a csv parser using tab delimiter

Configuration:

```yaml
- type: csv_parser
  parse_from: message
  header: 'id,severity,message'
  header_delimiter: ","
  delimiter: "\t"
```

Input record:

```json
{
  "timestamp": "",
  "record": {
    "message": "1\tdebug\t\"Debug Message\""
  }
}
```

Output record:

```json
{
  "timestamp": "",
  "record": {
    "id": "1",
    "severity": "debug",
    "message": "\"Debug Message\""
  }
}
```

#### Parse the field `message` with a csv parser and also parse the timestamp

Configuration:

```yaml
- type: csv_parser
  parse_from: message
  header: 'timestamp_field,severity,message'
  timestamp:
    parse_from: timestamp_field
    layout_type: strptime
    layout: '%Y-%m-%d'
```

Input record:

```json
{
  "timestamp": "",
  "record": {
    "message": "2021-03-17,debug,Debug Message"
  }
}
```

Output record:

```json
{
  "timestamp": "2021-03-17T00:00:00-00:00",
  "record": {
    "severity": "debug",
    "message": "Debug Message"
  }
}
```
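
The `severity` block from the configuration table can be attached in the same way as `timestamp`. The fragment below is a minimal sketch that assumes the severity block accepts a `parse_from` key like the timestamp block does; any further severity options are not covered here.

```yaml
- type: csv_parser
  parse_from: message
  header: 'id,severity,message'
  severity:
    # Assumption: the severity block reads its value from the field
    # named by parse_from, mirroring the timestamp block above
    parse_from: severity
```

For an input such as `1,debug,Debug Message`, the intent is that the parsed `severity` value is used to set the entry's severity before the entry is passed to the output operator.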

#### Parse the field `message` with differing delimiters for header and fields

Configuration:

```yaml
- type: csv_parser
  parse_from: message
  delimiter: "+"
  header_delimiter: ","
  header: 'id,severity,message'
```

Input record:

```json
{
  "timestamp": "",
  "record": {
    "message": "1+debug+\"\"Debug Message\"\""
  }
}
```

Output record:

```json
{
  "timestamp": "",
  "record": {
    "id": "1",
    "severity": "debug",
    "message": "\"Debug Message\""
  }
}
```

#### Parse the field `message` using dynamic field names

Dynamic field names are available when using `file_input`'s `label_regex`, which stores the header line as a label that `header_label` can then reference.

Configuration:

```yaml
- type: file_input
  include:
  - ./dynamic.log
  start_at: beginning
  label_regex: '^#(?P<key>.*?): (?P<value>.*)'

- type: csv_parser
  parse_from: message
  delimiter: ","
  header_label: Fields
```

Input file:

```
#Fields: "id,severity,message"
1,debug,Hello
```

Input record (entry from `file_input`):

```json
{
  "timestamp": "",
  "labels": {
    "fields": "id,severity,message"
  },
  "record": {
    "message": "1,debug,Hello"
  }
}
```

Output record:

```json
{
  "timestamp": "",
  "labels": {
    "fields": "id,severity,message"
  },
  "record": {
    "id": "1",
    "severity": "debug",
    "message": "Hello"
  }
}
```