Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add collection data stream and ilm policy #118

Merged
merged 7 commits into from
Dec 14, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 28 additions & 0 deletions custom_subsets/elastic_endpoint/collection/collection.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
---
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to double check that this is the intent but this mapping is pretty reduced. The only fields that will be searchable within ES are @timestamp, data_stream, ecs.version, and the event.* fields. Is that what we want?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonathan-buttner this is the desired functionality.

This will be read by a telemetry server and it will search by event.ingested. We don't need to index any other fields as we'll only need to search for the latest docs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!

name: collection
fields:
base:
fields:
"@timestamp": {}
data_stream:
fields: "*"
ecs:
fields:
version: {}
event:
fields:
action: {}
category: {}
created: {}
code: {}
dataset: {}
hash: {}
id: {}
ingested: {}
kind: {}
module: {}
outcome: {}
provider: {}
sequence: {}
severity: {}
type: {}
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
{
"policy": {
"phases": {
"hot": {
"min_age": "0ms",
"actions": {
"rollover": {
"max_size": "1gb",
"max_age": "7d",
"max_docs": 10000
}
}
},
"delete": {
"min_age": "10m",
"actions": {
"delete": {}
}
}
}
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
{
"description": "Pipeline for setting event.ingested",
"processors": [
{
"set": {
"field": "event.ingested",
"value": "{{ _ingest.timestamp }}",
"ignore_failure": true
}
}
]
}
191 changes: 191 additions & 0 deletions package/endpoint/data_stream/collection/fields/fields.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,191 @@
- name: '@timestamp'
level: core
required: true
type: date
description: 'Date/time when the event originated.

This is the date/time extracted from the event, typically representing when the event was generated by the source.

If the event source has no original timestamp, this value is typically populated by the first time the event was received by the pipeline.

Required field for all events.'
example: '2016-05-23T08:05:34.853Z'
- name: data_stream
title: data_stream
group: 2
description: Fields describing the new indexing strategy <type>-<dataset>-<namespace>
type: group
fields:
- name: dataset
level: custom
type: constant_keyword
description: Data stream dataset name.
default_field: false
- name: namespace
level: custom
type: constant_keyword
description: Data stream namespace.
default_field: false
- name: type
level: custom
type: constant_keyword
description: Data stream type.
default_field: false
- name: ecs
title: ECS
group: 2
description: Meta-information specific to ECS.
type: group
fields:
- name: version
level: core
required: true
type: keyword
ignore_above: 1024
description: 'ECS version this event conforms to. `ecs.version` is a required field and must exist in all events.

When querying across multiple indices -- which may conform to slightly different ECS versions -- this field lets integrations adjust to the schema version of the events.'
example: 1.0.0
- name: event
title: Event
group: 2
description: 'The event fields are used for context information about the log or metric event itself.

A log is defined as an event containing details of something that happened. Log events must include the time at which the thing happened. Examples of log events include a process starting on a host, a network packet being sent from a source to a destination, or a network connection between a client and a server being initiated or closed. A metric is defined as an event containing one or more numerical measurements and the time at which the measurement was taken. Examples of metric events include memory pressure measured on a host and device temperature. See the `event.kind` definition in this section for additional details about metric and state events.'
type: group
fields:
- name: action
level: core
type: keyword
ignore_above: 1024
description: 'The action captured by the event.

This describes the information in the event. It is more specific than `event.category`. Examples are `group-add`, `process-started`, `file-created`. The value is normally defined by the implementer.'
example: user-password-change
- name: category
level: core
type: keyword
ignore_above: 1024
description: 'This is one of four ECS Categorization Fields, and indicates the second level in the ECS category hierarchy.

`event.category` represents the "big buckets" of ECS categories. For example, filtering on `event.category:process` yields all events relating to process activity. This field is closely related to `event.type`, which is used as a subcategory.

This field is an array. This will allow proper categorization of some events that fall in multiple categories.'
example: authentication
- name: code
level: extended
type: keyword
ignore_above: 1024
description: 'Identification code for this event, if one exists.

Some event sources use event codes to identify messages unambiguously, regardless of message language or wording adjustments over time. An example of this is the Windows Event ID.'
example: 4648
- name: created
level: core
type: date
description: 'event.created contains the date/time when the event was first read by an agent, or by your pipeline.

This field is distinct from @timestamp in that @timestamp typically contain the time extracted from the original event.

In most situations, these two timestamps will be slightly different. The difference can be used to calculate the delay between your source generating an event, and the time when your agent first processed it. This can be used to monitor your agent''s or pipeline''s ability to keep up with your event source.

In case the two timestamps are identical, @timestamp should be used.'
example: '2016-05-23T08:05:34.857Z'
- name: dataset
level: core
type: keyword
ignore_above: 1024
description: 'Name of the dataset.

If an event source publishes more than one type of log or events (e.g. access log, error log), the dataset is used to specify which one the event comes from.

It''s recommended but not required to start the dataset name with the module name, followed by a dot, then the dataset name.'
example: apache.access
- name: hash
level: extended
type: keyword
ignore_above: 1024
description: Hash (perhaps logstash fingerprint) of raw field to be able to demonstrate log integrity.
example: 123456789012345678901234567890ABCD
- name: id
level: core
type: keyword
ignore_above: 1024
description: Unique ID to describe the event.
example: 8a4f500d
- name: ingested
level: core
type: date
description: 'Timestamp when an event arrived in the central data store.

This is different from `@timestamp`, which is when the event originally occurred. It''s also different from `event.created`, which is meant to capture the first time an agent saw the event.

In normal conditions, assuming no tampering, the timestamps should chronologically look like this: `@timestamp` < `event.created` < `event.ingested`.'
example: '2016-05-23T08:05:35.101Z'
default_field: false
- name: kind
level: core
type: keyword
ignore_above: 1024
description: 'This is one of four ECS Categorization Fields, and indicates the highest level in the ECS category hierarchy.

`event.kind` gives high-level information about what type of information the event contains, without being specific to the contents of the event. For example, values of this field distinguish alert events from metric events.

The value of this field can be used to inform how these kinds of events should be handled. They may warrant different retention, different access control, it may also help understand whether the data coming in at a regular interval or not.'
example: alert
- name: module
level: core
type: keyword
ignore_above: 1024
description: 'Name of the module this data is coming from.

If your monitoring agent supports the concept of modules or plugins to process events of a given source (e.g. Apache logs), `event.module` should contain the name of this module.'
example: apache
- name: outcome
level: core
type: keyword
ignore_above: 1024
description: 'This is one of four ECS Categorization Fields, and indicates the lowest level in the ECS category hierarchy.

`event.outcome` simply denotes whether the event represents a success or a failure from the perspective of the entity that produced the event.

Note that when a single transaction is described in multiple events, each event may populate different values of `event.outcome`, according to their perspective.

Also note that in the case of a compound event (a single event that contains multiple logical events), this field should be populated with the value that best captures the overall success or failure from the perspective of the event producer.

Further note that not all events will have an associated outcome. For example, this field is generally not populated for metric events, events with `event.type:info`, or any events for which an outcome does not make logical sense.'
example: success
- name: provider
level: extended
type: keyword
ignore_above: 1024
description: 'Source of the event.

Event transports such as Syslog or the Windows Event Log typically mention the source of an event. It can be the name of the software that generated the event (e.g. Sysmon, httpd), or of a subsystem of the operating system (kernel, Microsoft-Windows-Security-Auditing).'
example: kernel
- name: sequence
level: extended
type: long
format: string
description: 'Sequence number of the event.

The sequence number is a value published by some event sources, to make the exact ordering of events unambiguous, regardless of the timestamp precision.'
- name: severity
level: core
type: long
format: string
description: 'The numeric severity of the event according to your event source.

What the different severity values mean can be different between sources and use cases. It''s up to the implementer to make sure severities are consistent across events from the same source.

The Syslog severity belongs in `log.syslog.severity.code`. `event.severity` is meant to represent the severity according to the event source (e.g. firewall, IDS). If the event source does not publish its own severity, you may optionally copy the `log.syslog.severity.code` to `event.severity`.'
example: 7
- name: type
level: core
type: keyword
ignore_above: 1024
description: 'This is one of four ECS Categorization Fields, and indicates the third level in the ECS category hierarchy.

`event.type` represents a categorization "sub-bucket" that, when used along with the `event.category` field values, enables filtering events down to a level appropriate for single visualization.

This field is an array. This will allow proper categorization of some events that fall in multiple event types.'
9 changes: 9 additions & 0 deletions package/endpoint/data_stream/collection/manifest.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
title: Endpoint Alert Collection
type: logs
dataset: endpoint.diagnostic.collection
hidden: true
ilm_policy: logs-endpoint.collection-diagnostic
elasticsearch:
index_template:
mappings:
dynamic: false
Loading