[Azure Blob Storage] Adding integration for Custom Azure Blob Storage Input #4693

Merged — 8 commits, merged Feb 7, 2023
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -23,6 +23,7 @@
/packages/awsfargate @elastic/obs-cloud-monitoring
/packages/azure_application_insights @elastic/obs-cloud-monitoring
/packages/azure_billing @elastic/obs-cloud-monitoring
/packages/azure_blob_storage @elastic/security-external-integrations
/packages/azure @elastic/obs-cloud-monitoring
/packages/azure_metrics @elastic/obs-cloud-monitoring
/packages/barracuda @elastic/security-external-integrations
3 changes: 3 additions & 0 deletions packages/azure_blob_storage/_dev/build/build.yml
@@ -0,0 +1,3 @@
dependencies:
ecs:
reference: git@v8.5.1
24 changes: 24 additions & 0 deletions packages/azure_blob_storage/_dev/build/docs/README.md
@@ -0,0 +1,24 @@
# Custom Azure Blob Storage Input

Use the `azure blob storage input` to read content from files stored in containers that reside in your Azure Cloud storage account.
The input can be configured to work with or without polling. Currently, if polling is disabled, the input only
performs a single pass: it lists and processes the file contents, then ends the process. Polling is generally recommended for most use cases,
even though it can become expensive when dealing with a very large number of files.

*To mitigate errors and ensure a stable processing environment, this input employs the following features:*

1. When processing Azure blob containers, if there is a sudden outage, the process can resume after the last file it processed
and successfully saved the state for.

2. If errors occur for certain files, they are logged appropriately, while the rest of the
files continue to be processed normally.

3. If a major error stops the main thread, logs describing the error are generated appropriately.

[id="supported-types"]
NOTE: Currently only `JSON` is supported as a blob/file format. For authentication, `shared access keys` and
`connection strings` are currently supported.

Custom ingest pipelines may be used by adding the pipeline name to the pipeline configuration option. Custom ingest pipelines can be created either through the API or the [Ingest Node Pipeline UI](/app/management/ingest/ingest_pipelines/).
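The polling behavior described above is configured per container. As an illustrative sketch (the container names below are hypothetical; the attributes follow the integration's `containers` option), a typical configuration might look like:

```yaml
# Hypothetical containers configuration for this input.
- name: azure-container1   # container to read blobs from
  max_workers: 3           # maximum number of concurrent workers
  poll: true               # keep polling the container for new blobs
  poll_interval: 15s       # how often to poll
- name: azure-container2
  max_workers: 3
  poll: true
  poll_interval: 10s
```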
5 changes: 5 additions & 0 deletions packages/azure_blob_storage/changelog.yml
@@ -0,0 +1,5 @@
- version: "0.1.0"
changes:
- description: Initial Release
type: enhancement
link: https://github.com/elastic/integrations/pull/4693
@@ -0,0 +1,34 @@
data_stream:
dataset: {{data_stream.dataset}}
{{#if pipeline}}
pipeline: {{pipeline}}
{{/if}}
{{#if account_name}}
account_name: {{account_name}}
{{/if}}
{{#if service_account_key}}
auth.shared_credentials.account_key: {{service_account_key}}
{{/if}}
{{#if service_account_uri}}
auth.connection_string.uri: {{service_account_uri}}
{{/if}}
{{#if storage_url}}
storage_url: {{storage_url}}
{{/if}}
{{#if containers}}
containers:
{{containers}}
{{/if}}
{{#if tags}}
tags:
{{#each tags as |tag i|}}
- {{tag}}
{{/each}}
{{/if}}
{{#contains "forwarded" tags}}
publisher_pipeline.disable_host: true
{{/contains}}
{{#if processors}}
processors:
{{processors}}
{{/if}}
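For illustration, with hypothetical values substituted for the account name, shared access key, and containers, and with the default `forwarded` tag, the template above would render to an agent input configuration along these lines:

```yaml
data_stream:
  dataset: azure_blob_storage.generic
account_name: example_account                      # hypothetical value
auth.shared_credentials.account_key: example_key   # hypothetical value
containers:
  - name: azure-container1
    max_workers: 3
    poll: true
    poll_interval: 15s
tags:
  - forwarded
publisher_pipeline.disable_host: true   # set because the tags contain "forwarded"
```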
@@ -0,0 +1,20 @@
- name: data_stream.type
type: constant_keyword
description: Data stream type.
- name: data_stream.dataset
type: constant_keyword
description: Data stream dataset.
- name: data_stream.namespace
type: constant_keyword
description: Data stream namespace.
- name: event.module
type: constant_keyword
description: Event module
value: azure_blob_storage
- name: event.dataset
type: constant_keyword
description: Event dataset
value: azure_blob_storage.generic
- name: "@timestamp"
type: date
description: Event timestamp.
@@ -0,0 +1,6 @@
- name: input.type
description: Type of Filebeat input.
type: keyword
- name: tags
type: keyword
description: User defined tags
@@ -0,0 +1,8 @@
- name: ecs.version
external: ecs
- name: log.level
external: ecs
- name: message
external: ecs
- name: event.original
external: ecs
85 changes: 85 additions & 0 deletions packages/azure_blob_storage/data_stream/generic/manifest.yml
@@ -0,0 +1,85 @@
title: Custom Azure Blob Storage Input
type: logs
streams:
- input: azure-blob-storage
description: Collect JSON data from configured Azure Blob Storage Container with Elastic Agent.
title: Custom Azure Blob Storage Input
template_path: abs.yml.hbs
vars:
- name: data_stream.dataset
type: text
title: Dataset name
description: |
Dataset to write data to. Changing the dataset will send the data to a different index. You can't use `-` in the dataset name, and only characters that are valid for [Elasticsearch index names](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html) are allowed.
default: azure_blob_storage.generic
required: true
show_user: true
- name: pipeline
type: text
title: Ingest Pipeline
description: |
The Ingest Node pipeline ID to be used by the integration.
required: false
show_user: true
- name: account_name
type: text
title: Account Name
description: |
The name of the Azure storage account. This attribute is required for authentication and for creating the service clients and blob clients that are used internally for processing.
required: true
show_user: true
- name: service_account_key
type: text
title: Service Account Key
description: |
This attribute contains the access key, found under the Access keys section on Azure Cloud, under the respective storage account. A single storage account can contain multiple containers, and they will all use this common access key.
required: false
show_user: true
- name: service_account_uri
type: text
title: Service Account URI
description: |
This attribute contains the connection string, found under the Access keys section on Azure Cloud, under the respective storage account. A single storage account can contain multiple containers, and they will all use this common connection string.
required: false
show_user: true
- name: storage_url
type: text
title: Storage URL
description: |
Use this attribute to specify a custom storage URL if required. By default it points to Azure cloud storage. Only use this if there is a specific need to connect to a different environment where blob storage is available.
URL format: {{protocol}}://{{account_name}}.{{storage_uri}}.
required: false
show_user: false
- name: containers
type: yaml
title: Containers
description: |
This attribute contains the details for a specific container, such as name, max_workers, poll and poll_interval. The name attribute is the container name. This attribute is internally represented as an array, so as many containers as required can be added.
required: true
show_user: true
default: |
- name: azure-container1
max_workers: 3
poll: true
poll_interval: 15s
#- name: azure-container2
# max_workers: 3
# poll: true
# poll_interval: 10s
- name: processors
type: yaml
title: Processors
multi: false
required: false
show_user: false
description: |
Processors are used to reduce the number of fields in the exported event or to enhance the event with metadata. This executes in the agent before the logs are parsed. See [Processors](https://www.elastic.co/guide/en/beats/filebeat/current/filtering-and-enhancing-data.html) for details.
- name: tags
type: text
title: Tags
description: Tags to include in the published event
required: false
default:
- forwarded
multi: true
show_user: true
24 changes: 24 additions & 0 deletions packages/azure_blob_storage/docs/README.md
@@ -0,0 +1,24 @@
# Custom Azure Blob Storage Input

Use the `azure blob storage input` to read content from files stored in containers that reside in your Azure Cloud storage account.
The input can be configured to work with or without polling. Currently, if polling is disabled, the input only
performs a single pass: it lists and processes the file contents, then ends the process. Polling is generally recommended for most use cases,
even though it can become expensive when dealing with a very large number of files.

*To mitigate errors and ensure a stable processing environment, this input employs the following features:*

1. When processing Azure blob containers, if there is a sudden outage, the process can resume after the last file it processed
and successfully saved the state for.

2. If errors occur for certain files, they are logged appropriately, while the rest of the
files continue to be processed normally.

3. If a major error stops the main thread, logs describing the error are generated appropriately.

[id="supported-types"]
NOTE: Currently only `JSON` is supported as a blob/file format. For authentication, `shared access keys` and
`connection strings` are currently supported.

Custom ingest pipelines may be used by adding the pipeline name to the pipeline configuration option. Custom ingest pipelines can be created either through the API or the [Ingest Node Pipeline UI](/app/management/ingest/ingest_pipelines/).
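Since custom ingest pipelines are mentioned above, here is a hedged sketch of one; the pipeline name `azure-blob-custom` and the `set` processor are purely illustrative. It could be created with `PUT _ingest/pipeline/azure-blob-custom` and the following request body:

```json
{
  "description": "Illustrative pipeline for JSON blobs from Azure Blob Storage",
  "processors": [
    {
      "set": {
        "field": "event.category",
        "value": "file"
      }
    }
  ]
}
```

The pipeline name would then be entered in the integration's Ingest Pipeline configuration option.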
4 changes: 4 additions & 0 deletions packages/azure_blob_storage/img/icon.svg
26 changes: 26 additions & 0 deletions packages/azure_blob_storage/manifest.yml
@@ -0,0 +1,26 @@
format_version: 1.0.0
name: azure_blob_storage
title: Custom Azure Blob Storage Input
description: Collect JSON data from configured Azure Blob Storage Container with Elastic Agent.
type: integration
version: "0.1.0"
release: beta
conditions:
kibana.version: "^8.5.0"
license: basic
categories:
- custom
- cloud
policy_templates:
- name: azure-blob-storage
title: Custom Azure Blob Storage Input
description: Collect JSON data from configured Azure Blob Storage Container with Elastic Agent.
inputs:
- type: azure-blob-storage
title: Custom Azure Blob Storage Input
description: Collect JSON data from configured Azure Blob Storage Container with Elastic Agent.
icons:
- src: "/img/icon.svg"
type: "image/svg+xml"
owner:
github: elastic/security-external-integrations