[Ingest Manager] Align namespace validation rules #75846

michalpristas · 2020-08-25T07:21:40Z

With namespace validation introduced in https://github.com/elastic/kibana/pull/75381/files,
the validation rules are no longer aligned between Kibana and the agent (https://github.com/elastic/beats/blob/f06bcc5d401d7b6fc833bc0e516064ace0a68d45/x-pack/elastic-agent/pkg/agent/application/filters/stream_checker.go#L173)

What is missing on the Kibana side:

namespace is allowed to contain . but is not allowed to be equal to . or ..
namespace cannot be longer than 255 chars
namespace cannot start with -, _ or +

We should either add these rules to Fleet validation or remove them from the agent.
I took the rules from here: https://github.com/elastic/elasticsearch/blob/master/docs/reference/indices/create-index.asciidoc
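
For illustration, the three rules listed above can be sketched in Python as follows. This is only a sketch: the function name `check_namespace` and the error strings are hypothetical, the 255 limit is counted in characters as written above, and none of this is the agent's actual Go code.

```python
from typing import List


def check_namespace(namespace: str) -> List[str]:
    """Return the list of violated rules; an empty list means the value passes."""
    errors = []
    # Dots are allowed inside the value, but the value itself must not be "." or "..".
    if namespace in (".", ".."):
        errors.append("must not be '.' or '..'")
    # Length cap, counted in characters here (an assumption taken from the wording above).
    if len(namespace) > 255:
        errors.append("must not be longer than 255 characters")
    # Leading character restriction.
    if namespace.startswith(("-", "_", "+")):
        errors.append("must not start with '-', '_' or '+'")
    return errors


print(check_namespace("default"))
print(check_namespace(".."))
print(check_namespace("+prod"))
```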

cc @jen-huang

elasticmachine · 2020-08-25T07:21:42Z

Pinging @elastic/ingest-management (Feature:Fleet)

elasticmachine · 2020-08-25T07:21:42Z

Pinging @elastic/ingest-management (Team:Ingest Management)

jen-huang · 2020-08-25T15:08:28Z

Consolidating discussion from https://github.com/elastic/kibana/pull/75381/files#r476228944:

I think the validation is incorrect,

Cannot be longer than 255 bytes (note it is bytes, so multi-byte characters will count towards the 255 limit faster)

This means type + dataset + namespace <= 255 bytes.

Originally posted by @ph in #75381 (comment)

What we do in the agent is put a 255 constraint on the dataset, the namespace, and the concatenated result, so eventually we fail when the index name exceeds 255.
Putting in some uncommunicated constraint felt weird (such as 255/3 or some other math), so I used 255 on each part and on the result as well.
As the namespace is created before the configuration and can lead to different indices depending on dataset and type, it would make sense to agree on a constraint that can be applied at the time the namespace is specified.

Originally posted by @michalpristas in #75381 (comment)

jen-huang · 2020-08-25T15:21:31Z

namespace is allowed to contain . but is not allowed to be equal to . or ..
namespace cannot start with -, _ or +

Namespace is the last part of our indexing strategy. These ES rules only apply to the complete index name. A namespace of . or .. will still generate a valid index name like logs-system.auth-. A namespace of + will generate logs-system.auth-+. Both of these are valid index names. Please correct me if I'm wrong.
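
A small Python sketch of the point above, assuming the index name is simply the three parts joined with `-`. The helper names `index_name` and `passes_whole_name_rules` are hypothetical, and only the two rules under discussion are checked, not the full set of Elasticsearch index name restrictions.

```python
FORBIDDEN_PREFIXES = ("-", "_", "+")


def index_name(type_: str, dataset: str, namespace: str) -> str:
    """Join the three data stream parts into the final index name."""
    return f"{type_}-{dataset}-{namespace}"


def passes_whole_name_rules(name: str) -> bool:
    """Apply the two rules in question to the complete name, not to one part."""
    return name not in (".", "..") and not name.startswith(FORBIDDEN_PREFIXES)


for namespace in (".", "..", "+"):
    name = index_name("logs", "system.auth", namespace)
    print(name, passes_whole_name_rules(name))
```

Because the namespace is always preceded by the type and dataset, the complete name can never be equal to . or .. and can never start with the namespace's first character.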

namespace cannot be longer than 255 chars
What we do in the agent is put a 255 constraint on the dataset, the namespace, and the concatenated result, so eventually we fail when the index name exceeds 255.

The 255 total index name size in bytes is the reason why I left out that original constraint in Kibana, as we have no way of knowing the size of the other parts of the indexing strategy at the time user enters the namespace. Also, length of string != number of bytes (English alphabet characters are one byte, but Chinese characters are multibyte). Because of these reasons, I don't think we can enforce a rule for namespace that is guaranteed to work across all cases. We can set up a business rule that will work for most cases, such as limiting namespace to 50 characters length maximum in UI and Agent. I doubt many users will reach the index name limit through natural use anyway (I think even 50 characters is more than enough), but enforcing a strict limit will provide some safety. What do you think of this rule?
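
To make the characters-versus-bytes point concrete, here is a minimal Python sketch. The helper names are hypothetical, the 50-character limit is the value suggested above, and UTF-8 is assumed as the encoding.

```python
def utf8_len(value: str) -> int:
    """Number of bytes the value occupies when encoded as UTF-8."""
    return len(value.encode("utf-8"))


def within_char_limit(namespace: str, max_chars: int = 50) -> bool:
    """Character-based business rule: at most max_chars characters."""
    return len(namespace) <= max_chars


ascii_ns = "a" * 50   # one byte per character in UTF-8
cjk_ns = "生" * 50    # three bytes per character in UTF-8

for ns in (ascii_ns, cjk_ns):
    print(within_char_limit(ns), len(ns), utf8_len(ns))
```

Both values pass the 50-character rule, but the second one takes 150 bytes, which is why a character limit alone only works for most cases rather than all of them.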

ph · 2020-08-25T15:27:04Z

Thanks @jen-huang for moving it

--

Good points. I think we should come up with a limit for every data_stream field; we have a spec, so we can enforce that.

data_stream.type is a limited set (traces, logs, metrics, synthetics); the longest value is 10 characters, so the max is 20 bytes
data_stream.dataset 255-50-20 -10 (buffer) = 75 bytes.
data_stream.namespace => 50 bytes seems like a good limit.

The above limits seem good, assuming that most of these will be ASCII values, although the namespace could contain multibyte chars.

cc @ruflin WDYT?

ph · 2020-08-25T15:28:43Z

Changed the maths above..

ruflin · 2020-08-26T07:31:48Z

20 bytes for the prefix should definitely be enough. For the other two I'm torn. I can see potential use cases where the dataset can become long, and others where the namespace is long. To keep things simple and have a buffer, what about having a limit of 100 for both dataset and namespace? This would max out at 222 bytes in the worst case (we have 2 -).

BTW: I think I fail at your math: 255-50-20 -10 = (1)75 ?
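
The arithmetic behind these budgets can be checked with a short Python sketch. The names are hypothetical; the only assumptions are the 255-byte index name limit and the two `-` separators mentioned above.

```python
ES_INDEX_NAME_LIMIT = 255  # bytes, for the complete index name
SEPARATORS = 2             # the two "-" joining type, dataset and namespace


def worst_case(type_limit: int, dataset_limit: int, namespace_limit: int) -> int:
    """Largest possible index name size in bytes for the given per-part limits."""
    return type_limit + dataset_limit + namespace_limit + SEPARATORS


budgets = {
    "20/100/100": (20, 100, 100),
    "20/75/50": (20, 75, 50),
    "20/175/50": (20, 175, 50),
}

for label, limits in budgets.items():
    total = worst_case(*limits)
    print(label, total, ES_INDEX_NAME_LIMIT - total)
```

With 100 for dataset and namespace the worst case is 222 bytes, leaving 33 bytes of buffer. With 175 for the dataset and 50 for the namespace it is 247 bytes, leaving 8.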

michalpristas · 2020-09-02T08:18:20Z

I like the proposal with limits per segment.
20/175/50 sounds like enough for everything. If we agree on these numbers or the ones proposed by Nicolas, I will create a PR updating our current rules.

jen-huang · 2020-09-08T19:25:31Z

Just to clarify, the 20/175/50 limits are bytes, not characters?

ruflin · 2020-09-09T06:17:31Z

Yes, bytes. My preference is still on 20/100/100 ;-)

jen-huang · 2020-09-25T00:34:51Z

Hi @michalpristas, I opened #78522 to implement 100 bytes limit on namespace following @ruflin's 20/100/100 proposal. I don't think merging this PR necessarily has to be synced with a corresponding agent PR, so this is just an FYI.

michalpristas · 2020-09-29T16:35:52Z

no it does not, thanks for the ping

michalpristas · 2020-09-30T08:07:43Z

@jen-huang any plans to put restrictions on type and dataset?
The same character set should be restricted, length limits should apply, and type should not start with -, _, + (unless we agree on a list of accepted types).
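
A rough Python sketch of what such checks on type and dataset could look like. Everything here is an assumption for illustration: the function name `check_segment` is hypothetical, the byte limits are taken from the 20/100/100 proposal, and the accepted types are the four listed earlier in this thread. Character set restrictions are left out because no concrete set has been agreed on here.

```python
from typing import List, Optional, Set

ACCEPTED_TYPES = {"logs", "metrics", "traces", "synthetics"}  # assumed list
MAX_BYTES = {"type": 20, "dataset": 100}                      # assumed limits
FORBIDDEN_PREFIXES = ("-", "_", "+")


def check_segment(field: str, value: str,
                  accepted: Optional[Set[str]] = None) -> List[str]:
    """Return the violated rules for one segment ("type" or "dataset")."""
    errors = []
    # Length is measured in UTF-8 bytes, not characters.
    if len(value.encode("utf-8")) > MAX_BYTES[field]:
        errors.append(f"{field} exceeds {MAX_BYTES[field]} bytes")
    if accepted is not None:
        # With an agreed list of values, the prefix rule is redundant.
        if value not in accepted:
            errors.append(f"{field} is not an accepted value")
    elif field == "type" and value.startswith(FORBIDDEN_PREFIXES):
        errors.append("type must not start with '-', '_' or '+'")
    return errors


print(check_segment("type", "logs", ACCEPTED_TYPES))
print(check_segment("type", "_logs"))
print(check_segment("dataset", "a" * 101))
```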

michalpristas · 2020-09-30T08:26:00Z

agent pr: elastic/beats#21406

jen-huang · 2020-09-30T17:40:31Z

@michalpristas type and dataset aren't controlled by Kibana, just passed through from packages. @ruflin / @mtojek what do you think about adding validators on the packages side to limit the length of those fields?

ruflin · 2020-10-08T08:09:50Z

++ on validating these already in the package-spec. @ycombinator WDYT?

ycombinator · 2020-10-08T14:21:52Z

Agreed. Just make a new issue in the package-spec repo with the desired validation rules: https://github.com/elastic/package-spec/issues/new.

ruflin · 2020-10-12T07:27:39Z

I quickly filed elastic/package-spec#57. It would be good to fill in more details.

michalpristas added the Feature:Fleet (Fleet team's agent central management project) and Team:Fleet (Team label for Observability Data Collection Fleet team) labels Aug 25, 2020

michalpristas mentioned this issue Aug 25, 2020

[Elastic Agent] Enforce validation on namespace elastic/beats#20693

Closed

jen-huang mentioned this issue Aug 25, 2020

[Ingest Manager] Add namespace validation #75381

Merged

2 tasks

jen-huang self-assigned this Aug 25, 2020

jen-huang mentioned this issue Sep 25, 2020

[Ingest Manager] Add namespace max length limit #78522

Merged

jen-huang closed this as completed in #78522 Sep 29, 2020

ruflin mentioned this issue Oct 12, 2020

Validate dataset and namespace elastic/package-spec#57

Open

ebeahan mentioned this issue Oct 13, 2020

[RFC] data_stream fields elastic/ecs#980

Merged

jen-huang mentioned this issue Nov 4, 2020

Logstash Integration with Elasticsearch Data Streams elastic/logstash#12178

Closed
