Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wip: monitoring template type removal #39336

Closed
wants to merge 5 commits into from

Conversation

jakelandis
Copy link
Contributor

This change removes the "doc" type from the monitoring templates. The monitoring templates are a bit special since they can be sent via the REST layer for an external monitoring cluster and they can be associated to ingest node pipelines. Much more detail below.

Relates #38637

Before this PR is merged some actions and decisions need to made:

[ ] 7.0 Kibana must upgrade to use only typeless APIs to read the monitoring data (writing isn't an issue since Kibana does not write directly to the index)
[ ] 7.0 Kibana must use a wild card pattern to be able to read monitoring indexes from both 6 and 7 (e.g. monitoring-es-*)
[ ] Elasticsearch must agree that only 6.7 (of the 6.x) family can be used with a 7.x monitoring cluster. (versions prior to 6.7 will error [1])
[ ] Elasticsearch must agree that when 7.x is monitoring 6.7 clusters that deprecation warnings are OK
[ ] Any other consumers of .monitoring-* indexes must use the typeless API's (I am not aware of any other consumers)
[ ] Tests, docs, code clean up , more manual testing etc. (after initial discussion)
[ ] Follow up 6.7 PR for deprecration/deprecation info API for removed settings

Primary change

Representation compatible with REST layer

The main change here is actually pretty small, but requires a bit of explaination:

Monitoring templates can be installed to the cluster either locally via the transport client or remotely via the REST API. In both cases it starts with the internal representation (in memory and viewable via cluster state) which requires the "_doc" nesting inside the mappings. The REST layer prevents adding a mapping with any type, even the internal "_doc" type unless you pass the include_type_name param (which causes a deprecation warning and this PR removes). This is why the template in source control still has the "_doc" type, but you can not copy and paste this document to use with the REST API. Keeping the internal format in source control works since it is immediatly read into it's internal form via MonitoringTemplateUtils/TemplateUtils and doesn't ever go through the REST layer. However since the remote goest through the REST layer, the code needs to convert the internal format (has "_doc") -> external format (does not have "_doc"). The small but susbstative change here is the removal of the type for the external representation in TemplateHttpResource.templateToHttpEntity and IndexTemplateMetaData.Builder.removeType(..)

It also worth noting, (mosty) due to the fact the monitoring templates must support a remote export, monitoring Templates do not currently use the TemplateUpgradeService like the other .templates.

Additional changes

There are a couple other changes which adds alot of noise to diff, but are still important and require discussion.

Change of template name

This PR changes the template name from .monitoring-[system] (e.g. .monitoring-es) to .monitoring-[system]-[version] (e.g. .monitoring-es-7). This change is done so that that when a 7.x cluster is introduced it does not impact the ability to monitor data from 6.7. In prior verions this was handled by changing the elder version (not the current version's) template name and ensuring that the patterns were correct for the given version. Essentially this changes from re-naming the old template, to leaving the old template alone, and re-naming the new template. This is necessary because the old template (v6) still requires the "doc" type and if 7.x puts any template that includes the type requires the use of the deprecrated include_type_name ... and the end result is that we are right back where we started, with deprecations in our logs that a user has no control over. (which would defeat a large reason for this PR). By leaving (v6) alone and only adding v7, we need to avoid naming collisions (and another type issue not described here for brevity) hence the rename. This PR also removes a flag that controls these settings, and will need deprecate in 6.7.

Removal of ingest node pipeline

Since this PR introduces a change that only allows to 7.x cluster to remotely manage a 6.7 cluster [1], all of the ingest node pipeline for monitoring is essentially dead code. Version 6.x had an empty pipeline (a pipeline that did nothing), and if carried forward, 7.x would also have a pipeline that does nothing. Requiring an ingest node pipeline was means to upgrade to v6, and is no longer needed for v7. There is additional overhead of using an empty pipeline such as serialize/deserialize as well as deployment concerns to require a ingest capable node that does nothing. This does limit our ability in 7.x some for backward compatible changes, but we can pretty easily re-introduce if needed. Further the plan for 8.x is for this to be replaced by beats.

[1] output of 6.0.0 trying to push data to a 7.0 cluster with this change (6.7 works because it sends the include_type_name flag)

elasticsearch1    | org.elasticsearch.client.ResponseException: method [PUT], host [http://host.docker.internal:55555], URI [/_template/.monitoring-alerts?filter_path=*.version], status line [HTTP/1.1 400 Bad Request]
elasticsearch1    | {"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"Root mapping definition has unsupported parameters:  [doc : {dynamic=false, properties={metadata={properties={severity={type=short}, cluster_uuid={type=keyword}, watch={type=keyword}, link={type=keyword}, type={type=keyword}, version={type=keyword}}}, update_timestamp={type=date}, prefix={type=text}, message={type=text}, suffix={type=text}, resolved_timestamp={type=date}, timestamp={type=date}}}]"}],"type":"mapper_parsing_exception","reason":"Failed to parse mapping [_doc]: Root mapping definition has unsupported parameters:  [doc : {dynamic=false, properties={metadata={properties={severity={type=short}, cluster_uuid={type=keyword}, watch={type=keyword}, link={type=keyword}, type={type=keyword}, version={type=keyword}}}, update_timestamp={type=date}, prefix={type=text}, message={type=text}, suffix={type=text}, resolved_timestamp={type=date}, timestamp={type=date}}}]","caused_by":{"type":"mapper_parsing_exception","reason":"Root mapping definition has unsupported parameters:  [doc : {dynamic=false, properties={metadata={properties={severity={type=short}, cluster_uuid={type=keyword}, watch={type=keyword}, link={type=keyword}, type={type=keyword}, version={type=keyword}}}, update_timestamp={type=date}, prefix={type=text}, message={type=text}, suffix={type=text}, resolved_timestamp={type=date}, timestamp={type=date}}}]"}},"status":400}

cc @jtibshirani @yaronp68 @timroes

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features

@jpountz
Copy link
Contributor

jpountz commented Feb 25, 2019

Elasticsearch must agree that only 6.7 (of the 6.x) family can be used with a 7.x monitoring cluster. (versions prior to 6.7 will error [1])

Given that monitoring clusters must be upgraded before the cluster they monitor and that we only support upgrading to 7.x from 6.7, I think this one comes for free.

Elasticsearch must agree that when 7.x is monitoring 6.7 clusters that deprecation warnings are OK

+1 Deprecation warnings are ok for temporary situations.

@jakelandis
Copy link
Contributor Author

@pickypg - could I get your opinion on the direction of this PR ? In particular +/- 1 on the additional changes.

@pickypg
Copy link
Member

pickypg commented Feb 27, 2019

@jakelandis

Change of template name

This is necessary because the old template (v6) still requires the "doc" type

By not replacing the old template, we are allowing any 6.x node to:

  1. Send the old, with-_type template.
  2. Send documents to the existing index pattern (.monitoring-es-6-*).
  3. Coexist with .monitoring-es-7-* on the same day (not so bad beyond shard waste).

We also leave a stale index template in the cluster state.

And since 7.0+ will default to disallowing documents with a _type, by default, this seems like it would break any 6.x cluster from sending its monitoring data to 7.x? Relative to the monitoring cluster being upgraded first, it is fairly common for monitoring clusters to be upgraded completely before the production cluster (e.g., 6.4 -> 6.7 -> 7.x before the production cluster, presumably running 6.4 in this example, is upgraded to 6.7 or 7.x).

Can we not upgrade the xpack_monitoring_6 ingest pipeline to drop the _type of the document, then point the documents to .monitoring-{es|kibana|logstash|beats}-7-{suffix}, then add an empty xpack_monitoring_7? (I don't know if that's possible given that it implies the ingest pipeline would need to still understand _type, which would be a very convenient workaround if it does)


Removal of ingest node pipeline

There is additional overhead of using an empty pipeline

There used to be an optimization in place that ignored empty pipelines, which bypassed that overhead. It appears to still exist:

if (pipeline.getProcessors().isEmpty()) {
return;
}

This does limit our ability in 7.x some for backward compatible changes, but we can pretty easily re-introduce if needed.

You cannot easily reintroduce the behavior once we take it away because you cannot upgrade old requests or automatically re-force the cluster to allow ingest. It's a little better than when we introduced this concept in 5.0, which did not have default_pipeline (new in 6.5), but if you do not require it upfront, then we found that users will disable node.ingest.

If we make a breaking change in 7.3 that requires a new pipeline, then 7.0 - 7.2 all would send a request without the pipeline in the request. We could provide a default_pipeline to the template and possibly upgrade the index settings for any existing indices, but the re-introduction would re-require the monitoring cluster to enable node.ingest: true. That's a lot of effort assuming that the empty ingest pipeline is effectively a noop.

Further the plan for 8.x is for this to be replaced by beats.

It's probably worth calling out that we hope to replace the ingest side by Beats before 8.x, but it won't become the official way until then to allow ourselves time to handle the transition better, as well as fix any issues that may crop up.

@jakelandis
Copy link
Contributor Author

@pickypg - thanks for the thoughtful reply. I agree with many of your points in particular w/r/t ingest node "You cannot easily reintroduce the behavior once we take it away" and will ensure that we preserve this behavior.

And since 7.0+ will default to disallowing documents with a _type, by default, this seems like it would break any 6.x cluster from sending its monitoring data to 7.x?

I think may have omitted a critical detail in my explanation. By leaving the existing typed template alone, 6.x (irregardless of where it sends its data) will result in writing to .monitoring-{es|kibana|logstash|beats}-6-{suffix} and the update here for 7.x will result in writing to .monitoring-{es|kibana|logstash|beats}-7-{suffix}. e.g. -6 and -7 indexes can co-exist and both can actively be written to for the entire duration of 7.x (assuming a 6.7 is being remotely monitored, and a 7.x cluster is also monitored). This requires updating Kibana's search from .monitoring-{es|kibana|logstash|beats}-6-* to .monitoring-{es|kibana|logstash|beats}-* to allow Kibana to read either of the index patterns.

The alternative, as you propose, is to use the ingest pipeline to remove the type so that we can immediately start pushing to -7. That would imply that -6 and -7 would never both be actively written. I infer this is how other upgrades happened. This model also seems that it would need to update Kibana to read both index patterns.

Are there immediate advantages or disadvantages to choosing one method over the other ?

@pickypg
Copy link
Member

pickypg commented Feb 27, 2019

@jakelandis Any time!

This requires updating Kibana's search from .monitoring-{es|kibana|logstash|beats}-6-* to .monitoring-{es|kibana|logstash|beats}-* to allow Kibana to read either of the index patterns.
This model also seems that it would need to update Kibana to read both index patterns.

This is correct. Regardless, Kibana is going to need to be able to read from -7 and -6 at the same time. If we didn't support that, then Kibana is going to lose (or miss, depending on perspective) the unreferenced data. For perspective, we currently do this throughout 6.x to support the old -2 (confusingly, the 5.x format where "2" is the API format coming after 0 and 1 in 2.x) alongside the -6:

https://github.com/elastic/kibana/blob/306d06c0e99ba9f9a2b8972095be779c36768486/x-pack/plugins/monitoring/config.js#L66

In 7.x, we currently limit it to -6, but can naturally repeat what we did for supporting -2.

Are there immediate advantages or disadvantages to choosing one method over the other ?

If we can use an ingest processor to drop the _type from the request and allow any 6.x cluster to be monitored by any 7.x cluster, then I think it would have a pretty big advantage of not requiring coordination between upgrading the monitoring cluster and production cluster, when both are not already at 6.7: just upgrade the monitoring cluster first, which is already the requirement, and you're good to go. And then in both setups they quickly get to a position where their pipeline has no processors, thus requiring no serialization (or added work).

jpountz added a commit to jpountz/elasticsearch that referenced this pull request Mar 7, 2019
The monitoring bulk API accepts the same format as the bulk API, yet its concept
of types is different from "mapping types" and the deprecation warning is only
emitted as a side-effect of this API reusing the parsing logic of bulk requests.

This commit extracts the parsing logic from `_bulk` into its own class with a
new flag that allows to configure whether usage of `_type` should emit a warning
or not. Support for payloads has been removed for simplicity since they were
unused.

@jakelandis has a separate change that removes this notion of type from the
monitoring bulk API that we are considering bringing to 8.0: elastic#39336.
@jakelandis
Copy link
Contributor Author

Closing this in favor #39888

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants