Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always support ignore_malformed in the same way #90565

Merged
merged 4 commits into from
Oct 3, 2022

Conversation

nik9000
Copy link
Member

@nik9000 nik9000 commented Sep 30, 2022

This makes sure that all field types that support ignore_malfored do so in the same way.

Production changes:

  • All mapper has an ignoreMalformed method that must return true if the field accepts the ignore_malformed mapping parameter was configured. It defaults to false because many fields either don't have a concept of "malformed" value or don't have the ability to ignore malformed values.
  • Fix the scaled_float field to store it's field name in _ignored if it ignores any malfored values. This is how all other field mappers work.

Test changes:

  • MapperTestCase forces subclasses to declare if their supportIgnoreMalformed or not.
  • If MapperTestCase subclasses supportIgnoreMalfored they must define some exampleMalformedValues.
  • MapperTestCase always grows three new tests:
    • One that creates a field without setting ignore_malformed and verifies that all exampleMalformedValues throw expected errors
    • On that explicitly configured ignore_malformed to false and, if supportIgnoreMalformed it verifies the errors again. If not supportIgnoreMalformed it verifies that the parameter is unknown.
    • On that explicitly configured ignore_malformed to true and, if supportIgnoreMalformed it verifies that parsing doesn't produce errors and correctly produces _ignored. If not supportIgnoreMalformed it verifies that the parameter is unknown.
  • Moved some subclasesses of MapperTestCase from internalClusterTests to tests. This isn't strictly required but that's the right place for them.
  • Remove many hand rolled malformed tests, making sure that all removed examples are moved into the exampleMalformedValues

@nik9000 nik9000 added >non-issue :Search Foundations/Mapping Index mappings, including merging and defining field types v8.6.0 labels Sep 30, 2022
@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Sep 30, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@nik9000
Copy link
Member Author

nik9000 commented Sep 30, 2022

My ulterior motive here is that I can reuse this infrastructure when we support ignore_malformed with synthetic _source.

This makes sure that all field types that support `ignore_malfored` do
so in the same way.

Production changes:
* All mapper has an `ignoreMalformed` method that must return `true` if
  the field accepts the `ignore_malformed` mapping parameter was
  configured. It defaults to `false` because many fields either don't
  have a concept of "malformed" value or don't have the ability to
  ignore malformed values.
* Fix the `scaled_float` field to store it's field name in `_ignored` if
  it ignores any malfored values. This is how all other field mappers
  work.

Test changes:
* `MapperTestCase` forces subclasses to declare if their
  `supportIgnoreMalformed` or not.
* If `MapperTestCase` subclasses `supportIgnoreMalfored` they must
  define some `exampleMalformedValues`.
* `MapperTestCase` always grows three new tests:
  * One that creates a field without setting `ignore_malformed` and
    verifies that all `exampleMalformedValues` throw expected errors
  * On that explicitly configured `ignore_malformed` to false and, if
    `supportIgnoreMalformed` it verifies the errors again. If not
    `supportIgnoreMalformed` it verifies that the parameter is unknown.
  * On that explicitly configured `ignore_malformed` to true and, if
    `supportIgnoreMalformed` it verifies that parsing doesn't produce
    errors and correctly produces `_ignored`. If not
    `supportIgnoreMalformed` it verifies that the parameter is unknown.
* Moved some subclasesses of `MapperTestCase` from
  `internalClusterTests` to `tests`. This isn't strictly required but
  that's the right place for them.
Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks @nik9000

I left one question around how aggregate metric double records ignored values, but we can handle that in a follow-up if it's not one with an obvious answer.


esplugin {
description 'The Mapper Annotated_text plugin adds support for text fields with markup used to inject annotation tokens into the index.'
classname 'org.elasticsearch.index.mapper.annotatedtext.AnnotatedTextPlugin'
}

if (BuildParams.isSnapshotBuild() == false) {
tasks.named("internalClusterTest").configure {
tasks.named("test").configure {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been annoying me for ages, thanks for fixing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙇


@Override
protected String[] ignoredFields() {
return new String[] { "field.value_count", "field.min", "field.max" };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting - the _ignored field is really for user inspection, so I'm not sure if this mapper should be recording all its subfields in it as they are implementation details.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah! I think that's worth discussing somewhere. I can file an issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>non-issue :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v8.6.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants