Mappings: Remove ability to disable _source field #10915

Merged May 6, 2015 (1 commit)

rjernst
Member

rjernst commented May 1, 2015

In order to reindex documents, _source must be enabled. There are current features (e.g. the update API) and future features (e.g. the reindex API) that depend on _source. This change locks down the field so that it can no longer be disabled. It also removes the legacy settings compress/compress_threshold.
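
To make the change concrete, here is a minimal sketch of the kind of mapping that would be affected. The `_source` keys (`enabled`, `compress`, `compress_threshold`) are the settings named above. The type name, the field, the `check_source_mapping` helper, and its messages are hypothetical illustrations, not the validation code in this pull request.

```python
# Hypothetical illustration only: not the code from this pull request.

# A mapping that disables _source and uses the legacy compression settings.
legacy_mapping = {
    "my_type": {  # assumed type name
        "_source": {
            "enabled": False,
            "compress": True,
            "compress_threshold": "10kb",
        },
        "properties": {"title": {"type": "string"}},
    }
}

# Legacy _source settings that the description says are removed.
REMOVED_SOURCE_SETTINGS = ("compress", "compress_threshold")


def check_source_mapping(mapping):
    """Return a list of problems found in the _source section of each type."""
    problems = []
    for type_name, type_mapping in mapping.items():
        source = type_mapping.get("_source", {})
        # _source is enabled by default, so only an explicit False is a problem.
        if source.get("enabled", True) is False:
            problems.append(f"[{type_name}] _source can no longer be disabled")
        for setting in REMOVED_SOURCE_SETTINGS:
            if setting in source:
                problems.append(
                    f"[{type_name}] _source setting [{setting}] has been removed"
                )
    return problems


for problem in check_source_mapping(legacy_mapping):
    print(problem)
```

A mapping that leaves `_source` untouched yields an empty list from this helper.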

For users who "don't use" _source, we have a best_compression option that stores it in roughly half the space used by previous versions, while still keeping the ability to reindex. For users concerned with indexing speed, a lot of work went into stored fields merging/compression in Lucene 5.
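
As a rough sketch of what opting in could look like, the snippet below builds an index-creation body, assuming the option is exposed as the `index.codec` index setting. The index name in the comment is a placeholder, and no HTTP client is shown.

```python
import json

# Settings body for creating an index with the higher-compression codec.
# Assumption: best_compression is selected through the index.codec setting.
create_index_body = {
    "settings": {
        "index": {
            "codec": "best_compression",
        }
    }
}

# This JSON would be sent as the body of e.g. PUT /my_index (placeholder name).
print(json.dumps(create_index_body, indent=2))
```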

closes #8142

rjernst added the >breaking, v2.0.0-beta1, and :Search Foundations/Mapping (Index mappings, including merging and defining field types) labels May 1, 2015
@jpountz
Contributor

jpountz commented May 1, 2015

I am +1 on that change and the diff looks good to me. Could you add a note to the migration guide?

The issue was a bit controversial so @clintongormley could you confirm if this change is good to go?

@javanna
Member

javanna commented May 1, 2015

+1 on the change too. Does this mean we can also remove the option to store fields separately in the mapping, given that we always have the source? Or are there still use cases for that?

@rmuir
Contributor

rmuir commented May 1, 2015

+1 I think this is our only chance to do this.

@kimchy
Member

kimchy commented May 1, 2015

The place where I saw it being used is for large documents, where most search requests require only a few fields, and parsing the whole source to extract them was expensive (for the user I helped with it, the cost ended up being significant). But it is an outlier in terms of how often I actually saw it.

I like the idea of removing it, since it will simplify the whole crazy logic around handling stored fields and data extracted from source. I think we do need to have a pattern for users who have large documents and want fast access to only a small number of small fields when loading hits. My ideas are doc values, or parent/child now that we support fetching inner children (or just the parent).

@jpountz
Contributor

jpountz commented May 1, 2015

> Or are there still use cases for that?

It can still be useful for "generated" fields (e.g. fields that are populated through copy_to).
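
A minimal sketch of such a "generated" field, assuming a mapping where two fields copy into a third that is stored separately. The type and field names and the `string` type are illustrative assumptions, and `copy_to_targets` is a hypothetical helper.

```python
# Sketch of a mapping with a field populated only through copy_to.
mapping = {
    "person": {  # assumed type name
        "properties": {
            "first_name": {"type": "string", "copy_to": "full_name"},
            "last_name": {"type": "string", "copy_to": "full_name"},
            # full_name is filled via copy_to, so its value is not part of the
            # original _source; storing it keeps it retrievable on its own.
            "full_name": {"type": "string", "store": True},
        }
    }
}


def copy_to_targets(properties):
    """Collect the names of all fields that other fields copy into."""
    targets = set()
    for field in properties.values():
        value = field.get("copy_to")
        if value is None:
            continue
        # copy_to may be a single field name or a list of names.
        targets.update([value] if isinstance(value, str) else value)
    return targets


print(sorted(copy_to_targets(mapping["person"]["properties"])))
```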

> I think we do need to have a pattern for users who have large documents and want fast access to only a small number of small fields when loading hits

Maybe another option, besides doc values and parent/child, could be to break up _source into several stored fields (#9034)?

@rjernst
Member Author

rjernst commented May 1, 2015

@jpountz Regarding migration docs, I pushed a commit with a simple migration note.

@jpountz
Contributor

jpountz commented May 1, 2015

Thanks @rjernst it looks good!

@kimchy
Member

kimchy commented May 1, 2015

+1, LGTM

@clintongormley
Contributor

+1

rjernst added a commit to rjernst/elasticsearch that referenced this pull request May 6, 2015
Current features (eg. update API) and future features (eg. reindex API)
depend on _source. This change locks down the field so that
it can no longer be disabled. It also removes legacy settings
compress/compress_threshold.

closes elastic#8142
closes elastic#10915
@faxm0dem

faxm0dem commented Jul 6, 2015

Just for the record, consider this a protest comment. Please do not remove this feature. There are use cases for ES as index-only and not storage.

@rjernst
Member Author

rjernst commented Jul 7, 2015

@faxm0dem This pull request was effectively reverted in #11171.

@faxm0dem

faxm0dem commented Jul 7, 2015

Awesome thanks!

Labels
>breaking :Search Foundations/Mapping Index mappings, including merging and defining field types
Development

Successfully merging this pull request may close these issues.

Mappings: Ensure that reindexing is always possible
7 participants