Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter path refactoring #14390

Merged
merged 2 commits into from
Nov 16, 2015
Merged

Filter path refactoring #14390

merged 2 commits into from
Nov 16, 2015

Conversation

tlrx
Copy link
Member

@tlrx tlrx commented Oct 30, 2015

Jackson has been updated to version 2.6 in #13344, we can now reimplement the filter_path feature using the new Jackson streaming support.

No more custom hand-made logic (made in #10980), we can now implement a Jackson's TokenFilter and have a more readable and more maintainable code.

Bonus points:

  • less code
  • better performance
  • closes add escaping for JSON response body filtering #11560 since JSON fields that contains dots (like .marvel-es-XX) are now correctly filtered
  • allows to filter _source in search response hits / get document response / multi get document response

@clintongormley
Copy link
Contributor

@tlrx note that fields can no longer contain embedded dots

@tlrx
Copy link
Member Author

tlrx commented Nov 9, 2015

@clintongormley Yes I know, but indices still can have dot in names and the response filtering must be able to filter them correctly

@danielmitterdorfer danielmitterdorfer self-assigned this Nov 9, 2015
/**
* Evaluates if a property name matches one of the given filter paths.
*/
private TokenFilter evaluate(String name, FilterPath[] filters) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do yo need the parameter filters? The only call of evaluatejust uses the field filters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I'll need this field in an upcoming pull request where filters will be removed in favour of includes and excludes filters so I'd like to keep it :)

@danielmitterdorfer
Copy link
Member

I left a few comments but apart from that LGTM. I am not sure whether somebody else wants to check so I am leaving the 'review' label for now. @tlrx Feel free to update it.

@danielmitterdorfer danielmitterdorfer removed their assignment Nov 9, 2015
@tlrx
Copy link
Member Author

tlrx commented Nov 10, 2015

@danielmitterdorfer Thanks for your review! I updated the code according to your comments about static/final. I'd like to keep the null check for now if that's fine for you.

I'll look for someone else to review it.

if (generator instanceof FilteringGeneratorDelegate) {
return (FilteringGeneratorDelegate) generator;
}
return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not too happy with those two methods. Could the constructor take those sub generators explicitly instead? Or maybe the constructor could take JsonGenerator jsonGenerator, String... filters and create them in the constructor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the constructor take those sub generators explicitly instead?

GeneratorBase is only used internally within this class and having a constructor that takes all the sub generators would exposes it to the outside.

Or maybe the constructor could take JsonGenerator jsonGenerator, String... filters and create them in the constructor?

I'd like to keep the filters out of this class as much as possible and let the *XContent classes instanciate everything is needed... As an alternative what do you think of:

    public JsonXContentGenerator(JsonGenerator generator) {
        this.generator = generator;

        if (generator instanceof FilteringGeneratorDelegate) {
            this.filter = (FilteringGeneratorDelegate) generator;
        } else {
            this.filter = null;
        }

        if (generator instanceof GeneratorBase) {
            this.base = (GeneratorBase) generator;
        } else if (filter != null && filter.getDelegate() instanceof GeneratorBase) {
            this.base = (GeneratorBase) filter.getDelegate();
        } else {
            this.base = null;
        }
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still unhappy about this: if for some reason you need to add another wrapper, it could hide the FilteringGeneratorDelegate and JsonXContentGenerator wouldn't know anymore that the content is filtered. I'd rather like something that does not depend on "instanceof".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, now I understand your point... It makes sense and I'm OK with both of your suggestions.

Would this be OK for you?

public JsonXContentGenerator(JsonGenerator jsonGenerator, String... filters) {
        if (jsonGenerator instanceof GeneratorBase) {
            this.base = (GeneratorBase) jsonGenerator;
        } else {
            this.base = null;
        }

        if (CollectionUtils.isEmpty(filters)) {
            this.filter = null;
            this.generator = jsonGenerator;
        } else {
            this.filter = new FilteringGeneratorDelegate(jsonGenerator, new FilterPathBasedFilter(filters), true, true);
            this.generator = this.filter;
        }
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

@jpountz
Copy link
Contributor

jpountz commented Nov 16, 2015

I left some comments but I like the change in general, eg. how it handles raw fields.

@tlrx
Copy link
Member Author

tlrx commented Nov 16, 2015

@jpountz I updated the code following your comments, can you give it another look please?

@jpountz
Copy link
Contributor

jpountz commented Nov 16, 2015

LGTM

@tlrx tlrx force-pushed the filter-path-refactoring branch from 6a9fffb to d538f0d Compare November 16, 2015 16:28
@tlrx tlrx merged commit d538f0d into elastic:master Nov 16, 2015
@tlrx tlrx removed the review label Nov 16, 2015
@tlrx tlrx removed the v2.1.0 label Nov 16, 2015
@tlrx
Copy link
Member Author

tlrx commented Nov 16, 2015

@jpountz thanks!

@reinaldoluckman
Copy link

At this point in ES 2.1.0, there is some way to make a search like

http://localhost:9200/twitter/tweet/_search?q=message:elastic

And receive an array of _source's without any metadata like

[{
"user" : "yoyo",
"postDate" : "2009-11-15T14:12:12",
"message" : "Elastic is fun"
}, {
"user" : "bulgogi",
"postDate" : "2009-11-15T14:12:12",
"message" : "Elastic Search is cool"
}, {
"user" : "kimchy",
"postDate" : "2009-11-15T14:12:12",
"message" : "trying out Elastic Search"
}]

@tlrx
Copy link
Member Author

tlrx commented Dec 7, 2015

For now, a GET /_search?filter_path=hits.hits._source will return only the _source of your document, containing all the fields and without metadata fields like _id, _index etc. Note that the structure of the JSON will be unchanged.

@tlrx tlrx added the v2.2.0 label Dec 11, 2015
@tlrx tlrx deleted the filter-path-refactoring branch May 19, 2016 09:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add escaping for JSON response body filtering
5 participants