Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Distinct Filter for Python ICAT Backend #161

Merged
merged 14 commits into from
Oct 1, 2020

Conversation

MRichards99
Copy link
Collaborator

This PR closes #141.

This adds functionality for the distinct filter. Like the database backend, either a single field can be used, or a list of fields can be entered to be "distinct". Multiple distinct filters can be used in a single request without them overwriting each other, which I didn't expect I could do at the start of looking at this issue, so that's good :)

Since Python ICAT only really supports a single field to be distinct (through the use of setAttribute() and setting DISTINCT using setAggregate()), I had to get a little creative. I used setAggregate() to add the DISTINCT part to the JPQL query, but to set which fields are distinct, I build WHERE filters within a distinct filter (one where filter per field). The WHERE filters are set to look for anything that isn't null so it'll pick up the data. The data returned from the JPQL query has all the attributes of an entity so the fields and data specified in the distinct filter are picked out and returned in the response. Having done identical requests on both backends, this seems to return exactly the same data, in the same format.

Within common/icat/helpers.py, I've moved construct_icat_query() and execute_icat_query() into its own class as it seems to make sense to do so. I made this change largely because I added a function called check_attribute_name_for_distinct() which is only used in execute_icat_query() so it seemed to make sense to move all 3 functions into their own class. Thinking about it now, I wonder if the icat_query class should go in its own file, any thoughts?

There's also a bug fix for the WHERE filter - this is explained within the commit 2aea37c.

- This changes the output of the data to a list of values, rather than the traditional list of dictionaries, containing attribute name and its value
- The primary used for this operation will be to add WHERE filters when a distinct filter is in a request, so distinct filters can accomodate multiple fields
…from helper side

- This requires the addition of creating WHERE filters within a distinct filter to deal with multiple fields within a single distinct filter
- Inside a PythonICATDistinctFieldFilter, a where filter is created for each field in the request. These are then searched for in execute_icat_query() and the data is compiled to only include data from those fields
- Combine some of the logic with the existing logic where `return_json_formattable` is True - this will ensure that datetimes are converted to an appropriate format when using distinct filters
- This commit also adds checking of the query's conditions to only select attribute names of those which are "!= null" as set by the distinct filter
- Turned the two functions into a class, since these are related operations
- This change has been made so I can add another function to this class later on
…pping WHERE filter

- This change means that if there's a distinct filter of an attribute, and a WHERE filter specifying a condition of the same attribute, the distinct filter will act as intended (where before the data of that attribute wasn't added to the response)
- This change fixes a bug where if a value used for a WHERE filter with an IN operation had only a single element in a list, it would cause a JPQL error. This happened because the list was converted into a tuple (to satisfy JPQL formatting of array based data). With single element tuples, Python adds a trailing comma, something which is invalid in JPQL. The value is now converted into a string, with the square brackets being replaced with normal brackets before being sent off to JPQL.
- Also added type checking on the value if an IN operation is used to ensure the value is a list
@MRichards99 MRichards99 changed the title Feature/icat distinct filter #141 Implement Distinct Filter for Python ICAT Backend Aug 27, 2020
@MRichards99
Copy link
Collaborator Author

Fixed merge conflict for this PR.

@MRichards99 MRichards99 linked an issue Sep 11, 2020 that may be closed by this pull request
@MRichards99
Copy link
Collaborator Author

In my description regarding this PR, I questioned whether the icat_query class should be in its own file or not. Since working on the include filter, I decided this would be a good idea to move it out (because the class was getting a bit long and was starting to clog up common/icat/helpers.py) and rename the class to ICATQuery too. These changes have been pushed to this branch, as per: 4f84e52

@MRichards99 MRichards99 changed the base branch from feature/icat-limit-skip-filters-#139 to feature/python-icat-where-filter-#142 September 18, 2020 07:51
@MRichards99
Copy link
Collaborator Author

Changed base branch due to merging limit/skip filters into the where filter branch

Copy link
Member

@louise-davies louise-davies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I try to add a where filter to a field which has been specified to be distinct, I get "No results found". This would be required to filter the Diamond Proposal table, which uses distinct filters to get investigation names & titles and you can filter on names and titles as well.

e.g. the following request does not work:

http://localhost:500-/investigations?distinct=["name", "title"]&where={"title": {"like": "dog"}}

@MRichards99
Copy link
Collaborator Author

@louise-davies:
Having had a play in a test Python ICAT script I have, it seems like an issue with Python ICAT. The like operator seems a lot more strict than it does in the DB backend. For example if I have a play with names that have the value of INVESTIGATION 1:

http://{{datagateway-api}}/investigations?where={"name": {"like":"INVESTIGATION 1"}} - this returns 1 result of my test data, as I'd expect
http://{{datagateway-api}}/investigations?where={"name": {"like":"INVESTIGATION"}} - this returns a 404 with "no results found".

You'd have thought the second one is 'like' the piece of data I'm after (i.e. INVESTIGATION 1) and would therefore return some results, but that doesn't seem to be happening.

As for the overarching feature required for Diamond, I do believe this works:
http://{{datagateway-api}}/investigations?distinct=["name", "title"]&where={"name": {"eq":"INVESTIGATION 1"}} - this returns the following which is correct for the test data I have:

[
    {
        "title": "Pay thus factor economic out. Former act and college.\nNumber keep debate trouble box. Area collection quality change. National old focus.\nStrong fall you prepare.",
        "name": "INVESTIGATION 1"
    }
]

As described above, this seems to be a LIKE operator issue. Right now, I'm a little bit stumped about what to do (Python ICAT deals with the LIKE operator itself, I just parse the value from the request), but hopefully I can think of something for next week.

@MRichards99 MRichards99 changed the base branch from feature/python-icat-where-filter-#142 to master September 28, 2020 07:50
@louise-davies
Copy link
Member

@MRichards99 - this might require adding % around all strings passed to the like operator - I know that this is what topcat does to do wildcard searches in it's queries so perhaps that is what is missing. e.g. using like: %dog% vs just like: dog

- This should allow these types of searches to provide more accurate results, where previously none could be found
@MRichards99
Copy link
Collaborator Author

Just made the change and it seems to work! Have a go and see what you think :)

@MRichards99 MRichards99 merged commit e419d8c into master Oct 1, 2020
@MRichards99 MRichards99 deleted the feature/icat-distinct-filter-#141 branch October 1, 2020 11:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

implement distinct filter in python-icat
2 participants