Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Skip/Limit Filters for Python ICAT Backend #160

Conversation

MRichards99
Copy link
Collaborator

This PR will close #139.

This PR adds the functionality of limit and skip filters in get by filter requests. These two filters can be used on their own in a request, or they can be used together. When they're used together, the skip filter is merged into the limit filter, with the skip filter being removed from the filter handler.

When using the skip filter on its own, the maximum number of entities from ICAT properties is used. Stuart and I discovered these properties can get grabbed from ICAT_URL/icat/properties or from client.getProperties() within Python ICAT. The docstring within common.config.get_icat_properties explains why I send a HTTP request rather than using Python ICAT to get this data.

All unit tests that passed before this change, still pass. The 4 failures are related to /sessions endpoints and will be fixed when I adapt the unit tests to run on the Python ICAT backend.

- The skip filter will be merged into the limit filter, with the skip filter being removed from the filter handler
- There is a bug when using a skip filter on it's own, as it'll always return a 404
- Grabs max num of entities from ICAT properties and applies that to the count element of the tuple - the currently configured number is 10000, which should be more than enough for any applicable use cases of this API
@MRichards99 MRichards99 linked an issue Aug 21, 2020 that may be closed by this pull request
@MRichards99
Copy link
Collaborator Author

Just resolved the merge conflicts with this PR

@louise-davies
Copy link
Member

I think this now needs to point at the where branch, as the order branch has already been merged in so it needs to go in with the where PR

@MRichards99 MRichards99 changed the base branch from feature/icat-order-filter-#140 to feature/python-icat-where-filter-#142 September 2, 2020 08:38
@MRichards99
Copy link
Collaborator Author

@louise-davies:

I think this now needs to point at the where branch, as the order branch has already been merged in so it needs to go in with the where PR

Good spot! Just updated it :)

Copy link
Member

@louise-davies louise-davies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good, however, when testing, if I just specify a skip filter I get the following error:

[2020-09-03 10:15:23,749] {helpers:helpers.py:get_session_id_from_auth_header:56} INFO - Getting session Id from auth header  
[2020-09-03 10:15:23,750] {helpers:helpers.py:get_filters_from_query_string:94} INFO - Getting filters from query string  
[2020-09-03 10:15:23,842] {helpers:helpers.py:wrapper_requires_session:43} INFO -Session time: 118  
[2020-09-03 10:15:24,009] {helpers:helpers.py:wrapper_gets_records:39} ERROR -Expecting value: line 1 column 1 (char 0)  
Traceback (most recent call last):
  File "C:\code\datagateway-api\common\helpers.py", line 34, in wrapper_gets_records
    return method(*args, **kwargs)
  File "C:\code\datagateway-api\common\icat\backend.py", line 77, in get_with_filters
    return get_entity_with_filters(self.client, table.__name__, filters)
  File "C:\code\datagateway-api\common\icat\helpers.py", line 388, in get_entity_with_filters
    filter_handler.apply_filters(query)
  File "C:\code\datagateway-api\common\filter_order_handler.py", line 34, in apply_filters
    filter.apply_filter(query)
  File "C:\code\datagateway-api\common\icat\filters.py", line 107, in apply_filter
    icat_properties = config.get_icat_properties()
  File "C:\code\datagateway-api\common\config.py", line 83, in get_icat_properties
    icat_properties = r.json()
  File "C:\Users\mnf98541\anaconda3\lib\site-packages\requests\models.py", line 898, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\Users\mnf98541\anaconda3\lib\json\__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "C:\Users\mnf98541\anaconda3\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\mnf98541\anaconda3\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
[2020-09-03 10:15:24,013] {_internal:_internal.py:_log:122} INFO -127.0.0.1 - - [03/Sep/2020 10:15:24] "GET /investigations?skip=1 HTTP/1.1" 400 -  

limit on it's own and skip and limit together are working fine for me though

@MRichards99
Copy link
Collaborator Author

@louise-davies I can't seem to recreate what you're seeing, but I believe it's being caused by a request being sent to ICAT to get some config properties (specifically converting the response to JSON/dict). What version of ICAT are you on (I'm on 4.9.1)? What happens if you send a request to YOUR_ICAT_URL/icat/properties? This is the response if I make a request to that URL:

{"maxEntities":10000,"lifetimeMinutes":120,"authenticators":[{"mnemonic":"simple","keys":[{"name":"username"},{"name":"password","hide":true}],"friendly":"Simple"}],"containerType":"Glassfish"}

@louise-davies
Copy link
Member

louise-davies commented Sep 3, 2020

I'm using the ICAT on scigateway-preprod, so https://scigateway-preprod.esc.rl.ac.uk:8181/icat/properties is the URL. The ICAT version I'm using is 4.10.0

My response is: {"maxEntities":10000,"lifetimeMinutes":120,"authenticators":[{"mnemonic":"simple","keys":[{"name":"username"},{"name":"password","hide":true}],"friendly":"Test authenticator"},{"mnemonic":"anon","keys":[],"friendly":"Anonymous"}],"containerType":"Glassfish"}

Oddly, I can query it fine in a Python REPL - I shall investigate more

@louise-davies
Copy link
Member

Ah, I realise that I have my ICAT_URL set to https://scigateway-preprod.esc.rl.ac.uk:8181/ICATService/ICAT?wsdl - which is why the request will fail. However, I get xml.sax._exceptions.SAXParseException: <unknown>:1:0: syntax error errors when I try to just give the base URL and not specifically the wsdl...

@MRichards99
Copy link
Collaborator Author

I get the same syntax error (on startup) if I use https://scigateway-preprod.esc.rl.ac.uk:8181/ but it seems to work (including an individual skip filter) if I use https://scigateway-preprod.esc.rl.ac.uk:8181

@louise-davies
Copy link
Member

Ah, just checked my python-icat version, and looking at Rolf's changelog is this: icatproject/python-icat#63

My bad - I'll upgrade python-icat and I presume everything will work fine!

@MRichards99
Copy link
Collaborator Author

For reference, pip freeze tells me I'm using 0.17.0 version of Python ICAT

common/config.py Outdated
Comment on lines 75 to 87
def get_icat_properties(self):
"""
ICAT properties can be retrieved using Python ICAT's client object, however this
requires the client object to be authenticated which may not always be the case
when requesting these properties, hence a HTTP request is sent as an alternative
"""
properties_url = f"{config.get_icat_url()}/icat/properties"
r = requests.request("GET", properties_url, verify=config.get_icat_check_cert())
icat_properties = r.json()
log.debug("ICAT Properties: %s", icat_properties)

return icat_properties

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there is a way to save this so that the request doesn't have to be sent each time a user sends a skip request - this type of information will only change rarely and so could probably be done on start up somehow/via a singleton pattern.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's a good point - I might put this code in constants.py because that should only run once at start up and then I can just call Constants.icat_properties whenever it's needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just pushed a change for this. If you put a print statement in get_icat_properties(), I saw it was being executed twice at start-up, and not at all when you send a GET request with a skip filter. It needs to be a print statement because I found get_icat_properties() runs before the logger has been set up.

@louise-davies
Copy link
Member

I was using python-icat-0.16.0, upgrading to 0.17 fixed the issue I was having :)

@MRichards99 MRichards99 merged commit 04db3d5 into feature/python-icat-where-filter-#142 Sep 18, 2020
@MRichards99 MRichards99 deleted the feature/icat-limit-skip-filters-#139 branch October 1, 2020 11:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement limit and skip filter for python-icat
2 participants