Skip to content

Commit

Permalink
Merge pull request #216 from ral-facilities/bugfix/client-cache-#209
Browse files Browse the repository at this point in the history
Improve Client Handling
  • Loading branch information
MRichards99 authored Apr 26, 2021
2 parents c3c607f + a643145 commit 99450bc
Show file tree
Hide file tree
Showing 17 changed files with 489 additions and 186 deletions.
68 changes: 66 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ sqlalchemy to communicate directly with ICAT's database.
- [Database Backend](#database-backend)
- [Mapped Classes](#mapped-classes)
- [Python ICAT Backend](#python-icat-backend)
- [Client Handling](#client-handling)
- [ICATQuery](#icatquery)
- [Generating the OpenAPI Specification](#generating-the-openapi-specification)
- [Utilities](#utilities)
Expand Down Expand Up @@ -658,11 +659,74 @@ the API supporting multiple authentication mechanisms. Meta attributes such as `
are dealt by Python ICAT, rather than the API.


### Client Handling
Python ICAT uses
[client objects](https://python-icat.readthedocs.io/en/stable/client.html) to
authenticate users and provide interaction to ICAT (e.g. querying icatdb). A client
object has a high creation cost (often taking several seconds), so it's unsuitable to
create a new client object at the start of each request. In a similar vein, it would
also be unsuitable to use a single client object for the entire API due to collisions
between different users.

Client objects are handled using an
[LRU cache](https://docs.python.org/3/library/functools.html#functools.lru_cache),
fetching clients from an [object pool](https://object-pool.readthedocs.io/en/latest/)
when a new client is requested for the cache.

#### Caching
The cache is extended from Cachetools' implementation (although the documentation for
the builtin LRU cache is more detailed, hence that's linked above) to allow for a client
object to be placed back into the object pool once it becomes 'least recently used' and
therefore is removed from the cache (in place of another item). Each cache item is
differentiated by the arguments of the function it's applied to which in this case is
the session ID. The client pool object is also passed into the function, but this is a
singleton object (mandated by the library it's implemented from) so this won't change
throughout the lifetime of the API.

#### Pooling
The object pool has an initial pool size that will be created at startup, and a maximum
size that the pool can grow to if needed, where both values are configurable. The
clients within the pool do not expire and have unlimited reuses, so clients created at
startup can be used for the lifespan of the API. Python ICAT's `Client` class is
extended (to `ICATClient`) to rename `cleanup()` to a function name that the object pool
will recognise to clean up resources and will disable the auto logout feature to prevent
sessions from being logged out when the client is reused.

#### Attributes of the Design
Combining caching and pooling into one design gives the following high-level results.
There is a 1 client to 1 session ID ratio, which will prevent collision between users
and doesn't require an excessive amount of resources (such as a 1 client to 1 request
ratio would). Since the object pool is created at startup, this design can cause the API
to be slow to start as the pool of object needs to be created. A rough guide would be to
multiply the configured initial pool size by around 5 or 6 seconds to get a time
estimate for pool creation.

#### Configuring Client Handling
When configuring the cache size and the client pool, the following should be considered.
The pool's max size should be configured to the maximum number of concurrent users
expected for the API. The cache size must not exceed the pool's maximum size. If
this does happen, the cache could attempt to acquire a client from an empty pool that
cannot grow, causing the request to never respond because the API will wait
indefinitely. The pool's initial size should be configured to strike a balance of
reasonable startup time and not slowing down requests when the pool grows beyond its
initial size. NOTE: when the pool exceeds the initial size and a client is requested by
the cache, a client is created on the fly, so that request (and any others sent before
the client is created and in the cache) WILL be slow. For development, the following
settings (as also set in the example config) would allow for an acceptable startup time
but allow for multiple session IDs to be used if required.

```json
"client_cache_size": 5,
"client_pool_init_size": 2,
"client_pool_max_size": 5,
```


### ICATQuery
The ICATQuery classed is in `datagateway_api.common.icat.query`. This class stores a
query created with Python ICAT
[documentation for the query](https://python-icat.readthedocs.io/en/stable/query.html).
The `execute_query()` function executes the query and returns either results in either a
([documentation](https://python-icat.readthedocs.io/en/stable/query.html)). The
`execute_query()` function executes the query and returns either results in either a
JSON format, or a list of
[Python ICAT entity's](https://python-icat.readthedocs.io/en/stable/entity.html) (this
is defined using the `return_json_formattable` flag). Other functions within that class
Expand Down
4 changes: 3 additions & 1 deletion datagateway_api/common/backend.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,9 @@ class Backend(ABC):
def login(self, credentials):
"""
Attempt to log a user in using the provided credentials
:param credentials: The user's credentials (including mechanism)
:param credentials: The user's credentials (including mechanism). Credentials
should take the following format in JSON:
{ username: "value", password: "value", mechanism: "value"}
:returns: a session ID
"""
pass
Expand Down
18 changes: 18 additions & 0 deletions datagateway_api/common/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,24 @@ def set_backend_type(self, backend_type):
"""
self.config["backend"] = backend_type

def get_client_cache_size(self):
try:
return self.config["client_cache_size"]
except KeyError:
sys.exit("Missing config value, client_cache_size")

def get_client_pool_init_size(self):
try:
return self.config["client_pool_init_size"]
except KeyError:
sys.exit("Missing config value, client_pool_init_size")

def get_client_pool_max_size(self):
try:
return self.config["client_pool_max_size"]
except KeyError:
sys.exit("Missing config value, client_pool_max_size")

def get_db_url(self):
try:
return self.config["DB_URL"]
Expand Down
32 changes: 16 additions & 16 deletions datagateway_api/common/database/backend.py
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,7 @@ class DatabaseBackend(Backend):
Class that contains functions to access and modify data in an ICAT database directly
"""

def login(self, credentials):
def login(self, credentials, **kwargs):
if credentials["username"] == "user" and credentials["password"] == "password":
session_id = str(uuid.uuid1())
insert_row_into_table(
Expand All @@ -48,84 +48,84 @@ def login(self, credentials):
raise AuthenticationError("Username and password are incorrect")

@requires_session_id
def get_session_details(self, session_id):
def get_session_details(self, session_id, **kwargs):
return get_row_by_id(SESSION, session_id).to_dict()

@requires_session_id
def refresh(self, session_id):
def refresh(self, session_id, **kwargs):
return session_id

@requires_session_id
@queries_records
def logout(self, session_id):
def logout(self, session_id, **kwargs):
return delete_row_by_id(SESSION, session_id)

@requires_session_id
@queries_records
def get_with_filters(self, session_id, entity_type, filters):
def get_with_filters(self, session_id, entity_type, filters, **kwargs):
table = get_entity_object_from_name(entity_type)
return get_rows_by_filter(table, filters)

@requires_session_id
@queries_records
def create(self, session_id, entity_type, data):
def create(self, session_id, entity_type, data, **kwargs):
table = get_entity_object_from_name(entity_type)
return create_rows_from_json(table, data)

@requires_session_id
@queries_records
def update(self, session_id, entity_type, data):
def update(self, session_id, entity_type, data, **kwargs):
table = get_entity_object_from_name(entity_type)
return patch_entities(table, data)

@requires_session_id
@queries_records
def get_one_with_filters(self, session_id, entity_type, filters):
def get_one_with_filters(self, session_id, entity_type, filters, **kwargs):
table = get_entity_object_from_name(entity_type)
return get_first_filtered_row(table, filters)

@requires_session_id
@queries_records
def count_with_filters(self, session_id, entity_type, filters):
def count_with_filters(self, session_id, entity_type, filters, **kwargs):
table = get_entity_object_from_name(entity_type)
return get_filtered_row_count(table, filters)

@requires_session_id
@queries_records
def get_with_id(self, session_id, entity_type, id_):
def get_with_id(self, session_id, entity_type, id_, **kwargs):
table = get_entity_object_from_name(entity_type)
return get_row_by_id(table, id_).to_dict()

@requires_session_id
@queries_records
def delete_with_id(self, session_id, entity_type, id_):
def delete_with_id(self, session_id, entity_type, id_, **kwargs):
table = get_entity_object_from_name(entity_type)
return delete_row_by_id(table, id_)

@requires_session_id
@queries_records
def update_with_id(self, session_id, entity_type, id_, data):
def update_with_id(self, session_id, entity_type, id_, data, **kwargs):
table = get_entity_object_from_name(entity_type)
return update_row_from_id(table, id_, data)

@requires_session_id
@queries_records
def get_facility_cycles_for_instrument_with_filters(
self, session_id, instrument_id, filters,
self, session_id, instrument_id, filters, **kwargs,
):
return get_facility_cycles_for_instrument(instrument_id, filters)

@requires_session_id
@queries_records
def get_facility_cycles_for_instrument_count_with_filters(
self, session_id, instrument_id, filters,
self, session_id, instrument_id, filters, **kwargs,
):
return get_facility_cycles_for_instrument_count(instrument_id, filters)

@requires_session_id
@queries_records
def get_investigations_for_instrument_facility_cycle_with_filters(
self, session_id, instrument_id, facilitycycle_id, filters,
self, session_id, instrument_id, facilitycycle_id, filters, **kwargs,
):
return get_investigations_for_instrument_in_facility_cycle(
instrument_id, facilitycycle_id, filters,
Expand All @@ -134,7 +134,7 @@ def get_investigations_for_instrument_facility_cycle_with_filters(
@requires_session_id
@queries_records
def get_investigation_count_instrument_facility_cycle_with_filters(
self, session_id, instrument_id, facilitycycle_id, filters,
self, session_id, instrument_id, facilitycycle_id, filters, **kwargs,
):
return get_investigations_for_instrument_in_facility_cycle_count(
instrument_id, facilitycycle_id, filters,
Expand Down
64 changes: 31 additions & 33 deletions datagateway_api/common/icat/backend.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,9 @@
from datagateway_api.common.exceptions import AuthenticationError
from datagateway_api.common.helpers import queries_records
from datagateway_api.common.icat.helpers import (
create_client,
create_entities,
delete_entity_by_id,
get_cached_client,
get_count_with_filters,
get_entity_by_id,
get_entity_with_filters,
Expand Down Expand Up @@ -37,9 +37,13 @@ class PythonICATBackend(Backend):
def __init__(self):
pass

def login(self, credentials):
def login(self, credentials, **kwargs):
log.info("Logging in to get session ID")
client = create_client()
client_pool = kwargs.get("client_pool")

# There is no session ID required for this endpoint, a client object will be
# fetched from cache with a blank `sessionId` attribute
client = get_cached_client(None, client_pool)

# Syntax for Python ICAT
login_details = {
Expand All @@ -48,109 +52,103 @@ def login(self, credentials):
}
try:
session_id = client.login(credentials["mechanism"], login_details)
# Flushing client's session ID so the session ID returned in this request
# won't be logged out next time `client.login()` is used in this function.
# `login()` calls `self.logout()` if `sessionId` is set
client.sessionId = None

return session_id
except ICATSessionError:
raise AuthenticationError("User credentials are incorrect")

@requires_session_id
def get_session_details(self, session_id, **kwargs):
log.info("Getting session details for session: %s", session_id)
client = kwargs["client"] if kwargs["client"] else create_client()
return get_session_details_helper(client)
return get_session_details_helper(kwargs.get("client"))

@requires_session_id
def refresh(self, session_id, **kwargs):
log.info("Refreshing session: %s", session_id)
client = kwargs["client"] if kwargs["client"] else create_client()
return refresh_client_session(client)
return refresh_client_session(kwargs.get("client"))

@requires_session_id
@queries_records
def logout(self, session_id, **kwargs):
log.info("Logging out of the Python ICAT client")
client = kwargs["client"] if kwargs["client"] else create_client()
return logout_icat_client(client)
return logout_icat_client(kwargs.get("client"))

@requires_session_id
@queries_records
def get_with_filters(self, session_id, entity_type, filters, **kwargs):
client = kwargs["client"] if kwargs["client"] else create_client()
return get_entity_with_filters(client, entity_type, filters)
return get_entity_with_filters(kwargs.get("client"), entity_type, filters)

@requires_session_id
@queries_records
def create(self, session_id, entity_type, data, **kwargs):
client = kwargs["client"] if kwargs["client"] else create_client()
return create_entities(client, entity_type, data)
return create_entities(kwargs.get("client"), entity_type, data)

@requires_session_id
@queries_records
def update(self, session_id, entity_type, data, **kwargs):
client = kwargs["client"] if kwargs["client"] else create_client()
return update_entities(client, entity_type, data)
return update_entities(kwargs.get("client"), entity_type, data)

@requires_session_id
@queries_records
def get_one_with_filters(self, session_id, entity_type, filters, **kwargs):
client = kwargs["client"] if kwargs["client"] else create_client()
return get_first_result_with_filters(client, entity_type, filters)
return get_first_result_with_filters(kwargs.get("client"), entity_type, filters)

@requires_session_id
@queries_records
def count_with_filters(self, session_id, entity_type, filters, **kwargs):
client = kwargs["client"] if kwargs["client"] else create_client()
return get_count_with_filters(client, entity_type, filters)
return get_count_with_filters(kwargs.get("client"), entity_type, filters)

@requires_session_id
@queries_records
def get_with_id(self, session_id, entity_type, id_, **kwargs):
client = kwargs["client"] if kwargs["client"] else create_client()
return get_entity_by_id(client, entity_type, id_, True)
return get_entity_by_id(kwargs.get("client"), entity_type, id_, True)

@requires_session_id
@queries_records
def delete_with_id(self, session_id, entity_type, id_, **kwargs):
client = kwargs["client"] if kwargs["client"] else create_client()
return delete_entity_by_id(client, entity_type, id_)
return delete_entity_by_id(kwargs.get("client"), entity_type, id_)

@requires_session_id
@queries_records
def update_with_id(self, session_id, entity_type, id_, data, **kwargs):
client = kwargs["client"] if kwargs["client"] else create_client()
return update_entity_by_id(client, entity_type, id_, data)
return update_entity_by_id(kwargs.get("client"), entity_type, id_, data)

@requires_session_id
@queries_records
def get_facility_cycles_for_instrument_with_filters(
self, session_id, instrument_id, filters, **kwargs,
):
client = kwargs["client"] if kwargs["client"] else create_client()
return get_facility_cycles_for_instrument(client, instrument_id, filters)
return get_facility_cycles_for_instrument(
kwargs.get("client"), instrument_id, filters,
)

@requires_session_id
@queries_records
def get_facility_cycles_for_instrument_count_with_filters(
self, session_id, instrument_id, filters, **kwargs,
):
client = kwargs["client"] if kwargs["client"] else create_client()
return get_facility_cycles_for_instrument_count(client, instrument_id, filters)
return get_facility_cycles_for_instrument_count(
kwargs.get("client"), instrument_id, filters,
)

@requires_session_id
@queries_records
def get_investigations_for_instrument_facility_cycle_with_filters(
self, session_id, instrument_id, facilitycycle_id, filters, **kwargs,
):
client = kwargs["client"] if kwargs["client"] else create_client()
return get_investigations_for_instrument_in_facility_cycle(
client, instrument_id, facilitycycle_id, filters,
kwargs.get("client"), instrument_id, facilitycycle_id, filters,
)

@requires_session_id
@queries_records
def get_investigation_count_instrument_facility_cycle_with_filters(
self, session_id, instrument_id, facilitycycle_id, filters, **kwargs,
):
client = kwargs["client"] if kwargs["client"] else create_client()
return get_investigations_for_instrument_in_facility_cycle_count(
client, instrument_id, facilitycycle_id, filters,
kwargs.get("client"), instrument_id, facilitycycle_id, filters,
)
Loading

0 comments on commit 99450bc

Please sign in to comment.