-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨(cli) add database backend switch #200
✨(cli) add database backend switch #200
Conversation
attachments: Optional[bool] = False | ||
ascending: Optional[bool] = False | ||
search_after: Optional[str] = None | ||
pit_id: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not using Pydantic models?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems query parameters are already validated by FastAPI, I would like to avoid validating them twice
def query_statements(self, params: StatementParameters) -> StatementQueryResult: | ||
"""Returns the results of a statements query using xAPI parameters.""" | ||
|
||
raise Exception("TODO") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise Exception("TODO") | |
raise NotImplementedError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good Idea) Thanks) these methods will be implemented in this pull request)
src/ralph/cli.py
Outdated
reload=True, | ||
) | ||
|
||
with NamedTemporaryFile(mode="w+", encoding="utf-8") as temp_env_file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why opening this file in w+
mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well spotted! Thanks) Indeed, just w
mode is enough) I thought w+
would be needed for reading in this case, but it's not true)
60c9963
to
670a241
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
@abstractmethod | ||
def query_statements_by_ids(self, ids: list[str]) -> list: | ||
"""Returns the list of matching statement IDs from the database.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not modifying the BaseDatabase
model instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good Idea) Initially, I wanted to keep these methods separate as they are specific to the API route, however, putting them in BaseDatabase indeed simplifies a lot) Thanks) Updated this part)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I think it will be easier to maintain. 👍
|
||
return self.client.search( # pylint: disable=unexpected-keyword-arg | ||
body={"query": {"terms": {"_id": ids}}} | ||
)["hits"]["hits"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And thus why not implementing this in the ESDatabase
class?
def query_statements_by_ids(self, ids: list[str]) -> list: | ||
"""Returns the list of matching statement IDs from the database.""" | ||
|
||
return list(self.collection.find(filter={"_source.id": {"$in": ids}})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there! Keep it up! 💪
src/ralph/api/routers/statements.py
Outdated
|
||
|
||
@router.get("/") | ||
# pylint: disable=too-many-arguments, too-many-locals | ||
async def get( | ||
request: Request, | ||
backend: BaseDatabase = Depends(get_backend), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this backend parameter should be a global setting instead, see related documentation: https://fastapi.tiangolo.com/advanced/settings/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point) Could we move to pydantic settings management (#168) in a separate pull request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure we can!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated this part) We now use a global DATABASE_CLIENT
variable in statements.py
, which retrieves the database backend from pydantic settings.
src/ralph/api/routers/statements.py
Outdated
success_count = ES_CLIENT.put(statements_dict.values(), ignore_errors=False) | ||
except BulkIndexError as exc: | ||
success_count = backend.put(statements_dict.values(), ignore_errors=False) | ||
except (BulkIndexError, BulkWriteError, BadFormatException) as exc: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should catch backend-specific errors in the backend class instead and raise our own exceptions that will be catched here. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good Idea) Thanks) updated this part in the last commit)
|
||
database = getattr(mongo_client, MONGO_TEST_DATABASE) | ||
collection = getattr(database, MONGO_TEST_COLLECTION) | ||
collection.insert_many(list(MongoDatabase.to_documents(statements))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big fan of this trick 😉 Why not having a single insert_statements(client, statements)
and test if the client
is a mongo or ES client instance to switch between methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great Idea) Indeed, this isn't ideal, I tried to put the fixture as a parameter and stumbled on an pytest issue( However, it suggests a workaround using a parametrized fixture. Updated the tests with their approach)
tests/api/test_statements_get.py
Outdated
@pytest.mark.parametrize( | ||
"get_backend_override,insert_statements", | ||
[ | ||
(get_backend_override_with_elasticsearch, insert_es_statements), | ||
(get_backend_override_with_mongodb, insert_mongo_statements), | ||
], | ||
) | ||
def test_api_statements_get_statements( | ||
get_backend_override, insert_statements, auth_credentials, es, mongo | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pytest.mark.parametrize( | |
"get_backend_override,insert_statements", | |
[ | |
(get_backend_override_with_elasticsearch, insert_es_statements), | |
(get_backend_override_with_mongodb, insert_mongo_statements), | |
], | |
) | |
def test_api_statements_get_statements( | |
get_backend_override, insert_statements, auth_credentials, es, mongo | |
): | |
@pytest.mark.parametrize( | |
"get_backend_override,client", | |
[ | |
(get_backend_override_with_elasticsearch, es), | |
(get_backend_override_with_mongodb, mongo), | |
], | |
) | |
def test_api_statements_get_statements( | |
get_backend_override, client, auth_credentials, | |
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/ralph/defaults.py
Outdated
# Backends | ||
|
||
DATABASE_BACKENDS = [backend.value for backend in DatabaseBackends] | ||
PARSERS = [parser.value for parser in Parsers] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you place it elsewhere as it is not backends ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good Idea! Thanks) This has been updated in the #204 PR)
src/ralph/api/routers/statements.py
Outdated
|
||
envvar = f"{ENVVAR_PREFIX}_{backend_class.name}_{parameter.name}".upper() | ||
value = config(envvar, None) | ||
if value is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure of what I will write is trivial or not. But are we sure that when a parameter has the None value in the configuration but is requested, it won't fail as it will be missing in options
value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! Thanks) This part has now changed, we use pydantic settings for configuration and made all backend init arguments optional (by using defaults set by configuration).
One problem might arise if the user intends to pass None
explicitly for a cli option value to overwrite a default, however, I'm not sure whether this is permitted in our current implementation.
0746099
to
254ca36
Compare
Although the host and port of the ralph server are already configurable using environment variables, it might be useful to be able to pass these values directly on the command line.
We want to be able to switch between different database backends using CLI arguments or environment variables in the ralph LRS API.
254ca36
to
7718603
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that Ralph supports two database backends for its LRS API, to simplify catching various exceptions thrown by the backends during querying, we choose to wrap them as BackendExceptions.
263e715
to
5cc2c43
Compare
Purpose
We want to be able to switch between different database backends using CLI arguments, environment variables, or configuration values in the ralph LRS API.
Proposal