✨(cli) add database backend switch #200

Merged
merged 3 commits into from
Aug 26, 2022

Conversation

Collaborator

@SergioSim SergioSim commented Jun 24, 2022

Purpose

We want to be able to switch between different database backends using CLI arguments, environment variables, or configuration values in the ralph LRS API.

Proposal

  • Refactor ES Backend usage in the API routes
  • Add MongoDB support
  • Add Tests

attachments: Optional[bool] = False
ascending: Optional[bool] = False
search_after: Optional[str] = None
pit_id: Optional[str] = None
Contributor

Why not use Pydantic models?

Collaborator Author

It seems query parameters are already validated by FastAPI, and I would like to avoid validating them twice.

def query_statements(self, params: StatementParameters) -> StatementQueryResult:
    """Returns the results of a statements query using xAPI parameters."""

    raise Exception("TODO")
Contributor

@jmaupetit jmaupetit Jun 24, 2022

Suggested change
- raise Exception("TODO")
+ raise NotImplementedError

Collaborator Author

Good idea, thanks! These methods will be implemented in this pull request.
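
To illustrate why the suggested change helps, here is a minimal sketch, not the code from this pull request. `BaseDatabase` below is a simplified stand-in and `safe_query` is a hypothetical helper. With a dedicated exception type, a caller can handle only the "not implemented yet" case, which a bare `Exception("TODO")` does not allow.

```python
class BaseDatabase:
    """Simplified stand-in for a backend base class (assumption)."""

    def query_statements(self, params: dict) -> list:
        """Return the statements matching the given parameters."""
        # Signals a missing implementation with a dedicated built-in type.
        raise NotImplementedError


def safe_query(database: BaseDatabase, params: dict) -> str:
    """Hypothetical helper: report whether the backend supports queries."""
    try:
        database.query_statements(params)
    except NotImplementedError:
        # Only the missing-implementation case is handled here;
        # unrelated errors still propagate to the caller.
        return "unsupported"
    return "ok"
```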

src/ralph/cli.py Outdated
reload=True,
)

with NamedTemporaryFile(mode="w+", encoding="utf-8") as temp_env_file:
Contributor

Why open this file in w+ mode?

Collaborator Author

Well spotted, thanks! Indeed, w mode is enough. I thought w+ would be needed for reading in this case, but it is not.
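
A small sketch of the point made in this thread, assuming a POSIX system (on Windows, reopening a `NamedTemporaryFile` by name while it is still open behaves differently). The function name `write_env_file` and the key used in the usage note are hypothetical. The writing handle only needs `w`, because any reader opens the file separately through its path.

```python
from pathlib import Path
from tempfile import NamedTemporaryFile


def write_env_file(settings: dict) -> str:
    """Write KEY=value lines to a temp file and return what a reader sees."""
    # "w" is enough: this handle only writes. "w+" would matter only if the
    # same handle also had to read the content back.
    with NamedTemporaryFile(mode="w", encoding="utf-8") as temp_env_file:
        for key, value in settings.items():
            temp_env_file.write(f"{key}={value}\n")
        # Flush so that a reader opening the file by name sees the content.
        temp_env_file.flush()
        # A separate reader (for example a server loading an env file)
        # opens the path on its own.
        return Path(temp_env_file.name).read_text(encoding="utf-8")
```

For example, `write_env_file({"EXAMPLE_HOST": "0.0.0.0"})` returns the single line written to the file.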

@SergioSim SergioSim force-pushed the add-database-backend-switch branch 2 times, most recently from 60c9963 to 670a241 Compare June 28, 2022 14:50
@SergioSim SergioSim marked this pull request as ready for review June 28, 2022 14:57
Contributor

@jmaupetit jmaupetit left a comment

@abstractmethod
def query_statements_by_ids(self, ids: list[str]) -> list:
    """Returns the list of matching statement IDs from the database."""
Contributor

Why not modify the BaseDatabase model instead?

Collaborator Author

Good idea! Initially, I wanted to keep these methods separate as they are specific to the API route. However, putting them in BaseDatabase indeed simplifies things a lot. Thanks, I updated this part.

Contributor

Thanks, I think it will be easier to maintain. 👍
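
As a rough illustration of the outcome of this thread (a sketch under assumptions, not the merged code): declaring the query method as abstract on the base class forces every backend to provide it. `InMemoryDatabase` is a hypothetical backend used only to keep the example self-contained.

```python
from abc import ABC, abstractmethod


class BaseDatabase(ABC):
    """Simplified stand-in for the shared backend interface."""

    @abstractmethod
    def query_statements_by_ids(self, ids: list[str]) -> list:
        """Return the statements whose ID is in the given list."""


class InMemoryDatabase(BaseDatabase):
    """Hypothetical backend that keeps documents in a plain list."""

    def __init__(self, documents: list[dict]):
        self.documents = documents

    def query_statements_by_ids(self, ids: list[str]) -> list:
        wanted = set(ids)
        return [doc for doc in self.documents if doc["id"] in wanted]
```

Instantiating `BaseDatabase` directly, or a subclass that forgets the method, raises `TypeError`, so a missing implementation is caught when the backend is created rather than at query time.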


return self.client.search(  # pylint: disable=unexpected-keyword-arg
    body={"query": {"terms": {"_id": ids}}}
)["hits"]["hits"]
Contributor

And thus, why not implement this in the ESDatabase class?

def query_statements_by_ids(self, ids: list[str]) -> list:
    """Returns the list of matching statement IDs from the database."""

    return list(self.collection.find(filter={"_source.id": {"$in": ids}}))
Contributor

ditto

Contributor

@jmaupetit jmaupetit left a comment

Almost there! Keep it up! 💪


@router.get("/")
# pylint: disable=too-many-arguments, too-many-locals
async def get(
    request: Request,
    backend: BaseDatabase = Depends(get_backend),
Contributor

I think this backend parameter should be a global setting instead, see related documentation: https://fastapi.tiangolo.com/advanced/settings/

Collaborator Author

Good point! Could we move to pydantic settings management (#168) in a separate pull request?

Contributor

Sure we can!

Collaborator Author

Updated this part. We now use a global DATABASE_CLIENT variable in statements.py, which retrieves the database backend from pydantic settings.
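
The general idea of a module-level client selected from configuration can be sketched as follows. This is a simplified illustration that reads a plain environment variable rather than pydantic settings. The variable name `EXAMPLE_DATABASE_BACKEND`, the `BACKENDS` mapping, and the empty backend classes are all assumptions made for the example.

```python
import os


class ESDatabase:
    """Placeholder for the Elasticsearch backend (assumption)."""


class MongoDatabase:
    """Placeholder for the MongoDB backend (assumption)."""


# Hypothetical mapping from a configured name to a backend class.
BACKENDS = {"es": ESDatabase, "mongo": MongoDatabase}


def get_database_client(environ=None):
    """Instantiate the backend named in the configuration."""
    environ = os.environ if environ is None else environ
    backend_name = environ.get("EXAMPLE_DATABASE_BACKEND", "es")
    try:
        backend_class = BACKENDS[backend_name]
    except KeyError:
        raise ValueError(f"Unknown database backend: {backend_name!r}") from None
    return backend_class()


# Created once at import time and shared by all route handlers.
DATABASE_CLIENT = get_database_client()
```

The trade-off is that route handlers no longer receive the backend as a dependency, so tests have to patch the module-level variable or the configuration instead of overriding a dependency.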

- success_count = ES_CLIENT.put(statements_dict.values(), ignore_errors=False)
- except BulkIndexError as exc:
+ success_count = backend.put(statements_dict.values(), ignore_errors=False)
+ except (BulkIndexError, BulkWriteError, BadFormatException) as exc:
Contributor

Maybe we should catch backend-specific errors in the backend class instead and raise our own exceptions that will be caught here. WDYT?

Collaborator Author

Good idea, thanks! I updated this part in the last commit.
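
The pattern discussed in this thread can be sketched like this, assuming a single wrapper exception named `BackendException`. `FakeBulkWriteError`, `FakeMongoDatabase`, and `post_statements` are hypothetical stand-ins so that the example runs without any database driver.

```python
class BackendException(Exception):
    """Raised by any backend when a read or write fails."""


class FakeBulkWriteError(Exception):
    """Stand-in for a driver-specific error (assumption)."""


class FakeMongoDatabase:
    """Hypothetical backend whose bulk write always fails."""

    def _bulk_write(self, statements: list) -> int:
        raise FakeBulkWriteError("duplicate key")

    def put(self, statements) -> int:
        try:
            return self._bulk_write(list(statements))
        except FakeBulkWriteError as error:
            # Translate the driver error into the shared exception type,
            # keeping the original error as the cause.
            raise BackendException(f"Failed to insert statements: {error}") from error


def post_statements(backend, statements) -> str:
    """Route-level code only needs to know about BackendException."""
    try:
        count = backend.put(statements)
    except BackendException as exc:
        return f"error: {exc}"
    return f"inserted {count}"
```

With this shape, adding a third backend does not require touching the `except` clause in the route.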

database = getattr(mongo_client, MONGO_TEST_DATABASE)
collection = getattr(database, MONGO_TEST_COLLECTION)
collection.insert_many(list(MongoDatabase.to_documents(statements)))
Contributor

Not a big fan of this trick 😉 Why not have a single insert_statements(client, statements) and test whether the client is a Mongo or ES client instance to switch between methods?

Collaborator Author

@SergioSim SergioSim Jul 13, 2022

Great idea! Indeed, this isn't ideal. I tried to pass the fixture as a parameter and stumbled on a pytest issue. However, it suggests a workaround using a parametrized fixture. I updated the tests with that approach.
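
For reference, the single helper proposed by the reviewer in this thread might look roughly like the sketch below. `FakeESClient` and `FakeMongoClient` are hypothetical stand-ins for the real client classes, and the document shapes are assumptions chosen to mirror the excerpts quoted in this review.

```python
class FakeESClient:
    """Stand-in for an Elasticsearch client (assumption)."""

    def __init__(self):
        self.indexed = []


class FakeMongoClient:
    """Stand-in for a MongoDB client (assumption)."""

    def __init__(self):
        self.inserted = []


def insert_statements(client, statements: list[dict]) -> None:
    """Insert statements using the method that matches the client type."""
    if isinstance(client, FakeESClient):
        client.indexed.extend(
            {"_id": statement["id"], "_source": statement} for statement in statements
        )
    elif isinstance(client, FakeMongoClient):
        client.inserted.extend({"_source": statement} for statement in statements)
    else:
        raise TypeError(f"Unsupported client type: {type(client).__name__}")
```

A test can then call `insert_statements(client, statements)` without knowing which backend it was parametrized with.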

Comment on lines 56 to 73
@pytest.mark.parametrize(
    "get_backend_override,insert_statements",
    [
        (get_backend_override_with_elasticsearch, insert_es_statements),
        (get_backend_override_with_mongodb, insert_mongo_statements),
    ],
)
def test_api_statements_get_statements(
    get_backend_override, insert_statements, auth_credentials, es, mongo
):
Contributor

Suggested change
- @pytest.mark.parametrize(
-     "get_backend_override,insert_statements",
-     [
-         (get_backend_override_with_elasticsearch, insert_es_statements),
-         (get_backend_override_with_mongodb, insert_mongo_statements),
-     ],
- )
- def test_api_statements_get_statements(
-     get_backend_override, insert_statements, auth_credentials, es, mongo
- ):
+ @pytest.mark.parametrize(
+     "get_backend_override,client",
+     [
+         (get_backend_override_with_elasticsearch, es),
+         (get_backend_override_with_mongodb, mongo),
+     ],
+ )
+ def test_api_statements_get_statements(
+     get_backend_override, client, auth_credentials,
+ ):

Contributor

@quitterie-lcs quitterie-lcs left a comment

# Backends

DATABASE_BACKENDS = [backend.value for backend in DatabaseBackends]
PARSERS = [parser.value for parser in Parsers]
Contributor

Could you place it elsewhere, as it is not a backend?

Collaborator Author

Good idea, thanks! This has been updated in PR #204.
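
For readers unfamiliar with the list comprehension quoted above, here is a minimal sketch of how a list of accepted values can be derived from an enum. The member names and values of `DatabaseBackends` and the helper `is_valid_backend` are assumptions made for the example.

```python
from enum import Enum


class DatabaseBackends(Enum):
    """Hypothetical members; the real enum may differ."""

    ES = "es"
    MONGO = "mongo"


# Iterating over an Enum class yields its members in definition order.
DATABASE_BACKENDS = [backend.value for backend in DatabaseBackends]


def is_valid_backend(name: str) -> bool:
    """Check a user-supplied backend name against the accepted values."""
    return name in DATABASE_BACKENDS
```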

envvar = f"{ENVVAR_PREFIX}_{backend_class.name}_{parameter.name}".upper()
value = config(envvar, None)
if value is not None:
Contributor

I am not sure whether what I am about to write is trivial. But are we sure that when a parameter has the value None in the configuration but is requested, it won't fail because it will be missing from the options?

Collaborator Author

Good point, thanks! This part has now changed: we use pydantic settings for configuration and made all backend init arguments optional (by using defaults set by the configuration).
One problem might arise if the user intends to pass None explicitly as a CLI option value to overwrite a default. However, I'm not sure whether this is permitted in our current implementation.
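
To make the concern in this thread concrete, here is a sketch of the lookup shown in the excerpt, rewritten against a plain mapping instead of the `config` helper. The prefix value, `ExampleBackend`, and its parameters are assumptions. Options whose environment variable is unset are simply left out, so the backend constructor needs defaults for them.

```python
import os

ENVVAR_PREFIX = "EXAMPLE"  # hypothetical prefix, not the project's value


def backend_options(backend_name, parameter_names, environ=None) -> dict:
    """Collect constructor options for a backend from environment variables."""
    environ = os.environ if environ is None else environ
    options = {}
    for parameter_name in parameter_names:
        envvar = f"{ENVVAR_PREFIX}_{backend_name}_{parameter_name}".upper()
        value = environ.get(envvar)
        if value is not None:
            # Unset variables are skipped rather than stored as None.
            options[parameter_name] = value
    return options


class ExampleBackend:
    """Hypothetical backend whose init arguments all have defaults."""

    def __init__(self, host: str = "localhost", port: str = "27017"):
        self.host = host
        self.port = port
```

Because every argument has a default, `ExampleBackend(**options)` works even when only some variables are set. A parameter without a default would raise `TypeError` in that situation, which is the failure the reviewer was asking about.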

@SergioSim SergioSim force-pushed the add-database-backend-switch branch 6 times, most recently from 0746099 to 254ca36 Compare August 17, 2022 10:06
@SergioSim SergioSim changed the title Add database backend switch ✨(cli) add database backend switch Aug 17, 2022
Although the host and port of the ralph server are already configurable
using environment variables, it might be useful to be able to pass
these values directly on the command line.
We want to be able to switch between different database backends
using CLI arguments or environment variables in the ralph LRS API.
@SergioSim SergioSim force-pushed the add-database-backend-switch branch from 254ca36 to 7718603 Compare August 25, 2022 12:52
Contributor

@jmaupetit jmaupetit left a comment

Now that Ralph supports two database backends for its LRS API,
to simplify catching various exceptions thrown by the backends
during querying, we choose to wrap them as BackendExceptions.
@SergioSim SergioSim force-pushed the add-database-backend-switch branch from 263e715 to 5cc2c43 Compare August 26, 2022 14:03
@SergioSim SergioSim merged commit f595adc into openfun:master Aug 26, 2022
@SergioSim SergioSim deleted the add-database-backend-switch branch August 26, 2022 14:10
3 participants