Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testing utilities for downstream code #28

Open
jl-wynen opened this issue Aug 17, 2022 · 0 comments
Open

Testing utilities for downstream code #28

jl-wynen opened this issue Aug 17, 2022 · 0 comments

Comments

@jl-wynen
Copy link
Collaborator

As it stands, code built on top of pyscicat can only be tested either against an instance of scicat or by mocking requests. The former is too slow for development. The latter means that you lose the abstraction the client provides because users need to know exactly which requests are made.

For example, let's say we have some code that creates a dataset and datablock. But we don't know (or care) exactly how, in what order, or if it does anything else to scicat. We only care about the fact that at the end, there is a dataset and a datablock in scicat. Currently, we have to test it like this:

client = ScicatClient(...)
dataset = ...
datablock = ....
with requests_mock.Mocker() as mock_request:
    set_up_mock(mock_requests)
    a_function_that_uploads_a_dataset(client, dataset, datablock)
   # assert that mock_request.request_history contains the correct requests

A much better solution would be

client = FakeClient()
dataset = ...
datablock = ....
a_function_that_uploads_a_dataset(client, dataset, datablock)
assert dataset in client.__fake_datasets__
assert datablok in client.__fake_datablocks__

This is better because:

  • it requires no knowledge of requests and the API
  • it does not require parsing of URLs and request parameters
  • it still works if the order of operations in a_function_that_uploads_a_dataset changes or it uses different endpoints for some reason

With the changes in #27, FakeClient can be implemented along these lines:

class FakeClient(ScicatClient):
    def __init__(self, datasets):
        super().__init__(base_url='https://does-not-exist', token='123')
        self._datasets = datasets

    @property
    def __fake_datasets__(self):
        return self._datasets

    def _call_endpoint(
            self,
            cmd: str,
            endpoint: str,
            data: BaseModel = None,
            operation: str = "",
            allow_404=False,
        ) -> Optional[dict]:
        if cmd == 'get':
            base, pid = endpoint.split('/')
            if base == 'Datasets':
                return self.__fake_datasets__[pid].dict(exclude_none=True)
            raise NotImplementedError(f'GET with endpoint {endpoint} not supported')
        else:
            raise NotImplementedError(f'Command not supported: {cmd}')

It is not strictly necessary to support all endpoints in the fake. It would already be useful if it only supports the most common ones.

Downsides:

  • increased maintainance burden on pyscicat. The fake needs to be kept up to date as well as the actual client
  • the fake needs to be tested which might be tricky

I think that we need something like this in our code. Otherwise, we cannot really test our usage of pyscicat. It is an option that we develop it entirely on our side. But it would be beneficial to the larger community if the fake was provided by pyscicat itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant