Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add end to end tests for zarr streaming #7997

Open
MichaelBuessemeyer opened this issue Aug 15, 2024 · 3 comments · May be fixed by #8137
Open

Add end to end tests for zarr streaming #7997

MichaelBuessemeyer opened this issue Aug 15, 2024 · 3 comments · May be fixed by #8137
Assignees

Comments

@MichaelBuessemeyer
Copy link
Contributor

Detailed Description

Zarr streaming is a crucial feature for our pipelines. Thus, we should ensure that this feature is tested to avoid easily catchable mistakes breaking the streaming mechanism. Therefore, I suggest to write end to end tests testing all streaming routes, especially the ones that directly request an array

@frcroth
Copy link
Member

frcroth commented Oct 9, 2024

This seems to be pretty complicated. Some thoughts:

End to end tests

  • End to end tests only interact with the webknossos API at the moment. There are no tests for the datastore. As I understand there is just no datastore when running e2e tests.
  • There are no datasets in the test. The database contains datasets but no data. So there would need to be a test dataset included in the setup. Not yet sure how to accomplish this.

That is why I investigated using the backend tests.

  • The dataset problematic remains
  • If we want to test the actual routes, we need to put the controller under test. This requires mocking a lot of services which may be quite a lot of effort.

So I think if we want to test the logic, it would be best to use a backend test that tests the content of the controllers and not the controller methods themselves. The necessity of including a dataset remains.

@MichaelBuessemeyer
Copy link
Contributor Author

My thoughts:

There are no datasets in the test. The database contains datasets but no data. So there would need to be a test dataset included in the setup. Not yet sure how to accomplish this.

What about using a remote dataset for this? This could be configured to be served by the datastore and then be streamed as zarr for some kind of snapshot tests 🤔

@fm3
Copy link
Member

fm3 commented Oct 11, 2024

I think e2e tests would be nice here, in order to avoid all the mocking and to test what is actually ultimately used.

I’m not super familiar with how the e2e tests work, but it should be possible to launch a datastore module alongside the wk core backend module, and then adding it to the datastores postgres table. Of course, this may slow down the CI again, but if we want tests for this stuff, I think that’s what we need. Same for #8099

Including a small sample dataset in the repository seems acceptable to me. I’d rather avoid relying on remote data unless really necessary, because problems with the remote availability would make our tests flaky, and possibly slow. This is somewhat expected in unit tests like in the DataVaultTestSuite, but for outbound zarr streaming, I’d prefer to isolate that. What do you think?

Maybe @normanrz also has an opinion on these matters?

@frcroth frcroth linked a pull request Oct 21, 2024 that will close this issue
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants