Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support both file_secret and secret_id based paths #29

Merged
merged 16 commits into from
Oct 31, 2023

Conversation

mephenor
Copy link
Member

@mephenor mephenor commented Oct 26, 2023

Old code paths and related classes dealing with output_metadata including the plain text file secret are now prefixed with legacy.
Refactored s3_upload into a subpackage to keep some semblance of sanity.
Added missing object storage cleanup when checksum validation fails after re-downloading file for validation.
Added secret_ingest URL and public key options to support file secret exchange directly after upload.
Token for communication on that path is read from the same location as for the file ingest functionality before, i.e. the same token is used for both new FIS endpoints.

Please tell me if there are some tests missing that should be included, lost quite a bit of time here and probably missed one or two smaller issues.

As this repo does not track the microservice template, we probably should also have a refactoring backlog entry to move this to ruff, pydantic v2 and our pip lock setup.

@coveralls
Copy link

coveralls commented Oct 26, 2023

Pull Request Test Coverage Report for Build 6696361678

  • 409 of 493 (82.96%) changed or added relevant lines in 13 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-1.1%) to 74.723%

Changes Missing Coverage Covered Lines Changed/Added Lines %
ghga_datasteward_kit/batch_s3_upload.py 1 2 50.0%
ghga_datasteward_kit/models.py 55 56 98.21%
ghga_datasteward_kit/s3_upload/config.py 22 23 95.65%
ghga_datasteward_kit/cli/file.py 4 6 66.67%
ghga_datasteward_kit/file_ingest.py 13 15 86.67%
ghga_datasteward_kit/s3_upload/file_decryption.py 42 45 93.33%
ghga_datasteward_kit/s3_upload/downloader.py 36 40 90.0%
ghga_datasteward_kit/s3_upload/file_encryption.py 44 49 89.8%
ghga_datasteward_kit/s3_upload/uploader.py 54 59 91.53%
ghga_datasteward_kit/s3_upload/entrypoint.py 48 74 64.86%
Files with Coverage Reduction New Missed Lines %
ghga_datasteward_kit/batch_s3_upload.py 1 28.21%
Totals Coverage Status
Change from base Build 5833437939: -1.1%
Covered Lines: 674
Relevant Lines: 902

💛 - Coveralls

Cito
Cito previously approved these changes Oct 27, 2023
Copy link
Member

@Cito Cito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could not really check in depth.

ghga_datasteward_kit/s3_upload/config.py Show resolved Hide resolved
ghga_datasteward_kit/s3_upload/downloader.py Outdated Show resolved Hide resolved
ghga_datasteward_kit/s3_upload/entrypoint.py Outdated Show resolved Hide resolved
Cito
Cito previously approved these changes Oct 27, 2023
Copy link
Contributor

@KerstenBreuer KerstenBreuer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this was a massive rewrite. I could just do a shallow review but did not found major issues. However, would need much more time to go into more depth. Could you maybe help by highlighting which parts were rewritten and which parts mostly moved into different files?

Regarding tests, since this is becoming a quite massive codebase that is difficult to keep an overview (and to help answer your question about whether everything was tested correctly), maybe it would be good to try to approach 100% coverage. Some parts really do not need coverage, you can ignore them by configuring the coverage tools.

What do you think, would 100% coverage give you a better feel?

ghga_datasteward_kit/__init__.py Outdated Show resolved Hide resolved
ghga_datasteward_kit/cli/file.py Outdated Show resolved Hide resolved
ghga_datasteward_kit/file_ingest.py Outdated Show resolved Hide resolved
ghga_datasteward_kit/models.py Outdated Show resolved Hide resolved
ghga_datasteward_kit/s3_upload/entrypoint.py Outdated Show resolved Hide resolved
ghga_datasteward_kit/s3_upload/entrypoint.py Show resolved Hide resolved
ghga_datasteward_kit/s3_upload/entrypoint.py Outdated Show resolved Hide resolved
ghga_datasteward_kit/s3_upload/utils.py Outdated Show resolved Hide resolved
ghga_datasteward_kit/s3_upload/utils.py Show resolved Hide resolved
@mephenor
Copy link
Member Author

mephenor commented Oct 30, 2023

All requested changes have been addressed in the lastest commits.
Instead of dealing directly with object storage cleanup a StorageCleaner context manager is passed through.
Two error cases that previously did not lead to the deletion of an unusable uploaded file are now also covered: Failures during file part download and writing the corresponding output file.

@mephenor mephenor removed the request for review from TheByronHimes October 30, 2023 16:25
Copy link
Contributor

@KerstenBreuer KerstenBreuer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice thank you

@mephenor mephenor merged commit 626534e into main Oct 31, 2023
3 checks passed
@mephenor mephenor deleted the secret_handling_changes_GSI-360 branch October 31, 2023 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants