Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Celery job processing, routes, & BatchAggregation lib #783

Merged
merged 26 commits into from
May 28, 2024

Conversation

zwolf
Copy link
Member

@zwolf zwolf commented Mar 12, 2024

This PR adds Celery/Redis based background job processing to aggregation. It includes routes to begin aggregation when a request is made from Panoptes directly and their specs. See cards on https://github.com/orgs/zooniverse/projects/44/ for details.

It also adds a BatchAggregation lib that interacts with Panoptes to begin to manage the full workflow. This code works when dropped into a console and as a background Celery job, but has no tests yet.

I'd love some feedback on what's here so far. I'm gonna message you directly, @CKrawczyk, and I'm opening this as a draft PR pointed at a branch off of master so that it can be tested in isolation. Also planning on a separate deploy to ensure things work as expected but I removed it from this PR for ease of viewing.

Copy link

@hound hound bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some files could not be reviewed due to errors:

./panoptes_aggregation/batch_aggregation.py:11: [F401] 'panoptes_client.panop...
./panoptes_aggregation/batch_aggregation.py:11: [F401] 'panoptes_client.panoptes.PanoptesAPIException' imported but unused
./panoptes_aggregation/batch_aggregation.py:20: [E302] expected 2 blank lines, found 1
./panoptes_aggregation/batch_aggregation.py:23: [F841] local variable 'exports' is assigned to but never used
./panoptes_aggregation/batch_aggregation.py:24: [F841] local variable 'wf_df' is assigned to but never used
./panoptes_aggregation/batch_aggregation.py:31: [F841] local variable 'reduced_data' is assigned to but never used
./panoptes_aggregation/batch_aggregation.py:33: [E302] expected 2 blank lines, found 1
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:2: [...
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:2: [F401] 'panoptes_aggregation.batch_aggregation.run_aggregation' imported but unused

Copy link
Collaborator

@CKrawczyk CKrawczyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just left my initial comments on the PR draft. I have yet to test it all locally.

Comment on lines 6 to 8
def test_save_exports(self):
# Test that Panoptes calls are made and files are saved
assert 1 == 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will likely want to patch/mock the Paonoptes class and check the expected export calls are made when the function is run.

Comment on lines 10 to 15
def test_process_wf_export(self):
# Test that:
# the wf export is parsed
# the version instance vars are set
# dataframe is retuned
assert 1 == 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A patch can be used to set a fixed input file stream and the test can check that it was parsed as expected. This can either be defined in the test file (similar to the extractor and reducer tests), or an example .csv file can be placed in the test folder and read in (might be easier to debug and read in the long run).

panoptes_aggregation/tests/router_tests/test_routes.py Outdated Show resolved Hide resolved
panoptes_aggregation/batch_aggregation.py Show resolved Hide resolved
panoptes_aggregation/batch_aggregation.py Show resolved Hide resolved
Copy link

@hound hound bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some files could not be reviewed due to errors:

./panoptes_aggregation/batch_aggregation.py:11: [F401] 'panoptes_client.panop...
./panoptes_aggregation/batch_aggregation.py:11: [F401] 'panoptes_client.panoptes.PanoptesAPIException' imported but unused
./panoptes_aggregation/batch_aggregation.py:20: [E302] expected 2 blank lines, found 1
./panoptes_aggregation/batch_aggregation.py:23: [F841] local variable 'exports' is assigned to but never used
./panoptes_aggregation/batch_aggregation.py:24: [F841] local variable 'wf_df' is assigned to but never used
./panoptes_aggregation/batch_aggregation.py:31: [F841] local variable 'reduced_data' is assigned to but never used
./panoptes_aggregation/batch_aggregation.py:33: [E302] expected 2 blank lines, found 1
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:2: [...
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:2: [F401] 'unittest.mock.Mock' imported but unused
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:7: [E302] expected 2 blank lines, found 1
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:28: [E201] whitespace after '['
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:28: [E202] whitespace before ']'
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:28: [E202] whitespace before '}'

Copy link

@hound hound bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some files could not be reviewed due to errors:

./panoptes_aggregation/batch_aggregation.py:22: [E302] expected 2 blank lines...
./panoptes_aggregation/batch_aggregation.py:22: [E302] expected 2 blank lines, found 1
./panoptes_aggregation/batch_aggregation.py:27: [F841] local variable 'wf_df' is assigned to but never used
./panoptes_aggregation/batch_aggregation.py:54: [E302] expected 2 blank lines, found 1
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:2: [...
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:2: [F401] 'unittest.mock.Mock' imported but unused
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:10: [E302] expected 2 blank lines, found 1
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:19: [E201] whitespace after '{'
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:19: [E202] whitespace before '}'
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:52: [E201] whitespace after '['
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:52: [E202] whitespace before ']'
./panoptes_aggregation/tests/batch_aggregation/test_batch_aggregation.py:52: [E202] whitespace before '}'

@zwolf zwolf merged commit 71528fe into batch-aggregation May 28, 2024
8 checks passed
@zwolf zwolf mentioned this pull request May 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants