Workflow: Completion Service sample data / metadata #487

Merged
merged 14 commits into main from workflow-executions/completion_service_data
Mar 25, 2024

Conversation


@JeffreyThiessen commented Mar 21, 2024

What does this PR do and why?

Fixes #255

  • Modifies WorkflowExecutions::CompletionService to attach the following to each associated SamplesWorkflowExecution:
    • sample output files
    • sample output metadata
  • Refactors the CompletionService logic into smaller functions
  • Updates the samples_workflow_executions table to include a metadata column of type jsonb
  • Updates the SamplesWorkflowExecution model to have outputs as :attachable

Note: Saving attachments/metadata will be handled in issue #495.
Note: Setting up completion jobs and reworking workflow execution states will be handled in issue #437.
Note: This PR assumes the iridanext.output.json.gz file is always well formed and that the samples/samples_workflow_executions it references exist.
Integrity validation of iridanext.output.json.gz will be handled in issue #497.
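
For reviewers unfamiliar with the output file, here is a rough sketch of how per-sample files and metadata could be pulled out of a gzipped output JSON. It is not the code in this PR: the JSON layout shown in the comment, the helper names `read_output_json` and `per_sample_outputs`, and the sample PUIDs are all assumptions made for illustration.

```ruby
require "json"
require "zlib"
require "stringio"

# ASSUMED layout of the output JSON (not taken from this PR):
# {
#   "files":    { "global":  [{ "path": "..." }],
#                 "samples": { "<puid>": [{ "path": "..." }] } },
#   "metadata": { "samples": { "<puid>": { "key": "value" } } }
# }

# Decompress and parse gzipped JSON from any IO-like object.
def read_output_json(io)
  gz = Zlib::GzipReader.new(io)
  JSON.parse(gz.read)
ensure
  gz&.close
end

# Group output file paths and metadata by sample PUID.
def per_sample_outputs(output)
  files = output.dig("files", "samples") || {}
  metadata = output.dig("metadata", "samples") || {}

  (files.keys | metadata.keys).to_h do |puid|
    paths = files.fetch(puid, []).map { |entry| entry["path"] }
    [puid, { files: paths, metadata: metadata.fetch(puid, {}) }]
  end
end

# Hypothetical data mirroring the validation steps in this description.
example = {
  "files" => {
    "global" => [{ "path" => "summary.txt" }],
    "samples" => {
      "SAMPLE_PUID_1" => [{ "path" => "analysis1.txt" }, { "path" => "analysis2.txt" }],
      "SAMPLE_PUID_2" => [{ "path" => "analysis3.txt" }]
    }
  },
  "metadata" => {
    "samples" => {
      "SAMPLE_PUID_1" => { "number" => 1, "organism" => "an organism" },
      "SAMPLE_PUID_2" => { "number" => 2, "organism" => "a different organism" }
    }
  }
}

gzipped = StringIO.new(Zlib.gzip(JSON.generate(example)))
per_sample_outputs(read_output_json(gzipped)).each do |puid, data|
  puts "#{puid}: #{data[:files].join(', ')} #{data[:metadata]}"
end
```

Like this PR, the sketch does no validation; it only tolerates a missing `files` or `metadata` section by treating it as empty.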

How to set up and validate locally

Note: When testing with a ga4gh connection and a local DB for blob storage, the ga4gh_wes connection will not write files back to your DB, so you must manually place the files where they are expected. If you are using Azure for blob storage with ga4gh_wes hosted on Azure, this is handled for you, as in production.

  1. Run a workflow execution via the GUI
    • Upload paired-end files to a new sample and use that sample as your workflow input
  2. Wait for the state to become completed
  3. Get your WorkflowExecution in the Rails console

rails c
# Workflow Execution
wfe = WorkflowExecution.last
  4. Local DB step: get your output directory
# This is where the workflow execution data should be
wfe.workflow_params['--outdir']

# example:
# "/home/jthiessen/git/irida-next-core/storage/6e/zc/6ezc87o51ym3y4xtljonlucelzuz/output/"

You will need to create the output/ subdirectory and place the files in it.

Use the files in test/fixtures/files/blob_outputs/normal2 or make your own.
The first cd command should match the path above, without the trailing output/.

cd ~/git/irida-next-core/storage/aa/bb/genkey/
mkdir output
cd output
# note: iridanext.output.json.gz must be gzipped; the files in test/fixtures are not gzipped
cp ~/git/irida-next-core/test/fixtures/files/blob_outputs/normal2/iridanext.output.json.gz .
cp ~/git/irida-next-core/test/fixtures/files/blob_outputs/normal2/summary.txt .
cp ~/git/irida-next-core/test/fixtures/files/blob_outputs/normal2/analysis1.txt .
cp ~/git/irida-next-core/test/fixtures/files/blob_outputs/normal2/analysis2.txt .
cp ~/git/irida-next-core/test/fixtures/files/blob_outputs/normal2/analysis3.txt .
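
Because the JSON file has to be gzipped before it is placed in the output directory, a small helper along these lines could be used from the Rails console or plain Ruby. This is only a sketch: `gzip_into` is a hypothetical name, and the source path in the usage comment is a placeholder for wherever your uncompressed JSON lives.

```ruby
require "fileutils"
require "zlib"

# Write a gzipped copy of `src` into `outdir`, creating the directory if needed.
# Returns the path of the gzipped file.
def gzip_into(src, outdir, dest_name = "#{File.basename(src)}.gz")
  FileUtils.mkdir_p(outdir)
  dest = File.join(outdir, dest_name)
  Zlib::GzipWriter.open(dest) do |gz|
    gz.write(File.binread(src))
  end
  dest
end

# Usage from the Rails console (the source path is a placeholder, adjust to your setup):
#   gzip_into("path/to/uncompressed/output.json",
#             wfe.workflow_params['--outdir'],
#             "iridanext.output.json.gz")
```

The plain `cp` commands above still apply to the other output files, which do not need to be compressed.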
  5. Run the Completion Service
WorkflowExecutions::CompletionService.new(wfe).execute
  6. Check that the files are attached to the SamplesWorkflowExecutions on our WorkflowExecution
wfe.samples_workflow_executions.count
>2

wfe.samples_workflow_executions[0].outputs.count
>2

wfe.samples_workflow_executions[0].outputs[0].filename
>#<ActiveStorage::Filename:0x00007f9936da89b0 @filename="analysis1.txt">

wfe.samples_workflow_executions[0].outputs[1].filename
>#<ActiveStorage::Filename:0x00007f9936deff90 @filename="analysis2.txt">

wfe.samples_workflow_executions[1].outputs.count
>1

wfe.samples_workflow_executions[1].outputs[0].filename
>#<ActiveStorage::Filename:0x00007f9936df5490 @filename="analysis3.txt">
  7. Check that the metadata has been added to the SamplesWorkflowExecutions on our WorkflowExecution
wfe.samples_workflow_executions[0].metadata
>{"number"=>1, "organism"=>"an organism"}

wfe.samples_workflow_executions[1].metadata
>{"number"=>2, "organism"=>"a different organism"}
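
The checks above can also be collapsed into a single summary. The sketch below is an illustration, not part of this PR: `summarize_sample_outputs` is a hypothetical helper, and the two Structs are stand-ins so the snippet runs outside Rails. It relies only on each record responding to `outputs` and `metadata`, and each output responding to `filename`, as in the console session above.

```ruby
# Stand-ins for the ActiveRecord/ActiveStorage objects so the sketch runs outside Rails.
FakeOutput = Struct.new(:filename)
FakeSamplesWorkflowExecution = Struct.new(:outputs, :metadata)

# Collect output filenames and metadata for each record.
def summarize_sample_outputs(records)
  records.map do |record|
    {
      files: record.outputs.map { |output| output.filename.to_s },
      metadata: record.metadata
    }
  end
end

# Hypothetical records mirroring the expected console results above.
records = [
  FakeSamplesWorkflowExecution.new(
    [FakeOutput.new("analysis1.txt"), FakeOutput.new("analysis2.txt")],
    { "number" => 1, "organism" => "an organism" }
  ),
  FakeSamplesWorkflowExecution.new(
    [FakeOutput.new("analysis3.txt")],
    { "number" => 2, "organism" => "a different organism" }
  )
]

summarize_sample_outputs(records).each { |summary| puts summary.inspect }
```

In the Rails console, the same helper should accept `wfe.samples_workflow_executions` directly, assuming the attachments behave as shown in step 6.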

PR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

@JeffreyThiessen force-pushed the workflow-executions/completion_service_data branch 4 times, most recently from c1f573a to 23db32c, March 22, 2024 19:27
@JeffreyThiessen self-assigned this Mar 22, 2024
@JeffreyThiessen added the enhancement, database, and ready for review labels Mar 22, 2024
@ericenns left a comment

This is looking really great, just a few comments. I was able to test it and it succeeded; I just had to update the PUIDs in the JSON file to match the samples I created.

app/services/workflow_executions/completion_service.rb (outdated, resolved)
@JeffreyThiessen force-pushed the workflow-executions/completion_service_data branch from 62060bc to ae5f936, March 25, 2024 16:07

Simplecov Report

Covered: 92.06%
Threshold: 90%

@ericenns left a comment

Looks good to me!

@deepsidhu85 left a comment

Looks good to me!

@deepsidhu85 merged commit 1cbba1b into main Mar 25, 2024
2 checks passed
@ericenns deleted the workflow-executions/completion_service_data branch March 25, 2024 18:30
Linked issues that this pull request may close:

Workflow: CompletionService attach per sample metadata/results
3 participants