Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functions to check struct and array missingness #738

Open
wants to merge 26 commits into
base: main
Choose a base branch
from

Conversation

klaricch
Copy link
Contributor

No description provided.

@klaricch klaricch requested a review from matren395 November 25, 2024 15:19
@klaricch klaricch requested a review from ch-kr December 5, 2024 21:53
@klaricch klaricch removed the request for review from matren395 December 5, 2024 21:53
Copy link
Contributor

@ch-kr ch-kr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few questions. could you also add tests for the new functions?

gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
gnomad/assessment/validity_checks.py Show resolved Hide resolved
gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
struct_expr: hl.expr.StructExpression, prefix: str = ""
) -> dict:
"""
Recursively check the fraction of missing values of all fields within a StructExpression.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this faster than running .flatten() and then checking for missingness on the flattened struct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also just noticed this: https://hail.is/docs/0.2/hail.expr.StructExpression.html#hail.expr.StructExpression.summarize -- do you know if this is too slow to run at scale? I guess it might be annoying to read through too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah the summarize adds a lot of other things that we don't need

and not sure if this is any faster, i could just flatten them as well and then check missingness

@klaricch klaricch requested a review from ch-kr December 9, 2024 18:42
Copy link
Contributor

@ch-kr ch-kr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a question and a few more small suggestions. also reminder to add tests!

gnomad/assessment/validity_checks.py Show resolved Hide resolved
gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
gnomad/assessment/validity_checks.py Show resolved Hide resolved
gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
@klaricch klaricch requested a review from ch-kr December 11, 2024 21:51
@klaricch
Copy link
Contributor Author

will start adding tests

Copy link
Contributor

@ch-kr ch-kr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a small question -- thanks for working on the tests!

gnomad/assessment/validity_checks.py Show resolved Hide resolved
@klaricch klaricch requested a review from ch-kr December 17, 2024 17:29
Copy link
Contributor

@ch-kr ch-kr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for adding tests! just a few small suggestions and questions

gnomad/assessment/validity_checks.py Show resolved Hide resolved
gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
gnomad/assessment/validity_checks.py Outdated Show resolved Hide resolved
tests/assessment/test_validity_checks.py Show resolved Hide resolved
tests/assessment/test_validity_checks.py Outdated Show resolved Hide resolved
gnomad/assessment/validity_checks.py Show resolved Hide resolved
tests/assessment/test_validity_checks.py Outdated Show resolved Hide resolved
tests/assessment/test_validity_checks.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants