Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support Optional samplesheet parameters #475

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

msto
Copy link

@msto msto commented Jul 19, 2024

Workflow parameters which are mapped to rows in the Latch registry must be declared as a list of dataclasses in the workflow function.

When such a parameter is part of a ForkBranch (e.g. when maintaining backwards compatibility for a workflow which previously supported input of a manually generated samplesheet), the parameter does not receive a value from Flyte and must have a default value defined in the workflow function signature. It would be preferable to set this default value to None rather than an empty list, to avoid providing a mutable argument default.

It would be helpful if the workflow parameter could be made Optional to support this use case.

This PR adds support for Optional samplesheet parameters.

@msto msto requested a review from ayushkamat as a code owner July 19, 2024 14:40
@msto msto marked this pull request as draft July 19, 2024 14:52
doc: add docs

doc: fix TypeError messages

fix: remove unguarded UnionType import

test: fix imports
@msto msto force-pushed the ms_optional-samplesheet-param branch from 64afbd4 to 21590f9 Compare July 19, 2024 14:57
@msto msto marked this pull request as ready for review July 19, 2024 14:58
@msto
Copy link
Author

msto commented Jul 19, 2024

Unit tests pass locally for me under both 3.9 and 3.10, and mypy identifies no issues on the modified lines.

Copy link
Contributor

@ayushkamat ayushkamat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! I appreciate that you took the time to look through PEP 585 and PEP 604 and provide some clarification there.

Overall this is much more complexity than necessary, given that we will likely not be reusing this code for anything else and that what is written is highly specified to this problem.

Heres how I would do it:

if sys.version_info >= (3, 10):
    from types import UnionType
else:

    class UnionType: ...

def is_valid_samplesheet_param(typ: Type) -> bool:
    origin = get_origin(typ)
    if origin is None:
        return False

    args = get_args(typ)

    if origin is Union or issubclass(origin, UnionType):
        if len(args) != 2:
            return False

        for arg in args:
            if arg is type(None) or is_valid_samplesheet_param(arg):
                continue

            return False

        return True

    return issubclass(origin, list) and len(args) == 1 and is_dataclass(args[0])

latch/resources/workflow.py Outdated Show resolved Hide resolved
latch/resources/workflow.py Outdated Show resolved Hide resolved
Comment on lines +157 to +159
# If the parameter did not have a type annotation, short-circuit and return False
if not _is_type_annotation(annotation):
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the wrong place for this check, i cant remember off the top of my head if we throw a semantic error already for not having a type annotation but we should do this outside of a samplesheet specific check (itll always be an error regardless of whether or not the specific param is a samplesheet)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the project does not enforce any type checking, I would prefer to include runtime checks on all argument types.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm happy to make this raise a TypeError instead, though in context I thought returning False was sensible.)

latch/resources/workflow.py Outdated Show resolved Hide resolved
args = get_args(dtype)

return (
not _is_optional_type(dtype)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

del

Copy link
Author

@msto msto Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some form of this is required.

The call to issubclass on the next line will raise a TypeError if it is passed an object that isn't a class, which typing.Union is not.

i.e. the following raises an error:

from typing import Optional, get_origin
origin = get_origin(Optional[T])

issubclass(origin)

I've replaced with a requirement that origin is a class.

latch/resources/workflow.py Outdated Show resolved Hide resolved
latch/resources/workflow.py Outdated Show resolved Hide resolved
Copy link
Contributor

@nh13 nh13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super helpful, thank-you @msto. @ayushkamat what do we need to do to get this merged and released?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants