Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make implicit list wrapping for singleton initial values universal #301

Closed
jakebeal opened this issue Sep 14, 2021 · 14 comments · Fixed by #382
Closed

Make implicit list wrapping for singleton initial values universal #301

jakebeal opened this issue Sep 14, 2021 · 14 comments · Fixed by #382
Milestone

Comments

@jakebeal
Copy link
Contributor

Some of our constructors allow list-valued attributes to be specified with either a list or a singleton value that is then implicitly wrapped into a list. Others do not.

For example, with the constructor for Component, the types argument has type Union[list[str], str], but with LocalSubComponent, the types argument has type list[str].

Can we make the implicit wrapping universal without it being too much of a pain?

@bbartley
Copy link
Contributor

👍

@tcmitchell
Copy link
Collaborator

🤮

I dislike this suggestion with some passion. I dislike the wrapping we already do, and I'd rather not do any wrapping at all. I am absolutely and without question opposed to doing more wrapping.

Why? Because it is counterintuitive to set Component.types to a string and then get back a list:

>>> c = sbol3.Component('http://example.com/issue301/c1', types=sbol3.SBO_DNA)
>>> c.types
['https://identifiers.org/SBO:0000251']

I think it is appropriate to treat these properties as the lists that they are:

>>> c.roles = [sbol3.SO_PROMOTER]
>>> c.roles
['https://identifiers.org/SO:0000167']

Hooray for TypeError:

>>> c.roles = sbol3.SO_PROMOTER
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/tmitchel/Projects/sd2/pySBOL3/sbol3/object.py", line 22, in __setattr__
    self.__dict__[name].set(value)
  File "/Users/tmitchel/Projects/sd2/pySBOL3/sbol3/property_base.py", line 177, in set
    raise TypeError(msg)
TypeError: roles requires one or more values packed in an iterable

If the authors of the SBOL 3 specification wanted these to be singletons (i.e. strings), they would have defined these properties to be of type string. They did not do that though. In their infinite wisdom, they defined these to unbounded properties such that they can have more than one value.

@jakebeal
Copy link
Contributor Author

One way or another, I think that it should be consistent, whether that means universal wrapping or universal not-wrapping.

@jakebeal jakebeal added this to the beta milestone Sep 23, 2021
@jakebeal
Copy link
Contributor Author

jakebeal commented Oct 7, 2021

Look at #324 for potential analog in solution.

@tcmitchell
Copy link
Collaborator

We talked about pushing the auto-wrapping down into the handling of initial_value in the various property constructors.

Care must be taken with strings because they are iterable and not really a collection for our purposes. There are a limited number of instances of ListProperty so this may be manageable. Typing may be a pain, we'll have to explore how we might make that work well.

@tcmitchell
Copy link
Collaborator

@bbartley @jakebeal should the typing of the constructor arguments indicate that they can be passed a singleton, or is it ok for them to declare that they want a list, and accept a singleton if a singleton is passed (in violation of the typing declaration).

For example, the generated_by property is a List of ReferencedObjects. It's new declaration will be:

generated_by: list[Union[Identified, str]] = None,

That indicates that the user can pass a list of items, each of which are either an Identified or a str. A user can still pass a single Identified or a single str and it will be marshaled into a list. But the declaration says it only wants a list.

Does that make sense, and is it ok with you?

@jakebeal
Copy link
Contributor Author

jakebeal commented Nov 2, 2021

I think this depends on whether we consider passing a singleton a "MAY" action or a "SHOULD NOT" action.

If it's a "MAY" action, then we'd want to have the typing indicate they can be passed as a singleton, so that one isn't being pushed away from it by warnings in the IDE.

If it's a "SHOULD" action, then we'd want to have the typing declare lists only, so that the user is chivvied away from the disliked behavior.

My personal preference is for "MAY", because there are a number of properties that allow a list, but for which the typical use case is a singleton.

@bbartley
Copy link
Contributor

bbartley commented Nov 2, 2021

If I'm reading between the lines, then I think the reason that @tcmitchell is asking is because the typing looks really hideous otherwise:

generated_by: Union[Union[Identified, str], list[Union[Identified, str]] = None

But here's an idea that might make it more palatable. Can we use Python's typing module to define a custom type? Such as:

generated_by: SingletonOrList[Union[Identified, str]] = None

@tcmitchell
Copy link
Collaborator

It's not really about the typing, although I agree that it can get ugly. I also agree that there are ways to make it more palatable.

My concern is really the think I objected to [originally(https://github.com//issues/301#issuecomment-924141962). If we spread this far and wide I think it will be confusing to users. An example is useful:

>>> import sbol3
>>> sbol3.set_namespace('https://github.com/SynBioDex/pySBOL3/issues/301')
>>> seq = sbol3.Sequence('seq1', elements='GCAT')
>>> comp = sbol3.Component('c1', types=sbol3.SBO_DNA, sequences=seq)
>>> comp.types == sbol3.SBO_DNA
False
>>> comp.types
['https://identifiers.org/SBO:0000251']
>>> comp.sequences == seq
False
>>> comp.sequences
['https://github.com/SynBioDex/pySBOL3/issues/301/seq1']

In the above example, types is set to a value, and sequences is set to a value. Yet when I compare the values afterwards, they do not match. Instead they are of a different type, a list instead of a singleton. I find that confusing behavior.

I understand that for some frequently used constructions, like Component.types it is desirable to advertise that a singleton is acceptable. I do not think we should spread this far and wide though. I'm fine with the code accepting these values, but advertising them is a bridge too far for me at the moment. I view this as SHOULD NOT. The programmer MAY set singleton values, and SHOULD NOT set singleton values.

If you can enumerate a set of values that should be advertised as allowing a singleton or a list, I would implement that. I am reluctant to advertise it everywhere. I think it will be too easy to abuse and to confuse.

>>> import sbol3
>>> sbol3.set_namespace('https://github.com/SynBioDex/pySBOL3/issues/301')
>>> seq = sbol3.Sequence('seq1', elements='GCAT')
>>> comp = sbol3.Component('c1', types=[sbol3.SBO_DNA], sequences=[seq])
>>> sbol3.SBO_DNA in comp.types
True
>>> comp.types == [sbol3.SBO_DNA]
True
>>> seq in comp.sequences
True
>>> 
>>> # Surprising!
>>> comp.sequences == [seq]
False
>>> 
>>> comp.sequences == [seq.identity]
True

@jakebeal
Copy link
Contributor Author

jakebeal commented Nov 8, 2021

I can see your point, @tcmitchell .

What do you think should be the distinction between arguments where we want to advertise flexibility, like Component.types and ones where we definitely do not want to advertise flexibility, like Component.features?

One reasonable possibility would be to just have me read through all of the properties and give a list of the ones I think we should advertise singletons for. I think that wouldn't actually be too big.

But maybe you have a more principled thought in mind?

@tcmitchell
Copy link
Collaborator

I don't have a more principled thought in mind.

And upon further reflection it also occurred to me that if it is allowed, it should be typed as allowed. In other words, make the typing reflect the code. Why hide a feature we have implemented?

Sigh.

@bbartley
Copy link
Contributor

bbartley commented Nov 8, 2021

Another approach which is perhaps more principled is to introduce some specialized classes at the library level (as opposed to the spec level). So for example, we might have a Component (singleton) vs MultisequenceComponent (list); and Interaction (singleton) vs MultitypeInteraction (list). I actually rather like this, except for the fact that it is a slippery slope that ends up at MultisequenceMultitypeComponent.

@jakebeal
Copy link
Contributor Author

jakebeal commented Nov 9, 2021

@bbartley I dislike your proposal.

@tcmitchell : I looked through and found the following multi-value properties are ones that I expect will have a singleton common case during construction:

  • Component.hasSequence
  • Component.role
  • Component.type
  • Component.hasModel
  • ExternallyDefined.type
  • Feature.role
  • Identified.wasDerivedFrom
  • Identified.wasGeneratedBy
  • Identified.hasMeasure
  • Interaction.type
  • LocalSubComponent.hasLocation
  • LocalSubComponent.type
  • Participation.role
  • SequenceFeature.hasLocation
  • SubComponent.hasLocation
  • SubComponent.sourceLocation
  • TopLevel.hasAttachment
  • VariableFeature.variantCollection
  • prov:Activity.type
  • prov:Association.hadRole
  • prov:Usage.hadRole
  • om:Measure.type

These multi-value properties, on the other hand, will generally not have singletons for their constructor calls:

  • Collection.member
  • CombinatorialDerivation.hasVariableFeature
  • Component.hasConstraint
  • Component.hasFeature
  • Component.hasInteraction
  • Component.hasConstraint
  • Experiment.member
  • Interaction.hasParticipation
  • Interface.input
  • Interface.nondirectional
  • Interface.output
  • VariableFeature.variantDerivation
  • VariableFeature.variantMeasure
  • VariableFeature.variant
  • prov:Activity.qualifiedUsage
  • prov:Activity.wasInformedBy
  • prov:Activity.qualifiedAssociation
  • om:Prefix.alternativeLabels
  • om:Prefix.alternativeSymbol
  • om:Unit.alternativeLabels
  • om:Unit.alternativeLabels

I would like to see singletons in the typing for the first list, as otherwise I will see many warnings in my code. I don't have a strong opinion about the second list.

@tcmitchell
Copy link
Collaborator

Note to self: the following remain to be done on branch 301-list-wrapping:

  • Participation.role
  • SequenceFeature.hasLocation
  • SubComponent.hasLocation
  • SubComponent.sourceLocation
  • VariableFeature.variantCollection
  • prov:Activity.type
  • prov:Association.hadRole
  • prov:Usage.hadRole
  • om:Measure.type

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants