add discriminator property support #214

eli-bl · 2024-10-28T19:49:40Z

See description in my PR against the upstream repo: openapi-generators#1154

Why we need this fix: Both the v3 and v2 API specs make heavy use of discriminators for polymorphic types.

Without discriminator support, polymorphism will only work for types that have mutually exclusive validation. That's because, without a discriminator, it'll just try deserializing as each of the possible types in the order that they're declared, until deserialization succeeds for one of them. For instance, something like this does work without a discriminator:

components:
  schemas:
    TypeA:
      properties:
        modelType:
          type: string
        prop1:
          type: string
      required: ["prop1"]
    TypeB:
      properties:
        modelType:
          type: string
        prop2:
          type: string
      required: ["prop2"]
    UnionType:
      oneOf:
        - $ref: "#/components/schemas/TypeA"
        - $ref: "#/components/schemas/TypeB"

—because if you try to deserialize {"modelType": "TypeA", "prop2": "x"} as UnionType, it will first try to deserialize it as TypeA, which will fail because it doesn't have a prop1; then it'll try to deserialize it as TypeB, which will pass. But it would not work if you had this:

components:
  schemas:
    TypeA:
      properties:
        modelType:
          type: string
        everyoneHasThisProp:
          type: string
      required: ["everyoneHasThisProp"]
    TypeB:
      properties:
        modelType:
          type: string
        everyoneHasThisProp:
          type: string
      required: ["everyoneHasThisProp"]
    UnionType:
      oneOf:
        - $ref: "#/components/schemas/TypeA"
        - $ref: "#/components/schemas/TypeB"

—because then the same data would pass deserialization for either of those types and so you would always get the first one, TypeA. The correct way to do something like that with a discriminator is:

components:
  schemas:
    TypeA:
      properties:
        modelType:
          type: string
        everyoneHasThisProp:
          type: string
      required: ["everyoneHasThisProp"]
    TypeB:
      properties:
        modelType:
          type: string
        everyoneHasThisProp:
          type: string
      required: ["everyoneHasThisProp"]
    UnionType:
      oneOf:
        - $ref: "#/components/schemas/TypeA"
        - $ref: "#/components/schemas/TypeB"
      discriminator:
        property: "modelType"

I tested this by running the generator locally with this branch against the current v3 API draft spec from aurelia. The spec currently has a lot of invalid uses of discriminator which cause appropriate complaints (like "All schema variants must be objects when using a discriminator"), but it also has a valid one in the type Benchling.CreateEntryContentInput, which is used by the endpoint Benchling.Entry.Update. I verified that the generated code for that endpoint now has the same kind of logic in it that's in the example file model_with_discriminated_union.py in this PR.

eli-bl · 2024-10-28T20:48:00Z

end_to_end_tests/baseline_openapi_3.0.json

@@ -2841,15 +2841,17 @@
          "modelType": {
            "type": "string"
          }
-        }
+        },
+        "required": ["modelType"]


This is necessary in order for the discriminator in the spec to really be valid: a discriminator property must be a required property in all of the variant schemas. My current implementation wouldn't actually catch a mistake like this, but I figured it was best to have the test spec be valid.

eli-bl · 2024-10-28T20:49:00Z

end_to_end_tests/golden-record/my_test_api_client/models/a_discriminated_union_type_1.py

 T = TypeVar("T", bound="ADiscriminatedUnionType1")


 @_attrs_define
 class ADiscriminatedUnionType1:
    """
    Attributes:
-        model_type (Union[Unset, str]):


The changes in this generated file and the next one are because I changed modelType to be required in the test spec.

eli-bl · 2024-10-28T20:54:36Z

end_to_end_tests/golden-record/my_test_api_client/models/model_with_discriminated_union.py

+                        raise TypeError()
+                    componentsschemas_a_discriminated_union_type_1 = ADiscriminatedUnionType1.from_dict(data)
+
+                    return componentsschemas_a_discriminated_union_type_1


Lines 68-72 here are boilerplate that comes from the generator's template macro for "call the deserializer for some type". The weird function name on line 67 is due to how the generator makes unique Python names for stuff in its internal data structure.

eli-bl · 2024-10-28T20:54:58Z

end_to_end_tests/golden-record/my_test_api_client/models/model_with_discriminated_union.py

+                    return cast(
+                        Union["ADiscriminatedUnionType1", "ADiscriminatedUnionType2", None, Unset], _parse_fn(data)
+                    )
+            raise TypeError("unrecognized value for property modelType")


Lines 81-89 are from my new custom template logic for discriminators.

eli-bl · 2024-10-28T20:57:28Z

openapi_python_client/parser/properties/union.py

@@ -67,16 +75,7 @@ def build(
                return PropertyError(detail=f"Invalid property in union {name}", data=sub_prop_data), schemas
            sub_properties.append(sub_prop)

-        def flatten_union_properties(sub_properties: list[PropertyProtocol]) -> list[PropertyProtocol]:


Moved this helper function to line 204 (it doesn't need to be an inner function because it doesn't rely on any closure variables).

eli-bl · 2024-10-28T21:07:03Z

openapi_python_client/parser/properties/union.py

-            return flattened
-
-        sub_properties = flatten_union_properties(sub_properties)
+        sub_properties, discriminators_list = _flatten_union_properties(sub_properties)


This flattening logic is a pre-existing behavior of the generator, causing it to treat this:

oneOf: - $ref: "#/components/schemas/TypeA" oneOf: - $ref: "#/components/schemas/TypeB" - $ref: "#/components/schemas/TypeC"

—as exactly equivalent to this:

oneOf: - $ref: "#/components/schemas/TypeA" - $ref: "#/components/schemas/TypeB" - $ref: "#/components/schemas/TypeC"

But now we have to also keep track of any discriminator definitions that might have existed in a nested union. For instance, it's totally valid to have something like this—

UnionType: oneOf: - oneOf: - $ref: "#/components/schemas/TypeA" - $ref: "#/components/schemas/TypeB" discriminator: property: prop1 - oneOf: - $ref: "#/components/schemas/TypeC" - $ref: "#/components/schemas/TypeD" discriminator: property: prop2

In this example, TypeA and TypeB both have the property prop1, whose value will be "TypeA" or "TypeB"; but with TypeC and TypeD, the relevant property is prop2. The discriminators_list returned by _flatten_union_properties in this case looks like: `[DiscriminatorDefinition(property_name="prop1", value_to_model_map={"TypeA": , "TypeB": }),DiscriminatorDefinition(property_name="prop1", value_to_model_map={"TypeA": , "TypeB": }), DiscriminatorDefinition(property_name="prop2", value_to_model_map={"TypeC": , "TypeD": })]. The Jinja template logic will use this to build an appropriate piece of Python deserializer code where first it checks for prop1, then for prop2, etc.

eli-bl · 2024-10-28T21:09:24Z

openapi_python_client/parser/properties/union.py

@@ -10,7 +10,14 @@
 from ...utils import PythonIdentifier
 from ..errors import ParseError, PropertyError
 from .protocol import PropertyProtocol, Value
-from .schemas import Schemas
+from .schemas import Schemas, get_reference_simple_name, parse_reference_path
+


This file defines UnionProperty, which is the generator's internal class for representing any kind of anyOf or oneOf situation.

When parsing a schema, the generator will call UnionProperty.build (line 38), which returns a tuple of 1. either a valid UnionProperty or an error, and 2. an updated version of the Schemas object that keeps track of all parsed schemas so far. This is a standard pattern used with all of the generator's ___Property types. In general, error handling is done via this pattern instead of using exceptions.

eli-bl · 2024-10-28T21:11:23Z

openapi_python_client/parser/properties/union.py

+        if discriminators_list:
+            if error := _validate_discriminators(discriminators_list):
+                return error, schemas
+            prop = evolve(prop, discriminators=discriminators_list)


Using attrs.evolve is standard in this codebase; some classes do have mutable fields, but in general mutation is avoided.

eli-bl · 2024-10-28T21:22:40Z

openapi_python_client/parser/properties/union.py

+        # This is needed because, when we built the union list, $refs were changed into a copy of
+        # the type they referred to, without preserving the original name. We need to know that
+        # every type in the discriminator is a $ref to a top-level type and we need its name.
+        for prop in schemas.classes_by_reference.values():


What's going on here:

The generator defines schemas.classes_by_reference as a map that keeps track of all named schemas— that is, ones that are defined in components.schemas rather than being anonymous inline schemas inside something else. So, for this spec—

components: schemas: ModelA: # ... ModelB: # ... UnionAB: oneOf: - $ref: "#/components/schemas/ModelA" - $ref: "#/components/schemas/ModelB"

—the map will end up looking like this:

{ "/components/schemas/ModelA": model_property_instance_for_model_a, "/components/schemas/ModelB": model_property_instance_for_model_b, "/components/schemas/UnionAB": union_property_instance_for_union_ab, }

It uses that map when it's resolving the $refs inside UnionAB. However, when it does that, the ModelProperty instances that are inside of the UnionProperty are not the same instances as model_property_instance_for_model_a, etc. They are newly parsed ones made from a copy of the YAML data of the original declaration. The only way to find the original instance is what I'm doing here: check the class_info attribute, which describes the Python class (ModelA or whatever) that will be used for this schema in the generated code.

eli-bl · 2024-10-28T21:25:27Z

openapi_python_client/templates/property_templates/union_property.py.jinja

@@ -1,3 +1,36 @@
+{% macro construct_inner_property(inner_property) %}


This macro is factored out of old lines 27-32.

eli-bl · 2024-10-28T21:28:08Z

openapi_python_client/templates/property_templates/union_property.py.jinja

-            raise TypeError()
-        {{ inner_template.construct(inner_property, "data") | indent(8) }}
-        return {{ inner_property.python_name }}
+{{ construct_inner_property(inner_property) | indent(8, True) }}


Even though you can see the logic here is slightly different from old lines 27-32 (it doesn't have the if/endif that are on old lines 27 and 30), I'm using the same macro for convenience. That works fine because we already know at this point, due to the test on old line 18, that inner_template.check_type_for_construct is true.

eli-bl · 2024-10-28T21:29:59Z

openapi_python_client/templates/property_templates/union_property.py.jinja

+
+{% macro construct_discriminator_lookup(property) %}
+{% set _discriminator_properties = [] -%}
+{% for discriminator in property.discriminators %}


As a result of the new logic in union.py, `property.discriminators will look like this:

[ { "property_name": <name of the discriminator property>, "value_to_model_map": { "somePossibleValue": <ModelProperty instance corresponding to that value> } } ]

There can be multiple discriminators due to the nested-unions situation I described in union.py.

damola-benchling

This looks good! Especially this final iteration - much easier to follow, for me.

The, perhaps not very useful commentary I have is that, this code is definitely complex.
I know this is because the spec itself is implicitly complex with how discriminators are defined/supposed to be handled 😞 .
I don't have a recommendation on how to represent this better especially because it seems to me that this implementation pretty much follows the conventions of the rest of the code base and I think that's about as good as it will be.

damola-benchling · 2024-11-06T21:51:16Z

openapi_python_client/parser/properties/union.py

+    if data.mapping:
+        for discriminator_value, model_ref in data.mapping.items():
+            if "/" in model_ref:
+                ref_path = parse_reference_path(model_ref)


this is a nit comment: while I understand the need for this logic of differentiating between the name of a model vs the path+name.... something about this feels brittle.
I wonder if this is a general thing that can be made into a utility (this is mostly vibes tho)

As far as I know, there is no other area of OpenAPI/JSON Schema where you can reference a schema by its simple name instead of its path. It's not a general thing, and there is no syntactic way to disambiguate them except by checking for a slash.

damola-benchling · 2024-11-06T22:23:33Z

openapi_python_client/parser/properties/union.py

@@ -24,6 +53,7 @@ class UnionProperty(PropertyProtocol):
    description: str | None
    example: str | None
    inner_properties: list[PropertyProtocol]
+    discriminators: list[DiscriminatorDefinition] | None = None


Just want to confirm that this means that the discriminator logic defined here will automatically work for every where polymorphism is supported in a spec including in the request/response bodies?

Yes. There isn't separate deserialization logic for request vs. response bodies; they just call from_dict on the model class.

eli-bl marked this pull request as ready for review October 28, 2024 20:47

eli-bl requested a review from damola-benchling October 28, 2024 20:47

eli-bl commented Oct 28, 2024

View reviewed changes

eli-bl force-pushed the discriminators branch from caa81a7 to 2136c14 Compare October 28, 2024 20:56

eli-bl commented Oct 28, 2024

View reviewed changes

eli-bl force-pushed the discriminators branch from 2136c14 to 1956f68 Compare October 28, 2024 21:30

add discriminator property support

0496ff8

eli-bl force-pushed the discriminators branch from 1956f68 to 0496ff8 Compare October 28, 2024 21:33

eli-bl added 2 commits October 30, 2024 15:25

simply the discriminator logic a bit

b75adda

remove new category of tests for now

2941451

eli-bl force-pushed the discriminators branch from 9e71936 to 2941451 Compare October 31, 2024 17:18

eli-bl added 2 commits November 5, 2024 13:29

lint

d189f77

lint

de1ddf3

eli-bl force-pushed the discriminators branch from 344ec97 to de1ddf3 Compare November 5, 2024 23:38

handle a case where there's multiple values mapped to same type

9071de0

damola-benchling approved these changes Nov 6, 2024

View reviewed changes

eli-bl merged commit d76c0fc into 2.x Nov 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add discriminator property support #214

add discriminator property support #214

eli-bl commented Oct 28, 2024 •

edited

Loading

eli-bl Oct 28, 2024

eli-bl Oct 28, 2024

eli-bl Oct 28, 2024

eli-bl Oct 28, 2024

eli-bl Oct 28, 2024

eli-bl Oct 28, 2024

eli-bl Oct 28, 2024 •

edited

Loading

eli-bl Oct 28, 2024

eli-bl Oct 28, 2024

eli-bl Oct 28, 2024

eli-bl Oct 28, 2024

eli-bl Oct 28, 2024

damola-benchling left a comment

damola-benchling Nov 6, 2024

eli-bl Nov 6, 2024

damola-benchling Nov 6, 2024

eli-bl Nov 6, 2024

		@@ -1,3 +1,36 @@
		{% macro construct_inner_property(inner_property) %}

add discriminator property support #214

add discriminator property support #214

Conversation

eli-bl commented Oct 28, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

eli-bl Oct 28, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

damola-benchling left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

eli-bl commented Oct 28, 2024 •

edited

Loading

eli-bl Oct 28, 2024 •

edited

Loading