Move JSON-schema generation to Pydantic v2 #791

Closed
tcompa opened this issue Jul 16, 2024 · 0 comments · Fixed by #793
Labels: dependencies (Pull requests that update a dependency file), JSON Schemas

tcompa commented Jul 16, 2024

The move should still leave some flexibility to build JSON schemas for packages that still use Pydantic V1.

The args_schema_version can be determined in one of three ways:

  1. Explicitly set to pydantic_v1 -> use the "legacy" tools (the ones based on pydantic.v1 imports). Note that this is not guaranteed to work if you re-use fractal-tasks-core task-argument models, as they will soon be moved to Pydantic V2 (ref Move core-library and tasks to Pydantic V2 #790).
  2. Explicitly set to pydantic_v2 -> use the new tools.
  3. Unset -> read the Pydantic version at import time (during manifest generation), and pick one of the previous two cases accordingly.
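
The three cases above could be resolved along these lines. This is only a minimal sketch: the function name `resolve_args_schema_version` and its signature are hypothetical, and the installed version is passed in as a string (a caller could presumably pass `pydantic.VERSION`) so that the logic does not depend on which Pydantic is installed.

```python
from typing import Optional

# The two explicit values named in the list above.
ALLOWED_VERSIONS = ("pydantic_v1", "pydantic_v2")


def resolve_args_schema_version(
    requested: Optional[str],
    installed_pydantic_version: str,
) -> str:
    """Hypothetical helper: pick the schema-generation flavor to use."""
    if requested is not None:
        # Cases 1 and 2: an explicit choice always wins.
        if requested not in ALLOWED_VERSIONS:
            raise ValueError(
                f"args_schema_version must be one of {ALLOWED_VERSIONS}, "
                f"got {requested!r}"
            )
        return requested
    # Case 3: fall back on the major version of the installed Pydantic.
    major = int(installed_pydantic_version.split(".")[0])
    return "pydantic_v1" if major < 2 else "pydantic_v2"
```

An explicit value is returned unchanged even when it disagrees with the installed version, which matches the caveat in case 1 that the legacy path is not guaranteed to work.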

Note that I would keep this flexibility as a legacy feature, with the plan of deprecating it in favor of Pydantic-V2-only at some point in the future (when we assess that no relevant task package depends on Pydantic V1).


As we've known for a while, Pydantic V2 generates anyOf properties for arguments such as Optional[str] (ref #375). Within https://github.com/fractal-analytics-platform/fractal-tasks-core/tree/remove-anyof-from-pydantic-v2-schema, I am preparing several possible customizations of Pydantic V2 schema generation, with the goal of generating schemas that are identical or similar to the pydantic_v1 ones (see snippet below). One option is notably missing, namely the one where we postprocess the schema. These solutions work for simple examples, but I'll need to test them more systematically on the whole fractal-tasks-core (as of #790 - cc @lorenzocerrone) and/or on the other packages we use for testing JSON-schema generation.

from copy import deepcopy
from typing import Any, Callable

from devtools import debug  # assumed source of `debug`
from pydantic._internal import _generate_schema, _typing_extra
from pydantic._internal._config import ConfigWrapper
from pydantic.json_schema import GenerateJsonSchema, JsonSchemaValue
from pydantic_core.core_schema import WithDefaultSchema

_Schema = dict[str, Any]  # assumed alias; not defined in the original snippet


class GenerateJsonSchemaA(GenerateJsonSchema):
    """Variant A: emit the inner schema for nullable fields, without anyOf."""

    def nullable_schema(self, schema):
        null_schema = {"type": "null"}
        inner_json_schema = self.generate_inner(schema["schema"])
        if inner_json_schema == null_schema:
            return null_schema
        debug("A: Skip calling `get_flattened_anyof` method")
        return inner_json_schema


class GenerateJsonSchemaB(GenerateJsonSchemaA):
    """Variant B: like A, and also drop `default: None`."""

    def default_schema(self, schema: WithDefaultSchema) -> JsonSchemaValue:
        original_json_schema = super().default_schema(schema)
        new_json_schema = deepcopy(original_json_schema)
        # Check for the key first, so that a schema with no "default" entry
        # does not raise KeyError on `pop`.
        if "default" in new_json_schema and new_json_schema["default"] is None:
            debug("B: Pop None default")
            new_json_schema.pop("default")
        return new_json_schema


class GenerateJsonSchemaC(GenerateJsonSchema):
    """Variant C: turn `anyOf: [X, null]` into `type: [X, "null"]`."""

    def get_flattened_anyof(
        self, schemas: list[JsonSchemaValue]
    ) -> JsonSchemaValue:
        # Inspired by https://github.com/vitalik/django-ninja/issues/842#issuecomment-2059014537
        original_json_schema_value = super().get_flattened_anyof(schemas)
        members = original_json_schema_value.get("anyOf")
        debug("C", original_json_schema_value)
        if (
            members is not None
            and len(members) == 2
            and {"type": "null"} in members
            # Members without a "type" key (e.g. "$ref") cannot be merged.
            and all("type" in member for member in members)
        ):
            new_json_schema_value = {"type": [t["type"] for t in members]}
            debug("C", new_json_schema_value)
            return new_json_schema_value
        return original_json_schema_value


class GenerateJsonSchemaD(GenerateJsonSchema):
    """Variant D: drop the null member before flattening the anyOf."""

    def get_flattened_anyof(
        self, schemas: list[JsonSchemaValue]
    ) -> JsonSchemaValue:
        # Inspired by https://github.com/vitalik/django-ninja/issues/842#issuecomment-2059014537
        null_schema = {"type": "null"}
        if null_schema in schemas:
            debug("D drop null_schema before calling `get_flattened_anyof`")
            # Build a new list rather than mutating the caller's argument.
            schemas = [s for s in schemas if s != null_schema]
        return super().get_flattened_anyof(schemas)


class GenerateJsonSchemaE(GenerateJsonSchemaD):
    """Variant E: like D, and also drop `default: None`."""

    def default_schema(self, schema: WithDefaultSchema) -> JsonSchemaValue:
        json_schema = super().default_schema(schema)
        debug("E", json_schema)
        if "default" in json_schema and json_schema["default"] is None:
            debug("E: Pop None default")
            json_schema.pop("default")
        return json_schema


# Each assignment overrides the previous one, so the last line selects the
# variant in use; comment out lines to try a different one.
CustomGenerateJsonSchema = GenerateJsonSchema
CustomGenerateJsonSchema = GenerateJsonSchemaA
CustomGenerateJsonSchema = GenerateJsonSchemaB
CustomGenerateJsonSchema = GenerateJsonSchemaC
CustomGenerateJsonSchema = GenerateJsonSchemaD
CustomGenerateJsonSchema = GenerateJsonSchemaE


def _create_schema_for_function(function: Callable) -> _Schema:
    # Note: this relies on Pydantic-internal modules, which may change
    # between Pydantic releases.
    namespace = _typing_extra.add_module_globals(function, None)
    gen_core_schema = _generate_schema.GenerateSchema(
        ConfigWrapper(None), namespace
    )
    core_schema = gen_core_schema.generate_schema(function)
    clean_core_schema = gen_core_schema.clean_schema(core_schema)
    gen_json_schema = CustomGenerateJsonSchema()
    json_schema = gen_json_schema.generate(
        clean_core_schema, mode="validation"
    )
    return json_schema
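
For comparison, the postprocessing option (the one missing from the snippet) could look roughly like the sketch below. It works on the already-generated JSON schema as a plain dict, so it needs no Pydantic import. The function name `strip_nullable_anyof` is hypothetical, and the choice to collapse only the "one non-null member plus null" case and to drop a `None` default is an assumption about what "similar to the pydantic_v1 ones" means in practice.

```python
from typing import Any

NULL_SCHEMA = {"type": "null"}


def strip_nullable_anyof(schema: Any) -> Any:
    """Return a copy of `schema` where `anyOf: [X, null]` is replaced by X."""
    if isinstance(schema, list):
        return [strip_nullable_anyof(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    # Rebuild the dict recursively, so that the input is never mutated.
    new_schema = {
        key: strip_nullable_anyof(value) for key, value in schema.items()
    }
    members = new_schema.get("anyOf")
    if isinstance(members, list) and NULL_SCHEMA in members:
        non_null = [member for member in members if member != NULL_SCHEMA]
        if len(non_null) == 1:
            new_schema.pop("anyOf")
            # Sibling keys (e.g. "title") override keys of the inlined member.
            new_schema = {**non_null[0], **new_schema}
            # A `None` default no longer matches the non-nullable type.
            if "default" in new_schema and new_schema["default"] is None:
                new_schema.pop("default")
    return new_schema
```

Unions with more than one non-null member are left untouched. Because the walk is purely structural, a default value that happens to contain an `anyOf` list would also be rewritten, so a real implementation would likely need to restrict the recursion to schema-bearing keys.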
