Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polymorphic subtype deduction from available fields #2813

Merged
merged 5 commits into from
Aug 21, 2020
Merged

Polymorphic subtype deduction from available fields #2813

merged 5 commits into from
Aug 21, 2020

Conversation

drekbour
Copy link
Contributor

@drekbour drekbour commented Aug 9, 2020

AsDeductionTypeDeserializer implements inferential deduction of a subtype from the fields. As a POC not intended for merging, tere's an amount of cut'n'paste code etc but I thought a functional PR would be the best basis for discussion of something I write out of interest.

It works by fingerprinting the full set of possible fields of each subtype on registration. On deserialisation, available fields are compared to those fingerprints until only one candidate remains. It specifically only looks at immediate-child field names as is immediate-child values are covered by existing mechanisms and deeper analysis is a much more imposing ML task not really part of Jackson's remit.

  // @JsonTypeResolver(PolymorphicDeductionTypeResolver.class) // Required for POC
  // @JsonTypeInfo(use = JsonTypeInfo.Id.CLASS) // Required for POC
  @JsonTypeInfo(use = DEDUCTION) // Intended usage
  @JsonSubTypes( {@Type(Group.class), @Type(Individual.class)})
  public class Identity {
    public String name;
  }
  public class Group extends Identity {
    public List<Identity> members;
  }
  public class Individual extends Identity {
    public String surname;
  }

turns into fingerprints (stored as a BitSet)

  • Group -> (name, members)
  • Individual -> (name, surname)

Which will deserialise the following without it requiring any type markers

{
  "name": "Xmas Crew",
  "members": [
    {
      "name": "Santa",
      "surname": "Claus"
    },
    {
      "name": "Elves",
      "members": []
    },
    {
      "name": "Reindeers",
      "members": [
        {
          "name": "Rudolph",
          "surname": "Red Nosed"
        }
      ]
    }
  ]
}

When combined with a tool to auto-register subtypes by classpath-scanning, this provides a pure-Java means to extend parent types defined in upstream libraries (imagine a third-party project defines class Bot extends Identity)

Against number of fields, it is O(1) space and O(N) time (worst case being unique match always on last incoming field). It has all the obvious caveat-emptor about trusting the source of incoming data etc

The ideal goal would be to support this through the core typing annotations as @JsonTypeInfo(use = DEDUCTION). Please provide some feedback and let me know if it's worth my time to rewrite this full integration

@cowtowncoder
Copy link
Member

Wow. Thank you for contributing this! I will need to read through this with thought and see what I think wrt implementation -- the idea itself has been around for a while and is highly requested. I think #43 is the matching issue?


BitSet fingerprint = new BitSet(nextField.get() + properties.size());
for (BeanPropertyDefinition property : properties) {
fingerprint.set(fieldBitIndex.computeIfAbsent(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I think this is Java 8 feature, Jackson 2.x (at least until 2.12) can not make use of it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually wrote this POC as a standalone deserializer against 2.11 first so have the non J8 version already to hand (hardly rocket science). Are you saying that master (3.x) must still use Java 7 JDK ops so they can be backported (2.x)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, master (3.x) can use Java 8 features just fine. 2.x not. Actually I did not notice this was against master, so in that sense it would actually work.
But I think I would really like to get this in for 2.12 :)

@cowtowncoder
Copy link
Member

Interestingly simple, I like this as a starting point! I think I would like to extend support via dedicated annotation changes: this would require support on serialization side too (probably nothing too complex as it should be something like As.EXISTING_PROPERTY, i.e. nothing to add on output).

Question of duplicates is interesting, what to do; my initial instinct is to force an exception in such cases to avoid both odd failure cases ("why did it deserialize as X and not Y") and especially "accidentally working first, the not" (if choice of which subtype to use was not stable). But if no exception thrown, specific rule should be clear (like alphabetic ordering of full class name, take first).

@drekbour
Copy link
Contributor Author

Fail-fast is the goal here. If detected at initialisation then I simply didn't know what Exception class to throw so left it commented out. Please advise.

Proposed solution works only for mutually-exclusive subtypes where one and only one must match. There is no ordering requirement and it's simple to infer what's happening inside the box.
If we relax that we have the problem that either the runtime is simply ambiguous or the user explicitly registered ambiguous types (a parent as well as its children). I'll be honest - there are a few things to think about here and I'm ... not going to. An initial implementation using mutually-exclusive subtypes will make a lot of people happy without limiting future direction (I think posers call this "lean" :))

(!) I'll see what I can do to generalise this towards becoming @JsonTypeInfo(use = DEDUCTION) but I need to know if I should be branching from master or some 2.x branch.

@cowtowncoder
Copy link
Member

Agreed, simple mutually-exclusive version first makes sense. I think that we will actually get flattened hierarchy already (that is, @JsonSubTypes are recursively followed; and there is no way to define "nested" type information either).

I would really like to see this for 2.12, given that it has been requested quite often by users.

As to exception type, if none of existing exceptions under com.fasterxml.jackson.databind.exc (I think?) works, adding a new one would be fine. But I think there is something already for "unknown type id" or such, which perhaps should work -- and minor additions/modifications to exceptions should be fine too, as long as they are backwards-compatible.

@drekbour drekbour changed the base branch from master to 2.12 August 12, 2020 17:13
@drekbour drekbour changed the title POC polymorphic subtype deduction from available fields Polymorphic subtype deduction from available fields Aug 12, 2020
@drekbour
Copy link
Contributor Author

Depends upon annotation update here: FasterXML/jackson-annotations#175
Is otherwise good to go.

@cowtowncoder
Copy link
Member

Apologies for the delay: I finally managed to merge Java14 Record support and will focus on this big ticket item next!

@cowtowncoder
Copy link
Member

Merged in both for 2.12 and master, will be in 2.12.0!

@drekbour
Copy link
Contributor Author

Psst, you put my name in the release notes as Mark not Marc :)

@cowtowncoder
Copy link
Member

@drekbour Sorry about that. Fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants