Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use JPSG instead of gen.py in eclipse-collections; extract some common functionality into primitive-collections-base module #51

Merged
merged 7 commits into from
Oct 21, 2019

Conversation

leventov
Copy link

This PR is half of the work that should be done towards #38.

It extracts common functionality from eclipse-collections module into primitive-collections-base.

It also replaces the use of eclipse-collections' gen.py with JPSG which allows removing huge amount of repetitive code across primitive collection combinations (especially maps).

…n functionality into primitive-collections-base module
@cowtowncoder
Copy link
Member

Ok, so I think this makes sense at high level. Just couple of questions:

  1. Would it make sense to separate JPSG change from others -- also I'd like @yawkat 's input on this change
  2. What would make most sense wrt 2.10 vs 3.0: adding base classes makes sense for 3.0 (master), but I wonder if there's any intermediate step that would make sense for 2.10 -- maybe base classes could be for new fastutil module, and 3.0 would then refactor existing ones (include eclipse-collections) to use it?

@yawkat
Copy link
Member

yawkat commented May 22, 2019

I agree with this change in principle, the python script was a hack, this is much better. The code looks good too. My only concern is compatibility, I'm not entirely sure the test suite I wrote covers everything, but these changes look solid enough

@yawkat
Copy link
Member

yawkat commented May 22, 2019

Oh also, the package name is a bit odd (underscores) and against the google style guide, but I don't know how jackson handles that (it doesn't bother me too much)

@cowtowncoder
Copy link
Member

Ok so if JPSG change could be applied against 2.10, that'd be great I think. And maybe separate it into different PR. Would be happy to merge (I assume we have a CLA from you @leventov ? If not -- I can verify later when I have access to filed ones -- template is here https://github.com/FasterXML/jackson/blob/master/contributor-agreement.pdf )

@leventov
Copy link
Author

What would make most sense wrt 2.10 vs 3.0: adding base classes makes sense for 3.0 (master), but I wonder if there's any intermediate step that would make sense for 2.10 -- maybe base classes could be for new fastutil module, and 3.0 would then refactor existing ones (include eclipse-collections) to use it?

Am I understand right that you propose to bring fastutil to 2.10 by just adding the new module and the base module without basically touching the eclipse-collections module, while restructuring eclipse-collections module only in master? Then, this PR (to master) shouldn't be changed much?

I can create a different PR to 2.10, but separating JPSG change would be hard because unfortunately, I was doing the changes (JPSG and refactoring into a basic module) in parallel and they are not quite a clear cut, so if possible I would like to avoid doing that.

I've sent CLA to [email protected].

@cowtowncoder
Copy link
Member

Correct, I was thinking of adding with minimal changes in 2.10.

However. If and when changes are only for Eclipse-collections and not for earlier added collections (...until master), actually that is fine: EC is officially only added in 2.10, so I think even major refactoring is fine.

So getting it all in for 2.10 just for EC/Fastutil is fine.

@leventov
Copy link
Author

Is there something more that has to be done in this PR?

@@ -156,9 +155,11 @@
// are faster to construct.

// initialized below
static final Map<Class<? extends PrimitiveIterable>, JsonDeserializer<?>> PRIMITIVE_DESERIALIZERS = new HashMap<>();
static final Map<Class<? extends PrimitiveIterable>, JsonDeserializer<?>> PRIMITIVE_DESERIALIZERS
= new IdentityHashMap<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason to use IdentityHashMap here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IdentityHashMap should be a little faster in this case, which may not be a totally pointless optimization because getting serializers/deserializers by class is an operation on the hot path. I use IdentityHashMap in new code (PrimitivePrimitiveMapSerializers - in this PR, and in the (yet unchecked) fastutil module which should come in a follow-up PR), so I aligned existing code to use IdentityHashMap for consistency, too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be good to measure this. I do remember Class<?> lookup being on hot path because it does not override default equals() or hashCode(), but I also recall IdentityHashMap itself not being super efficient. What core databind does is actually uses ClassKey type, but while I think that removed a hot spot I don't think I ever properly verified its effect with JMH.

Having said all of that, I do not think this lookup should be on critical path as DeserializerCache used by jackson-databind should handle caching by type and this provider should not be called more than once.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, reverted to use HashMap

@SuppressWarnings("rawtypes")
static final Set<Class<? extends InternalIterable>> REFERENCE_TYPES = new HashSet<>();
static final Set<Class<? extends InternalIterable>> REFERENCE_TYPES =
Collections.newSetFromMap(new IdentityHashMap<>());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here... what is the point of change?

@cowtowncoder
Copy link
Member

Added one question (well 2 but basically the same). But aside from that, if the hope is to get this in 2.10, it'd need to be based from 2.10. Would it be possible to rebase changes to start from 2.10 branch instead of master?

@leventov
Copy link
Author

Although this PR now passes CI, it must not be merged quite yet, because I suspect that module-info.java is not correct. #61 for details.

@leventov
Copy link
Author

@cowtowncoder I've retained requires transitive in primitive-collections-base's module-info.java because it seems reasonable. Also added an entry to eclipse-collections' module-info.java.

I believe now this PR should be ready for merge.

@cowtowncoder
Copy link
Member

Ok I hope to have another look, will add to my TODO list.
@yawkat Any other last comments?

<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<excludes>
<exclude>com/fasterxml/jackson/**/failing/*.java</exclude>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still required? I assume this was for #54

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied this from jackson-datatype-collections/pom.xml. It is a general config, apparently, for local development, where failing tests could be temporarily moved to ...failing package. Not related to #54.

However, this config is inherited from jackson-datatype-collections/pom.xml in eclipse-collections.xml anyway, so I removed it here.

I've also found that guava, pcollections, and primitive-collections-base lacked the spec of maven-surefire-plugin and the tests were not run during install goal. Added maven-surefire-plugin to the config of these modules.

@yawkat
Copy link
Member

yawkat commented Oct 20, 2019

I have two comments.

  • Not sure if this came up, but is the package name primitive_collections_base okay? I know some but not all code styles would forbid it.
  • The generated artifacts look odd to me. First, the binary artifacts of both the eclipse-collections and the primitive-collections-base modules seem to contain the classes in the primitive_collections_base package. Second, the source artifacts (as generated by source:jar) seem to contain the generated code in the jpsg top-level package which does not match the location in the binary jar. Not sure if this is intentional.

…ve-collections-base; remove unnecessary surefire plugin config in eclipse-collections
…collections-base/pom.xml to align with the rest of the project
@leventov
Copy link
Author

@yawkat

Second, the source artifacts (as generated by source:jar) seem to contain the generated code in the jpsg top-level package which does not match the location in the binary jar. Not sure if this is intentional.

It seems to be the way Maven's source code generation convention works. I don't see what I could change in this code: https://github.com/TimeAndSpaceIO/jpsg-maven-plugin/blob/master/src/main/java/io/timeandspace/jpsg/GeneratorMojo.java to fix this, even if this is considered a problem. Presumably, Antlr and other code generators work the same way. Theoretically, it's possible to specify generic generated-sources/ output directory in plugin configuration instead of generated-sources/jpsg/, but that's not the best practice because it would make unclear from where certain generated sources originate.

guava/pom.xml Outdated
@@ -57,6 +57,10 @@ com.google.common.*;version="${version.guava.osgi}",
<groupId>org.moditect</groupId>
<artifactId>moditect-maven-plugin</artifactId>
</plugin>
<plugin>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this would be needed: parent pom (jackson-datatypes-collections) has it with exclusion for failing (which is to allow running failing-for-now tests from IDE without blocking build).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mvn clean install output with and without this addition is different. Apparently, Guava's tests are not run without it. The spec in jackson-datatypes-collections is in <pluginManagement> element, while here it's in <build><plugins>.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is odd: for me mvn clean install from main level does run all the tests on master.
Note, too, that none of child projects has separate surefire run settings and tests are run.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. maven-surefire-plugin config in <build><plugins> is actually inherited from Jackson's oss-parent pom. This could be verified by generating Effective Poms. And now, I cannot reproduce the tests not running without these configs (they run now for me), maybe it was my mental glitch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Stuff happens, I just wanted to make sure I understood reasoning -- I would not want tests to be accidentally skipped for sure.

I hope to merge this soon, today or tomorrow. Thank you for all the work here!

@@ -46,6 +46,10 @@ PCollections (http://pcollections.org/) types
<groupId>org.moditect</groupId>
<artifactId>moditect-maven-plugin</artifactId>
</plugin>
<plugin>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here, parent has it.

@cowtowncoder
Copy link
Member

cowtowncoder commented Oct 20, 2019

One more question: would hppc depend on the new primitive-base jar in future? Or just the fastutils one.

@leventov
Copy link
Author

One more question: would hppc depend on the new primitive-base jar in future? Or just the fastutils one.

Personally, I plan just to add a module for fastutils as the next step.

@cowtowncoder cowtowncoder merged commit 66f7307 into FasterXML:master Oct 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants