Handling of cross-document refs has an important impact on performance #19

Open
cbrun opened this issue Mar 15, 2023 · 3 comments · May be fixed by #44

cbrun commented Mar 15, 2023

Measured on a decent-sized model that has no cross-document references but quite a few internal references:
EObjects: 3894
References with values: 3121

And using:
String json = JsonResourceImpl.toJson(r, Collections.EMPTY_MAP);

Serialization of my model takes 77606 ms (about 77 s), which seems a lot compared to serialization/deserialization using BinaryResource, which takes about 400 ms.

String size:7126813 time: 77606 ms.

After profiling in tracing mode, in my case it all comes down to org.eclipse.sirius.emfjson.utils.GsonEObjectSerializer.docKindMany(EObject, EReference)
and the following code, which I assume tries to detect whether there is a cross-document reference to serialize:

        Iterator<? extends InternalEObject> it = internalEList.iterator();
        while (referenceType != GsonEObjectSerializer.SKIP && referenceType != GsonEObjectSerializer.CROSS_DOC && it.hasNext()) {
            InternalEObject internalEObject = it.next();
            if (internalEObject.eIsProxy()) {
                referenceType = GsonEObjectSerializer.CROSS_DOC;
            } else {
                Resource resource = internalEObject.eResource();
                if (resource != this.helper.getResource() && resource != null) {
                    referenceType = GsonEObjectSerializer.CROSS_DOC;
                }
            }
        }

The brute-force approach of commenting out this code (which should not be needed in my case) leads to

 String size:7126813 time: 746 ms.

which is about 100 times faster and gives the exact same result.


cbrun commented Mar 15, 2023

I guess some logic in org.eclipse.emf.ecore.resource.impl.BinaryResourceImpl.EObjectOutputStream.saveEObject(InternalEObject, Check) could be reused, as it seems to handle the cross-document reference case correctly while keeping good performance.


cbrun commented Mar 17, 2023

After a slightly deeper analysis: the calling code resolves the reference's list value and iterates over it, and the submethod docKindMany(EObject, EReference) does that again for each element, leading to n*n complexity, with n being the size of my list.
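
To make the n*n cost concrete, here is a minimal, self-contained sketch that only counts how many list elements get inspected. It does not use EMF: `hasExternal` is a hypothetical stand-in for docKindMany, and "external" values are modelled as negative integers purely for illustration. With internal-only values the scan never exits early, so repeating it per element costs n*n visits, while running it once costs n.

```java
import java.util.ArrayList;
import java.util.List;

public class QuadraticScanDemo {
    // Total number of list elements inspected so far.
    static long visits = 0;

    // Hypothetical stand-in for docKindMany: scans the list looking for an
    // "external" value (modelled here as a negative number). When every value
    // is internal there is no early exit, so each call inspects all n elements.
    static boolean hasExternal(List<Integer> values) {
        for (int v : values) {
            visits++;
            if (v < 0) {
                return true;
            }
        }
        return false;
    }

    // Builds a list of n internal-only (non-negative) values.
    static List<Integer> internalOnly(int n) {
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            values.add(i);
        }
        return values;
    }

    // Shape described in the issue: the scan is repeated for every value.
    public static long visitsWhenRecomputed(int n) {
        List<Integer> values = internalOnly(n);
        visits = 0;
        for (int i = 0; i < values.size(); i++) {
            hasExternal(values);
        }
        return visits;
    }

    // Same scan performed once, before iterating over the values.
    public static long visitsWhenHoisted(int n) {
        List<Integer> values = internalOnly(n);
        visits = 0;
        hasExternal(values);
        return visits;
    }

    public static void main(String[] args) {
        System.out.println(visitsWhenRecomputed(1000)); // prints 1000000
        System.out.println(visitsWhenHoisted(1000));    // prints 1000
    }
}
```

The real method also has to resolve each value and look up its resource, so the per-visit cost there is likely much higher than in this counter-only sketch.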

cbrun added a commit to cbrun/sirius-emf-json that referenced this issue Mar 17, 2023

pcdavid commented Jun 24, 2024

What's even stranger is that not only is docKindMany(EObject, EReference) invoked on every iteration of the loop, but its result does not even depend on the current iteration's value!

serializeMultipleNonContainmentEReference is using the result of docKindMany(EObject, EReference) to decide how to serialize the reference towards each value, but for a given reference, it will make the same choice for every value, whether it is internal or not.

  • Either all values in the ref are internal, and we'll use the SAME_DOC branch for all (with the cost of recomputing docKindMany for each).
  • Or at least one value in the ref is external, and we'll use the CROSS_DOC branch for all (again incurring the cost), even for potential internal refs in case of "mixed" refs.

I'll propose a PR to clean this up.
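
The two cases above suggest classifying the reference once and then applying the same branch to every value. The following is only a rough sketch of that shape, not the actual change in the PR: `Target`, `classify`, `serialize`, and the output strings are all hypothetical stand-ins (a real EObject, proxy handling, and the actual JSON format are not modelled).

```java
import java.util.ArrayList;
import java.util.List;

public class ClassifyOnceSketch {
    public enum DocKind { SAME_DOC, CROSS_DOC }

    // Hypothetical stand-in for a referenced object: an id plus the URI of
    // the resource that contains it.
    public static final class Target {
        final String id;
        final String resourceUri;

        public Target(String id, String resourceUri) {
            this.id = id;
            this.resourceUri = resourceUri;
        }
    }

    // One pass over the values: CROSS_DOC as soon as any value lives in
    // another resource, SAME_DOC otherwise.
    public static DocKind classify(List<Target> values, String currentUri) {
        for (Target t : values) {
            if (!currentUri.equals(t.resourceUri)) {
                return DocKind.CROSS_DOC;
            }
        }
        return DocKind.SAME_DOC;
    }

    // The kind is computed once per reference, then reused for every value.
    // The string formats below are assumptions made for this illustration.
    public static List<String> serialize(List<Target> values, String currentUri) {
        DocKind kind = classify(values, currentUri);
        List<String> out = new ArrayList<>();
        for (Target t : values) {
            if (kind == DocKind.SAME_DOC) {
                out.add(t.id);
            } else {
                out.add(t.resourceUri + "#" + t.id);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Target> mixed = List.of(new Target("a", "doc1"), new Target("b", "doc2"));
        // A mixed reference uses the CROSS_DOC form for every value,
        // including the internal one.
        System.out.println(serialize(mixed, "doc1")); // prints [doc1#a, doc2#b]
    }
}
```

This keeps the behaviour described in the comment (one choice per reference, even for "mixed" references) while scanning the list only once for classification.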

pcdavid added a commit that referenced this issue Jun 24, 2024
This should help mostly in presence of references with high
cardinalities as it avoids an O(n*n) iteration when serialising them.

Bug: #19
Signed-off-by: Pierre-Charles David <[email protected]>
@pcdavid pcdavid linked a pull request Jun 24, 2024 that will close this issue
@pcdavid pcdavid self-assigned this Jun 24, 2024
pcdavid added a commit that referenced this issue Jun 28, 2024
This should help mostly in presence of references with high
cardinalities as it avoids an O(n*n) iteration when serialising them.

Bug: #19
Signed-off-by: Pierre-Charles David <[email protected]>
@sbegaudeau sbegaudeau changed the title [PERFO] Handling of cross-document refs has an important impact on performance Handling of cross-document refs has an important impact on performance Jul 10, 2024