XML#nodes(String) Method is Extremely Slow on Large Files #288

Closed
volodya-lombrozo opened this issue Jan 9, 2025 · 10 comments

Comments

@volodya-lombrozo
Contributor

volodya-lombrozo commented Jan 9, 2025

I have one large XML file and run the following Java code on it:

private static Collection<XML> findDefects(final XML report) {
  return report.nodes("/defects/defect");
}

It takes an extremely long time to run (minutes). Could you speed up this method, please?

Here is my XML:
Pointer.xmir.txt

The XML object is an instance of XMLDocument.

@volodya-lombrozo
Contributor Author

@yegor256 Could you have a look, please?

@yegor256
Member

yegor256 commented Jan 9, 2025

@volodya-lombrozo why do you think it may be faster?

@volodya-lombrozo
Contributor Author

@yegor256 I showed it here: objectionary/lints#210

@yegor256
Member

yegor256 commented Jan 9, 2025

@volodya-lombrozo how can we fix it here? Implement our own XPath engine? )

@volodya-lombrozo
Contributor Author

@yegor256 I haven't checked it yet.

@yegor256
Member

@volodya-lombrozo jcabi-xml is simply an adapter for the XPathFactory from the JDK. We don't make it any slower and we can't make it any faster.
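
To check the "adapter" claim, it helps to have a baseline that calls the JDK XPath machinery directly, with no wrapper in between. The following is a minimal sketch under stated assumptions: the class name `JdkXpathSketch`, its helper methods, and the tiny inline document are hypothetical and only stand in for the real report file. It is not the code of jcabi-xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical baseline: plain JDK XPath, no adapter library involved.
final class JdkXpathSketch {

    // Parses an XML string into a DOM document.
    public static Document parse(final String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates the same query as in the issue against the DOM document.
    public static NodeList defects(final Document doc) throws Exception {
        return (NodeList) XPathFactory.newInstance()
            .newXPath()
            .evaluate("/defects/defect", doc, XPathConstants.NODESET);
    }

    public static void main(final String[] args) throws Exception {
        // Assumption: a tiny stand-in document; replace with the real file to measure.
        final Document doc = parse("<defects><defect/><defect/></defects>");
        final long start = System.nanoTime();
        final NodeList found = defects(doc);
        final long micros = (System.nanoTime() - start) / 1_000;
        System.out.println(found.getLength() + " nodes in " + micros + " us");
    }
}
```

If the timing of this baseline on the large file is close to the timing of `XML#nodes(String)`, the cost is in the JDK evaluation itself rather than in the adapter.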

@volodya-lombrozo
Contributor Author

@yegor256 I'm not sure about that. For now, it seems that you are right. However, I took a quick look at the code and found that XMLDocument uses heavy synchronization that might negate the benefits of parallel execution:

synchronized (this.cache) {
    return (T) xpath.evaluate(query, this.cache, qname);
}

This issue requires at least some investigation.
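
The concern about the lock can be illustrated with a small experiment. This is a sketch only, assuming a hypothetical class `SharedLockSketch` that mimics the `synchronized (this.cache)` pattern quoted above; the counters `inside` and `peak` are added purely for illustration and do not exist in XMLDocument. It records how many threads are ever inside the evaluation at the same time.

```java
import java.io.StringReader;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in for a document wrapper that locks on its DOM.
final class SharedLockSketch {
    private final Document cache;
    // Illustration-only counters: threads currently inside, and the maximum seen.
    private final AtomicInteger inside = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    public SharedLockSketch(final String xml) throws Exception {
        this.cache = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Counts nodes matching the query while holding the lock on the DOM.
    public int count(final String query) {
        // XPath objects are not thread-safe, so each call creates its own.
        final XPath xpath = XPathFactory.newInstance().newXPath();
        synchronized (this.cache) {
            this.peak.accumulateAndGet(this.inside.incrementAndGet(), Math::max);
            try {
                final NodeList nodes = (NodeList) xpath.evaluate(
                    query, this.cache, XPathConstants.NODESET
                );
                return nodes.getLength();
            } catch (final XPathExpressionException ex) {
                throw new IllegalArgumentException(ex);
            } finally {
                this.inside.decrementAndGet();
            }
        }
    }

    // Highest number of threads observed inside the locked section at once.
    public int peak() {
        return this.peak.get();
    }

    public static void main(final String[] args) throws Exception {
        final SharedLockSketch doc =
            new SharedLockSketch("<defects><defect/><defect/></defects>");
        // Many parallel callers, one shared document.
        IntStream.range(0, 64).parallel().forEach(i -> doc.count("/defects/defect"));
        System.out.println(doc.peak()); // prints 1
    }
}
```

However many worker threads call `count`, the peak stays at 1: queries against one shared document run one at a time, so parallel callers only help when they work on different documents.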

@volodya-lombrozo
Contributor Author

By the way, we usually spend 81% of the time in the part of the code I showed above.
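
If most of the time goes into XPath evaluation, one possible workaround for a simple child-step path such as `/defects/defect` is to walk the DOM directly. The sketch below is an assumption-laden illustration: the class `DomWalkSketch` and its methods are hypothetical, it assumes the document uses no namespaces, and it is neither the jcabi-xml implementation nor the approach of any other library mentioned in this thread.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: resolves "/defects/defect" without an XPath engine.
final class DomWalkSketch {

    // Parses an XML string into a DOM document (not namespace-aware by default).
    public static Document parse(final String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the direct <defect> children of a <defects> root element.
    public static List<Element> defects(final Document doc) {
        final List<Element> found = new ArrayList<>();
        final Element root = doc.getDocumentElement();
        if (root == null || !"defects".equals(root.getNodeName())) {
            return found;
        }
        // Only direct children are visited, matching the two-step path.
        for (Node child = root.getFirstChild(); child != null;
            child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE
                && "defect".equals(child.getNodeName())) {
                found.add((Element) child);
            }
        }
        return found;
    }

    public static void main(final String[] args) throws Exception {
        final Document doc = parse(
            "<defects><defect id='1'/><note/><defect id='2'><defect/></defect></defects>"
        );
        System.out.println(defects(doc).size()); // prints 2
    }
}
```

This covers only fixed child paths. Predicates, axes, and functions still need a real XPath engine, so the walk is a narrow shortcut rather than a replacement.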

@yegor256
Member

@volodya-lombrozo this one may be closed, I believe, in favor of https://github.com/volodya-lombrozo/xnav

@volodya-lombrozo
Contributor Author

@yegor256 Yes, thank you!
