Files to RDF single graph component - OutOfMemoryError #984
-
Hello, could you please help me - what should I do when the "Files to RDF single graph" component throws OutOfMemoryError: Java heap space (after consuming 1.9 GB of memory)? The input file to this component is 40.63 MB.
Replies: 9 comments
-
Hello, could you maybe provide the file that caused this? RDF is memory intensive, but this seems excessive. If not, could you at least provide the input file format? Files to RDF Chunked will not help here, as it loads each file from the input into a separate chunk; it would therefore load the single input file into one chunk and probably run into the same issue. Could you also tell us how much memory your host has? There is a switch in the lp-etl executor startup script that limits the available memory, so that one might need to be raised.
-
According to docker stats, linkedpipes-executor has 5.783 GiB available, but it crashes when using about 1.9 GB. How can I configure the component to use more memory?
-
OK, if JSON-LD is your source, you might also want to try JSON-LD to RDF Titanium, which uses a different parsing library.
-
Unfortunately, this component also crashed with OutOfMemoryError: Java heap space (only after a longer time)... The whole time, it used around 1.7 GB of memory.
-
Well, I can say that I tried converting the file outside of LP-ETL using Apache Jena riot, and indeed the process took almost 4 GB of RAM but finished in 10-20 seconds. Apache Jena also uses Titanium as the JSON-LD parser. Can you try doing that on the server in question to see whether it also crashes with an OOM error? If it does, then this is out of scope for LP-ETL, as it relies on the rdf4j and Titanium libraries for JSON-LD parsing.
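For reference, here is a minimal sketch (not LP-ETL code; the file names are placeholders) of the kind of standalone conversion described above, using Apache Jena's RDFDataMgr to read JSON-LD and write Turtle:

```java
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFDataMgr;

public class JsonLdToTurtle {
    public static void main(String[] args) throws Exception {
        // Parse the JSON-LD input into an in-memory model.
        Model model = RDFDataMgr.loadModel("input.jsonld", Lang.JSONLD);
        // Serialize the parsed graph as Turtle.
        try (OutputStream out = new FileOutputStream("output.ttl")) {
            RDFDataMgr.write(out, model, Lang.TURTLE);
        }
    }
}
```

Run it with a generous heap (e.g. -Xmx4G on the java command line) to reproduce the memory footprint mentioned above.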
-
I am not able to do that currently... but as I understand it, that would not solve the problem anyway. You wrote something about a switch in the lp-etl executor startup script that limits the available memory - where can I find that? There is still free memory available when the component crashes.
-
I did not realize you are using the Docker version - there, I am not sure if anything can be done, as it should normally use all the available memory. @skodapetr ? When run outside of Docker, the switch is in the executor script.
-
Since Java 8, Java is capable of detecting a container runtime and respecting the maximum memory setting. Yet it seems that by default only 1/4 of the available memory is used as the maximum heap size. For more information, see How To Configure Java Heap Size Inside a Docker Container. A solution would be to introduce a JAVA_OPTS environment variable and thus allow the user to customize Java options, including memory. Unfortunately, this option is not implemented in LinkedPipes:ETL at this moment. A quick fix might be to build the Docker images locally with a modified CMD command that adds the -Xmx6G option.
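To check which heap limit is actually in effect inside the container (i.e. whether the 1/4-of-RAM default or an explicit -Xmx applies), a small hypothetical check like the following can be run with the same Java options as the executor:

```java
public class HeapCheck {
    public static void main(String[] args) {
        // Maximum heap the JVM will attempt to use, as set by -Xmx
        // or derived from the container memory limit.
        long maxHeapBytes = Runtime.getRuntime().maxMemory();
        System.out.printf("Max heap: %.2f GiB%n",
                maxHeapBytes / (1024.0 * 1024.0 * 1024.0));
    }
}
```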
-
Thank you, we succeeded in increasing the memory for the executor, and it now finishes the task without any problems.