OOM when using parallel stream computation #8
Comments
This is the reduced file, only 10 MB, and it still triggers an OOM on my 2 GB JVM.
It looks like memory copying caused by ForkJoin. I'd suggest dumping the memory distribution across the ForkJoin pool's threads; that should make the cause clear.
So parallel streams can't be used here then — they ate up 200× the memory.
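For what it's worth, here is a minimal sketch of the diagnostic direction suggested above: run the same pipeline inside a dedicated ForkJoinPool with a small fixed parallelism and print heap usage around it. The class name, pool size, and heap printout are assumptions for illustration, not the commenter's exact approach:

// Sketch: confine the pipeline to a dedicated ForkJoinPool and observe heap usage.
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Stream;

public class ParallelDistinctProbe {
    public static void main(String[] args) throws Exception {
        Runtime rt = Runtime.getRuntime();
        System.out.printf("heap used before: %d MB%n",
                (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024));

        // Submitting the terminal operation from inside a custom pool commonly makes
        // the parallel stream run on that pool's workers (an implementation detail,
        // not a documented guarantee), so memory growth is easier to attribute.
        ForkJoinPool pool = new ForkJoinPool(2);
        long uniqueWord = pool.submit(() -> {
            try (Stream<String> lines = Files.lines(
                    Paths.get("D:\\nlp\\dictionary\\CoreNatureDictionary.ngram.txt"),
                    Charset.defaultCharset())) {
                return lines.flatMap(line -> Arrays.stream(line.split(" ")))
                            .parallel()
                            .distinct()
                            .count();
            }
        }).get();
        pool.shutdown();

        System.out.println(uniqueWord);
        System.out.printf("heap used after: %d MB%n",
                (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024));
    }
}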
Thanks for your explanation — I've started using streams successfully.
While actually using a parallel stream, I ran into an OOM.
Roughly, I need to process a 40+ MB txt file whose content is newline-separated strings.
I use lines.flatMap(line -> Arrays.stream(line.split(" "))).parallel().distinct().count(); to count the distinct words in parallel (roughly three million of them, with entries like
一@一对一 5
一@一道 5
一@丁 6
一@七旬 8
一@万 157).
During execution, the parallel stream ate up all my memory and CPU: both the old and eden generations of my JVM were completely full, while the method area and threads were normal.
The serial stream and plain external iteration only use about 50 MB of memory.
Could the author advise: am I using the parallel stream incorrectly, or is this a bug in JDK 8 streams? --- The reproduction code is below (a sketch of the external-iteration baseline follows it).
// Required imports for this snippet:
// import java.io.IOException;
// import java.nio.charset.Charset;
// import java.nio.file.Files;
// import java.nio.file.Paths;
// import java.util.Arrays;
// import java.util.stream.Stream;

static void useStream() {
    long uniqueWord = 0;
    // Note: backslashes in the Windows path must be escaped in a Java string literal.
    try (Stream<String> lines =
            Files.lines(Paths.get("D:\\nlp\\dictionary\\CoreNatureDictionary.ngram.txt"), Charset.defaultCharset())) {
        long start = System.currentTimeMillis();
        // Split each line into tokens and count the distinct ones with a parallel stream.
        uniqueWord = lines.flatMap(line -> Arrays.stream(line.split(" "))).parallel().distinct().count();
        // uniqueWord = lines.flatMap(line -> Arrays.stream(line.split(" "))).distinct().count();
        System.out.println(uniqueWord);
        long end = System.currentTimeMillis();
        System.out.println("elapsed " + (end - start) + "ms");
    } catch (IOException e) {
        e.printStackTrace();
    }
}
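For comparison, here is a minimal sketch of the external-iteration version mentioned above (read line by line and collect tokens into a HashSet); the method name and the set-based counting are assumptions for illustration, not code from the original report:

// External-iteration baseline (sketch). Additional imports needed:
// java.io.BufferedReader, java.util.HashSet, java.util.Set.
static void useExternalIteration() {
    Set<String> words = new HashSet<>();
    try (BufferedReader reader = Files.newBufferedReader(
            Paths.get("D:\\nlp\\dictionary\\CoreNatureDictionary.ngram.txt"), Charset.defaultCharset())) {
        long start = System.currentTimeMillis();
        String line;
        while ((line = reader.readLine()) != null) {
            // Add every token; the HashSet keeps only distinct words in memory.
            for (String word : line.split(" ")) {
                words.add(word);
            }
        }
        System.out.println(words.size());
        System.out.println("elapsed " + (System.currentTimeMillis() - start) + "ms");
    } catch (IOException e) {
        e.printStackTrace();
    }
}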