Akka v1.4 Idle CPU usage increased comparing v1.3 #4434
Comments
@mralexes please see #4417. The issue was with the base Docker image, not with Akka.NET.
If that other issue turns out not to be the case, please comment here and let us know - we will want to look into it if it's really an Akka.NET issue.
@Aaronontheweb, please check the repro again. For both 1.3.18 and 1.4.6, I'm using the same image.
Ok, will do.
Ok, I can reproduce this:
Fixed the solution so now clusters actually form:
Haven't found anything obvious in the profiler that can explain the disparity in these numbers, so we're going to need to go line by line through the commit log and see which changes might be responsible. I've modified your reproduction so I can copy local Akka.NET builds into the v1.4 branch and run performance comparisons that way: https://github.com/Aaronontheweb/Akka.Net-versions-issue
Also made this: https://github.com/Aaronontheweb/AkkaIdleCpuStudy - to try to measure and profile the idle CPU utilization between v1.3.18 and v1.4.6 head to head, but I haven't been able to get any definitive results from it yet.
I tested with Akka 1.3 and older versions of dotnet core 3.1. This here accounts for 1/3 of the CPU: maybe the deadline is negative or very short, or the emptyEvent is getting signaled very frequently.
akka.net/src/core/Akka.Remote/Transport/DotNetty/BatchWriter.cs Lines 180 to 181 in bb4e7d9
This will most likely trigger the SpinWait inside DotNetty. I think that a looped SpinWait+EventHandle in an interval of 40ms. I already tried to disable the buffer feature. My test config was:
I have now tried disabling and enabling batching, without success. I also enabled log-transport; it started to log after the first node connection. CPU went much higher with only ~10 lines/s. I don't currently have any more clues to test.
akka {
  remote {
    dot-netty.tcp {
      log-transport = true
      batching {
        enabled = true
        flush-interval = 800ms
      }
    }
  }
}
Maybe we should try a local build with that commit reverted - just need to call WriteAndFlushAsync on the transport instead of letting the batcher do it. I can check when I get home.
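For a sense of scale, here is a quick back-of-the-envelope calculation in Python of how often a periodic flush timer would fire at the two intervals mentioned in this thread (40ms and the 800ms used in the config above). It assumes exactly one timer wakeup per interval, which is a simplification and not a statement about how BatchWriter is actually implemented; `wakeups_per_second` is a hypothetical helper made up for this illustration.

```python
def wakeups_per_second(flush_interval_ms: float) -> float:
    """Timer wakeups per second, ASSUMING one wakeup per flush interval."""
    if flush_interval_ms <= 0:
        # A zero or negative interval would mean a busy loop, not a timer.
        raise ValueError("flush interval must be positive")
    return 1000.0 / flush_interval_ms


if __name__ == "__main__":
    for interval_ms in (40, 800):
        rate = wakeups_per_second(interval_ms)
        print(f"{interval_ms}ms -> {rate} wakeups/s")
```

Under that assumption the 40ms interval gives 25 wakeups per second per connection and 800ms gives 1.25, so a 20x change in the interval would be expected to show up in the CPU numbers if the timer alone were the cause.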
@Aaronontheweb Is the behavior the same under Windows?
Additional observation: idle CPU usage is increasing over time. The screenshot contains CPU usage. NOTE: I did not check this on lower 1.4.x versions.
@mralexes Are you sure that's the same base image? I did the same and saw high idle usage with both Akka 1.3 and 1.4.
@mralexes yup, I'm sure. Same image. The CI pipeline always pulls layers (no caching enabled), so I'm sure the base image has been pulled every time as well.
dotnet/coreclr#27990 - might be related
I might have found the cause; it is related to dotnet/coreclr#26806
So I can validate that #4475 didn't resolve the issue - I can still recreate the problem running with the latest
With the changes I added here: CPU graph looks pretty normal again:
Looks like the issue is the way the
I suspect this might have something to do with it: Azure/DotNetty#522
Well, it's simpler than that - by separating flush and write into two separate calls in order to support batching, this effectively doubles the number of operations in the DotNetty executor pipeline when batching is disabled. I think I can work around this by changing the way the
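To make the "doubles the number of operations" point concrete, here is a toy model in Python. It is not DotNetty or Akka.NET code: `CountingExecutor`, `ToyChannel` and `tasks_for` are hypothetical names invented for this sketch, and the only thing it shows is that queueing a write and a flush as two executor tasks costs twice as many task executions per message as one combined write-and-flush task.

```python
from collections import deque


class CountingExecutor:
    """Toy single-threaded executor that counts every task it runs."""

    def __init__(self):
        self.queue = deque()
        self.executed = 0

    def submit(self, task):
        self.queue.append(task)

    def drain(self):
        while self.queue:
            self.queue.popleft()()
            self.executed += 1


class ToyChannel:
    """Hypothetical channel: writes are buffered, a flush moves them to the wire."""

    def __init__(self, executor):
        self.executor = executor
        self.buffer = []
        self.wire = []

    def _write(self, msg):
        self.buffer.append(msg)

    def _flush(self):
        self.wire.extend(self.buffer)
        self.buffer.clear()

    def write_then_flush(self, msg):
        # Two separate executor tasks per message.
        self.executor.submit(lambda: self._write(msg))
        self.executor.submit(self._flush)

    def write_and_flush(self, msg):
        # One combined executor task per message.
        def task():
            self._write(msg)
            self._flush()

        self.executor.submit(task)


def tasks_for(send_name, n_messages):
    """Send n_messages with the named method and return the executed task count."""
    executor = CountingExecutor()
    channel = ToyChannel(executor)
    send = getattr(channel, send_name)
    for i in range(n_messages):
        send(i)
    executor.drain()
    # Both strategies must deliver the same messages in the same order.
    assert channel.wire == list(range(n_messages))
    return executor.executed


if __name__ == "__main__":
    print("separate write + flush:", tasks_for("write_then_flush", 1000))
    print("combined write_and_flush:", tasks_for("write_and_flush", 1000))
```

For 1000 messages the model runs 2000 tasks with separate calls and 1000 with the combined call. Whether that ratio translates directly into idle CPU in the real pipeline is a separate question that this sketch does not answer.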
Info:
.NET Core 3.1
Akka 1.3.18 and Akka 1.4.6
Platforms:
Issue
After migrating from 1.3.18 to 1.4.6, we noticed that Akka clustering added more CPU overhead to our deployments. After a short investigation and comparison, we found that the newer version of Akka uses significantly more CPU when the system is idle.
Docker For Windows relative usage values:
AWS EKS absolute values (m5.large instance):
As you can see, the difference is almost 4x in absolute values on EKS.
Repro:
Use the code from the repository, run it via docker-compose, and check docker stats.
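For anyone who wants to compare the idle numbers between runs without reading them off the terminal, a small Python sketch along these lines may help. It relies on `docker stats --no-stream --format "{{.Name}} {{.CPUPerc}}"`, which prints one `<name> <cpu>%` line per container; `parse_cpu_lines` and `sample_docker_cpu` are hypothetical helper names, and container names are assumed to contain no whitespace.

```python
import subprocess


def parse_cpu_lines(text):
    """Parse lines of the form '<name> <cpu>%' into {name: cpu_percent}."""
    usage = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace: name first, CPU column last.
        name, cpu = line.rsplit(None, 1)
        usage[name] = float(cpu.rstrip("%"))
    return usage


def sample_docker_cpu():
    """Take one CPU sample for all running containers (requires the docker CLI)."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.Name}} {{.CPUPerc}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_cpu_lines(result.stdout)
```

Calling `sample_docker_cpu()` a few times while the cluster is idle, once per Akka version, gives per-container percentages that can be averaged and compared side by side.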
The repro contains three seed nodes with the following code: