Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate removal of locks in java.lang.Thread #20414

Open
babsingh opened this issue Oct 25, 2024 · 5 comments
Open

Investigate removal of locks in java.lang.Thread #20414

babsingh opened this issue Oct 25, 2024 · 5 comments

Comments

@babsingh
Copy link
Contributor

babsingh commented Oct 25, 2024

Issue

High lock contention can due to the below locks in j.l.Thread class.

JDK8, 11, 17 (Thread.java from the OpenJ9 repo is used)
private static final class ThreadLock {}
private Object lock = new ThreadLock();

JDK21+ (Thread.java from the extension repo is used)
final Object interruptLock = new Object();
  • In JDK8/11/17, there are ~15 synchronized blocks associated to the above lock.
  • In JDK21+, there are ~13 synchronized blocks associated to the lock.

By relaxing locking in some j.l.Thread APIs, 10% perf improvement was observed in a Netty application run through the Quarkus framework.

Goal

There is an opportunity to improve the perf of OpenJ9's j.l.Thread APIs by either removing the lock in some cases or utilizing separate locks for some critical sections. This issue has been opened to initiate a discussion on the best approach to improve perf by not hindering functionality.

Existing case (see #17251): We are improving Thread.getState by removing the need of a lock and complex work to evaluate the state.

Potential new cases:

  • Remove the lock in places where we just read the fields (e.g. isAlive, isDead).
  • The global lock should only be used in start, join and other key methods. On the other hand, the interruptLock should only be used for interrupt specific work.
  • Some methods don't need to be thread-safe, but the above global lock is used (e.g. setPriority, setDaemon).
Copy link

Issue Number: 20414
Status: Open
Recommended Components: comp:vm, comp:gc, comp:jclextensions
Recommended Assignees: jasonfengj9, babsingh, gacholio

@ThanHenderson
Copy link
Contributor

A few of these synchronized(interruptLock) were quite recently added, mostly to resolve crashes/race conditions. (e.g. PR 1 and PR 2). Those PRs use the lock in, I think, appropriate ways; locking w.r.t. the thread interrupt implementation.

However, PR 1 states that the crashes occurred due to unexpected/inconsistent state of the eetop/threadRef value. In our Thread.java extension repo code, we are protecting most (though not all) eetop reads with the synchronized(interruptLock), do we need to be using the interruptLock to guard eetop access?

In JDK21+, there are ~13 synchronized blocks associated to the lock.

This value may be over estimating the problem slightly since synchronized blocks are reentrant and some of these paths overlap. (e.g. isDead() is only ever called from getState()).

By relaxing locking in some j.l.Thread APIs, 10% perf improvement was observed in a Netty application run through the Quarkus framework.

Which locks were removed for this testing?

babsingh added a commit to babsingh/openj9 that referenced this issue Oct 25, 2024
threadRef = NO_REF assigned will happen once, and similarly, the
started field is also set once during initialization.

A user of these methods will probably invoke them multiple times
until the desired result is achieved. A delay in getting the most
up-to-date value should not hinder the functionality of these
methods.

To sum up, removing synchronization in these methods improves perf
by reducing lock contention without hindering the functionality.

Related: eclipse-openj9#20414

Signed-off-by: Babneet Singh <[email protected]>
@babsingh
Copy link
Contributor Author

In JDK17, contention is seen in the following code path:

[]-----------------  24  1      0   0.00  48.76           0       29728    io/vertx/core/http/impl/AssembledHttpResponse.release()Z
 +[]---------------  25  1      0   0.00  48.76           0       29728    io/netty/buffer/AbstractReferenceCountedByteBuf.release()Z
   +[]-------------  26  1      0   0.00  48.76           0       29728    io/netty/buffer/AbstractReferenceCountedByteBuf.handleRelease(Z)Z
     +[]-----------  27  1      0   0.00  48.76           0       29728    io/netty/buffer/PooledByteBuf.deallocate()V
       +[]---------  28  1      0   0.00  48.76           0       29728    io/netty/util/Recycler$DefaultHandle.unguardedRecycle(Ljava/lang/Object;)V
         +[]-------  29  1      0   0.00  48.76           0       29728    io/netty/util/Recycler$LocalPool.release(Lio/netty/util/Recycler$DefaultHandle;Z)V
           +[]-----  30  1      0   0.00  48.76           0       29728    io/netty/util/Recycler$LocalPool.isTerminated(Ljava/lang/Thread;)Z
             +[]---  31  1      0   0.00  48.76           0       29728    java/lang/Thread.isAlive()Z
               +[]-  32  1      0  48.76  48.76       29728       29728    (java/lang/Thread$ThreadLock)

This contention should be fixed by #20415.

@babsingh
Copy link
Contributor Author

In JDK21, contention is seen in a different code path:

[]-------------------   0  1      1   0.00  33.41           0     2009748    0000dd0b_executor-thread-1_pidtid
 +[]-----------------   1  1      0   0.00  33.41           0     2009748    java/lang/Thread.run()V
   +[]---------------   2  1      0   0.00  33.41           0     2009748    java/lang/Thread.runWith(Ljava/lang/Object;Ljava/lang/Runnable;)V
     +[]-------------   3  1      0   0.00  33.41           0     2009748    io/netty/util/concurrent/FastThreadLocalRunnable.run()V
       +[]-----------   4  1      0   0.00  33.41           0     2009748    org/jboss/threads/ThreadLocalResettingRunnable.run()V
         +[]---------   5  1      0   0.00  33.41           0     2009748    org/jboss/threads/DelegatingRunnable.run()V
           +[]-------   6  1      0   0.00  33.41           0     2009748    org/jboss/threads/EnhancedQueueExecutor$ThreadBody.run()V
             +[]-----   7  1      0   0.00  33.41           0     2009748    java/lang/Thread.interrupted()Z
               +[]---   8  1      0   0.00  33.41           0     2009748    java/lang/Thread.getAndClearInterrupt()Z
                 +[]-   9  1      0  33.41  33.41     2009748     2009748    (java/lang/Object)

A synchronized block was added in getAndClearInterrupt to fix #15465. We might not be able to remove synchronization, but we can relax it in the below manner:

    boolean getAndClearInterrupt() {
        boolean oldValue = interrupted;
        if (oldValue) {
            synchronized(interruptLock) { <--- lock will only be acquired in the true case
                oldValue = interrupted;
                if (oldValue) {
                    interrupted = false;
                    clearInterruptEvent();
                }
            }
        }
        return oldValue;
    }

babsingh added a commit to babsingh/openj9 that referenced this issue Oct 25, 2024
threadRef = NO_REF assignment will happen once, and similarly, the
started field is also set once during initialization.

A user of these methods will probably invoke them multiple times
until the desired result is achieved. A delay in getting the most
up-to-date value should not hinder the functionality of these
methods.

To sum up, removing synchronization in these methods improves perf
by reducing lock contention without hindering the functionality.

Related: eclipse-openj9#20414

Signed-off-by: Babneet Singh <[email protected]>
babsingh added a commit to babsingh/openj9 that referenced this issue Oct 25, 2024
threadRef = NO_REF assignment will happen once, and similarly, the
started field is also set once during initialization.

A user of these methods will probably invoke them multiple times
until the desired result is achieved. A delay in getting the most
up-to-date value should not hinder the functionality of these
methods.

To sum up, removing synchronization in these methods improves perf
by reducing lock contention without hindering the functionality.

Reference to PR 1FJMO7Q has been removed and a comment related to
ThreadGroup has been updated since the code seems to have evolved
since the comments were written.

Related: eclipse-openj9#20414

Signed-off-by: Babneet Singh <[email protected]>
babsingh added a commit to babsingh/openj9 that referenced this issue Oct 25, 2024
threadRef = NO_REF assignment will happen once, and similarly, the
started field is also set once during initialization.

A user of these methods will probably invoke them multiple times
until the desired result is achieved. A delay in getting the most
up-to-date value should not hinder the functionality of these
methods.

To sum up, removing synchronization in these methods improves perf
by reducing lock contention without hindering the functionality.

Reference to PR 1FJMO7Q has been removed and a comment related to
ThreadGroup has been updated since the code seems to have evolved
since the comments were written.

Related: eclipse-openj9#20414

Signed-off-by: Babneet Singh <[email protected]>
babsingh added a commit to babsingh/openj9 that referenced this issue Oct 25, 2024
Made threadRef volatile so that the most up-to-date value is seen
by all the threads.

threadRef = NO_REF assignment will happen once, and similarly, the
started field is also set once during initialization.

A user of these methods will probably invoke them multiple times
until the desired result is achieved. A delay in getting the most
up-to-date value should not hinder the functionality of these
methods.

To sum up, removing synchronization in these methods improves perf
by reducing lock contention without hindering the functionality.

Reference to PR 1FJMO7Q has been removed and a comment related to
ThreadGroup has been updated since the code seems to have evolved
since the comments were written.

Related: eclipse-openj9#20414

Signed-off-by: Babneet Singh <[email protected]>
@ThanHenderson
Copy link
Contributor

ThanHenderson commented Oct 25, 2024

I'm not sure as to why we would have had locks around eetop, especially (not?) interruptLock. Since eetop is volatile, reads-from and writes-to the variable are atomic and done infrequently. I think this may have been residual of a previous implementation when eetop wasn't marked as volatile (like how threadRef was until #20415).

I think we can hoist the reads and writes of eetop, started, etc, out of the synchronized blocks, and remove them the synchronized blocks where appropriate. I can get some PRs open for this once #20415 the is merged.

babsingh added a commit to babsingh/openj9-openjdk-jdk21 that referenced this issue Oct 28, 2024
Lock contention in Thread.getAndClearInterrupt is reduced by
acquiring the lock only when the value of the Thread.interrupted
field is true.

Related: eclipse-openj9/openj9#20414

Signed-off-by: Babneet Singh <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants