Prevent Thread Oversubscription in Cluster Setting #4
base: dev
yet the number of threads is still way over 8. Need to understand this better to choose good command line arguments.
In the current state, we get an average time of 20-30 seconds per bootstrap over the date range on the test data, given I submitted the job with
This is about the same runtime I have on my local machine, where I've wrapped only the handful of lines that do the actual deconvolution in the From the
The When run locally on the end user machine - e.g. my OSX machine - the control does nothing. So, Lollipop runs fine.
Should we allow for fine-grained thread control on the user side?
Potential Benefits:
Negative Impact:
====
Testing runtime with Ivan's runtime for this week's processing with
Conclusion
On the test data Let's stick with
d2587fd to c2dcbc1
In a cluster setting, thread oversubscription while running the deconvolution with `scipy.optimize` can lead to significant performance degradation and resource contention. This commit addresses the issue by using the `threadpoolctl` library to limit the number of threads to 1. This ensures that each process uses only its allocated resources, preventing contention and improving overall cluster stability.
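The thread limit described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper name `run_single_threaded` is hypothetical, and the fallback to a no-op context when `threadpoolctl` is missing is an assumption made so the snippet runs anywhere.

```python
from contextlib import nullcontext

try:
    # threadpoolctl caps the native thread pools (BLAS, OpenMP) used by
    # NumPy/SciPy from within Python.
    from threadpoolctl import threadpool_limits
except ImportError:
    # Assumption for this sketch only: degrade to "no limit" if the
    # library is not installed.
    threadpool_limits = None


def run_single_threaded(func, *args, **kwargs):
    """Call func with native thread pools capped at one thread.

    Hypothetical helper: in the PR, the wrapped call would be the
    handful of lines that perform the actual deconvolution.
    """
    if threadpool_limits is not None:
        ctx = threadpool_limits(limits=1)
    else:
        ctx = nullcontext()
    with ctx:
        return func(*args, **kwargs)


if __name__ == "__main__":
    # Stand-in workload; any callable can be wrapped the same way.
    print(run_single_threaded(sum, [1, 2, 3]))
```

The limit only applies inside the `with` block, so code outside the wrapped call keeps its default threading behaviour.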
c2dcbc1 to aa830bf
@DrYak Let's merge?
Aim: Prevent thread oversubscription in cluster settings in deconvolution by explicit threading control.
The nested parallelism that
introduced leads to fantastic speedups, yet may lead to an oversubscription of threads in a cluster setting where the number of threads is rigorously enforced.
The nested parallelism is due to Python's `multiprocessing` and the nested `scipy.optimize` within the deconvolution/regression. The internal `scipy.optimize` is known to grab threads in such a setting, which may lead to stalling on the cluster.
Objectives:
--cpus-per-task
threads=1
works well on cluster and local)
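To make the oversubscription arithmetic concrete, here is a small sketch under stated assumptions: the scheduler is assumed to be SLURM (which exports `SLURM_CPUS_PER_TASK` when `--cpus-per-task` is given), and the function names are hypothetical, not part of Lollipop. With nested parallelism, the thread count is the number of worker processes multiplied by the threads each worker's numerical library spawns.

```python
import os


def allocated_cpus(env=None):
    """Return the CPU allocation for this task.

    Assumption: on SLURM, --cpus-per-task is exposed as the
    SLURM_CPUS_PER_TASK environment variable. Outside a job we fall
    back to the local core count.
    """
    env = os.environ if env is None else env
    value = env.get("SLURM_CPUS_PER_TASK")
    if value is not None:
        return int(value)
    return os.cpu_count() or 1


def total_threads(n_processes, threads_per_process):
    """Threads in flight when each worker process spawns its own pool."""
    return n_processes * threads_per_process


def is_oversubscribed(n_processes, threads_per_process, allocated):
    """True if nested parallelism exceeds the allocated CPUs."""
    return total_threads(n_processes, threads_per_process) > allocated


if __name__ == "__main__":
    cpus = allocated_cpus({"SLURM_CPUS_PER_TASK": "8"})
    # 8 worker processes, each letting its library use 8 threads.
    print(total_threads(8, 8), is_oversubscribed(8, 8, cpus))
    # Same workers with inner threads capped at 1.
    print(total_threads(8, 1), is_oversubscribed(8, 1, cpus))
```

Capping the inner thread count at 1 keeps the total equal to the number of worker processes, which is why matching the worker count to the allocation avoids oversubscription.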