Performance regression from v9.7 -> v9.8-v9.11 #4166
Comments
how many workers are you using ? |
the hint is incomplete. Can you improve it ? |
I have consistent results with these parameters: fix_variables_to_their_hinted_value:true,num_workers:10,use_feasibility_jump:false,use_rins_lns:false,use_feasibility_pump:false,cp_model_probing_level:0 It seems to solve consistently in 6-7s. |
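For readers who want to try the parameter string above from Python, here is a minimal sketch. `split_params` and `make_solver` are hypothetical helper names, not part of OR-Tools. The sketch assumes that `solver.parameters` is a `SatParameters` protobuf message (as in the 9.x releases discussed here), so that protobuf's `text_format` can parse the comma-separated string directly.

```python
from typing import Dict

# Parameter string exactly as quoted in the comment above.
PARAMS = (
    "fix_variables_to_their_hinted_value:true,num_workers:10,"
    "use_feasibility_jump:false,use_rins_lns:false,"
    "use_feasibility_pump:false,cp_model_probing_level:0"
)


def split_params(spec: str) -> Dict[str, str]:
    """Split 'name:value,name:value' into a dict (hypothetical helper)."""
    pairs = (item.split(":", 1) for item in spec.split(",") if item.strip())
    return {name.strip(): value.strip() for name, value in pairs}


def make_solver(spec: str):
    """Return a CpSolver configured from `spec` (needs ortools installed).

    Assumption: solver.parameters is a SatParameters protobuf message, so
    text_format can merge the comma-separated string into it.
    """
    from google.protobuf import text_format
    from ortools.sat.python import cp_model

    solver = cp_model.CpSolver()
    text_format.Parse(spec, solver.parameters)
    return solver


if __name__ == "__main__":
    # Print the individual settings so they can be reviewed one by one.
    for name, value in split_params(PARAMS).items():
        print(f"{name} = {value}")
```

Parsing the whole string at once keeps the code in sync with what is pasted in the thread. Setting the fields one by one (for example `solver.parameters.num_workers = 10`) should be equivalent.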
The minimal example above does not set them explicitly, so I assume it's determined by the number of cores on the system? In my case, that's 8.
Can you elaborate? Should one either give no hints at all or hints to all variables?
The configuration fix_variables_to_their_hinted_value:true does not seem like an option in my case, because the hints really are only hints: I set them based on the expectation that, for the majority of variables of a certain kind, the hinted property will hold, but there will be exceptions. In more detail: SLOTHY can interleave neighbouring loop iterations, and there are booleans indicating whether an instruction is pulled forward into the previous iteration (e.g. an early load) or deferred to the next iteration (e.g. a late store). Most instructions will stay in their original iteration, and hence the tool hints at that, but without early/late instructions altogether the tool would be much less powerful.
Can you comment further on the configuration options you have set to force consistency? It does look like search either succeeds quickly, or some search strategy leads the solver astray entirely (where the usual ~8s solution is missed, solutions are often not found even with a significantly larger budget). |
The closer to completeness the hint is, the less effort is needed in search.
We do process complete feasible hints differently.
Laurent Perron | Operations Research | ***@***.*** | (33) 1 42 68 53 00
(In reply to Hanno Becker's message of Mon, 1 Apr 2024, 17:51.)
|
There are usually simple solutions which follow the (incomplete) hints, but they won't minimize the given objective (stall minimization, in SLOTHY's case) -- are those still useful hints in your experience, or should they be removed? |
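The hinting scheme discussed here (most instructions stay in their own iteration, so the early/late booleans are hinted to false, while other variables carry no hint) can be sketched as follows. All names (`stay_in_iteration_hints`, `hint_coverage`, `build_hinted_model`, the `_early`/`_late`/`_pos` suffixes) are hypothetical and are not SLOTHY's actual model. The toy model only illustrates what an incomplete hint looks like in CP-SAT's Python API.

```python
from typing import Dict, Iterable


def stay_in_iteration_hints(instructions: Iterable[str]) -> Dict[str, int]:
    """Hint early=0 and late=0 for every instruction (hypothetical naming)."""
    hints = {}
    for ins in instructions:
        hints[f"{ins}_early"] = 0
        hints[f"{ins}_late"] = 0
    return hints


def hint_coverage(all_vars: Iterable[str], hints: Dict[str, int]) -> float:
    """Fraction of model variables that carry a hint (1.0 = complete hint)."""
    names = list(all_vars)
    return sum(name in hints for name in names) / len(names)


def build_hinted_model(instructions, horizon: int = 10):
    """Toy CP-SAT model with hinted booleans and unhinted positions.

    Needs ortools installed; this is not the model SLOTHY generates.
    """
    from ortools.sat.python import cp_model

    model = cp_model.CpModel()
    hints = stay_in_iteration_hints(instructions)
    bools = {name: model.NewBoolVar(name) for name in hints}
    # Position variables are left unhinted, so the overall hint is incomplete.
    positions = {i: model.NewIntVar(0, horizon, f"{i}_pos") for i in instructions}
    for ins in instructions:
        # An instruction cannot be both early and late.
        model.AddBoolOr([bools[f"{ins}_early"].Not(), bools[f"{ins}_late"].Not()])
    for name, value in hints.items():
        model.AddHint(bools[name], value)  # a hint, not a constraint
    return model, bools, positions


if __name__ == "__main__":
    ins = ["ldr", "mul", "str"]
    hints = stay_in_iteration_hints(ins)
    all_vars = list(hints) + [f"{i}_pos" for i in ins]
    print(f"hint coverage: {hint_coverage(all_vars, hints):.2f}")
```

A coverage below 1.0 corresponds to what is called an incomplete hint in this thread. Whether such hints help or hurt a given model is exactly the open question here.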
can you try num_workers:24 ? |
or just num_workers:1,search_branching:FIXED_SEARCH |
What is the takeaway here? When should I consider setting I will start a SLOTHY CI run using |
Following suggestions in google/or-tools#4166
@lperron Unfortunately, unconditionally setting How would you suggest to proceed here? Do you have a sense of what v9.7->v9.8 change might have triggered this performance change? Has some new search strategy been added in v9.8 that might lead the solver astray in the models produced by SLOTHY? |
OK. I have no quick solution. If you could send me a collection of models, I can integrate those into our benchmark suite. |
@lperron I will prepare a set of models representative of SLOTHY workloads and share them in the coming days. |
@lperron @Mizux I have exported some of the models exercised in the SLOTHY CI here: https://github.com/slothy-optimizer/slothy/tree/ci_models/paper/scripts/models Performance numbers as observed on my local Apple M1 are in https://github.com/slothy-optimizer/slothy/blob/ci_models/paper/scripts/models/results.txt. Some of them are solved/refuted very quickly, so one should probably hand-select a few that can be solved in seconds-minutes. Please let me know if this is useful to you, or what kind of models/format you would prefer otherwise. |
can you try with these parameters ?
|
I ran all the models with 15 runs per model with different settings 16 workers, 20s:
12 workers, 20s,
|
@lperron Thank you for investigating! What do your measurements tell you? |
The second set of parameters is stable, and solves the whole set of problems reliably. |
@lperron Thank you very much for investigating. Are you going to make changes in CP-SAT to make the behaviour the default, or what are next steps? |
No.
Try setting these parameters in your code, and tell me how it performs.
(In reply to Hanno Becker's message of Tue, 23 Apr 2024, 05:15.)
|
@lperron I'll run the CI on the proposed parameters and get back to you. |
@lperron @Mizux Gentle ping. |
No progress on our side. This is on my radar.
(In reply to Hanno Becker's message of Tue, 2 Jul 2024, 07:43.)
|
I re-ran everything this morning with the parameters num_workers:16,max_time_in_seconds:15,cp_model_probing_level:0,linearization_level:0 and ran each example 5 times. Everything is solved to completion (optimal or infeasible), except slothy_ci_fft_1712650422585, which is only feasible in all 5 runs, and ntt_dilithium_123_45678_a55_1712651065788, which is unknown in 2 runs out of 5. |
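The measurement protocol described above (several runs per model, then counting how often each status occurs) could be scripted roughly as below. This is a sketch, not the benchmark harness used by the OR-Tools team. `tally_runs`, `is_reliable` and `cp_sat_solve_once` are hypothetical names, and the statuses in the demo are made-up stand-ins, not measured results. As before, it assumes `solver.parameters` is a protobuf message.

```python
from collections import Counter
from typing import Callable

# Parameters quoted in the comment above (the time limit is part of the string).
PARAMS = (
    "num_workers:16,max_time_in_seconds:15,"
    "cp_model_probing_level:0,linearization_level:0"
)

# A run is conclusive when the solver proves optimality or infeasibility.
CONCLUSIVE = {"OPTIMAL", "INFEASIBLE"}


def tally_runs(solve_once: Callable[[], str], runs: int = 5) -> Counter:
    """Call `solve_once` `runs` times and count the returned status names."""
    return Counter(solve_once() for _ in range(runs))


def is_reliable(tally: Counter) -> bool:
    """True if every observed status is conclusive."""
    return all(status in CONCLUSIVE for status in tally)


def cp_sat_solve_once(model, spec: str = PARAMS) -> str:
    """Solve a cp_model.CpModel once and return the status name (needs ortools)."""
    from google.protobuf import text_format
    from ortools.sat.python import cp_model

    solver = cp_model.CpSolver()
    text_format.Parse(spec, solver.parameters)
    return solver.StatusName(solver.Solve(model))


if __name__ == "__main__":
    # Stub standing in for a real solve; these statuses are made up.
    fake = iter(["OPTIMAL", "UNKNOWN", "OPTIMAL", "OPTIMAL", "UNKNOWN"])
    tally = tally_runs(lambda: next(fake))
    print(dict(tally), "reliable:", is_reliable(tally))
```

With a real model, `tally_runs(lambda: cp_sat_solve_once(model))` would produce the per-model counts. A model that ends up FEASIBLE or UNKNOWN in some runs is then flagged as not reliable.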
@lperron With those parameters (excluding timeout, which SLOTHY sets itself), the CI passes for the first time in slothy-optimizer/slothy#57, though still at a large performance penalty compared to v9.7: For example, the examples_ntt_kyber_dilithium_neon_core job runs in 30min instead of 20min. I'll re-run to see if this is consistent. Is there a way to restore the v9.7 behaviour through specific parameters, or was the switch from v9.7 to v9.8 deeper? |
Good news.
We are on an optimizing spree. Hopefully, it will show in your benchmarks.
(In reply to Hanno Becker's message of Mon, 8 Jul 2024, 16:47.)
|
@lperron Thanks! Should I read this as "Wait for 9.11, it may solve your issues"? |
it could solve your issues, ..., or not :-)
(In reply to Hanno Becker's message of Mon, 8 Jul 2024, 17:46.)
|
@lperron Ok, let's wait and see :-) |
Hi,
Can you try again ?
I just ran our benchmarks on main (16 threads, 15s). I have one example
that times out (ntt_dilithium_123_45678_a55_1712651065788) and one that
only finds a feasible solution (slothy_ci_fft_1712650422585).
Thanks
(In reply to Hanno Becker's message of Mon, 8 Jul 2024, 17:53.)
|
What version of OR-Tools and what language are you using?
Version: v9.7, v9.8, v9.9, v9.10, v9.11
Language: Python
Which solver are you using (e.g. CP-SAT, Routing Solver, GLOP, BOP, Gurobi)
CP-SAT
What operating system (Linux, Windows, ...) and version?
Apple M1 Pro, MacOS Sonoma Version 14.3.1
What did you do?
Updated OR-Tools from v9.7 to v9.8 and v9.9 when used as the backend for the SLOTHY assembly superoptimizer.
What did you expect to see
CP-SAT performance that is similar or better in terms of runtime and consistency.
What did you see instead?
Significant inconsistency in the runtime of CP-SAT.
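One way to quantify this kind of runtime inconsistency is to time the same solve several times and compare minimum, median and maximum. The helper below is a hedged sketch with a hypothetical name (`runtime_spread`). It is independent of OR-Tools and takes any zero-argument callable, for example a lambda that calls `solver.Solve(model)`.

```python
import statistics
import time
from typing import Callable, Dict


def runtime_spread(solve: Callable[[], object], runs: int = 5) -> Dict[str, float]:
    """Time `solve` `runs` times; return min, median and max wall time in seconds."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        solve()
        times.append(time.perf_counter() - start)
    return {
        "min": min(times),
        "median": statistics.median(times),
        "max": max(times),
    }


if __name__ == "__main__":
    # Placeholder workload; replace with a call into the solver.
    spread = runtime_spread(lambda: sum(range(100_000)))
    print({name: f"{seconds:.4f}s" for name, seconds in spread.items()})
```

A large gap between median and maximum points to runs where the search goes astray, which is the behaviour reported here.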
Steps to reproduce:
run_model.py:
Here are the outputs on my local machine (see above):
Anything else we should know about your project / environment
logs/ntt_dilithium_45678_a55_model.txt via python3 example.py --examples ntt_dilithium_45678_a55 --log-model, based off the SLOTHY main branch.
If you need any more information, please let me know.