⚡ Collection of optimizations and a bug fix for charge_distribution_surface
#325
…correct behaviour
…de `const noexcept`
…ject now updates the storage selectively
…variant) since the "leading zeroes" check can be omitted when the charge index was incremented.
…arge distribution can be declared physically invalid
…e the energy of the current charge distribution if it is physically valid
… empty charge distribution now counts as valid. I think this is acceptable
…er to make QuickSim compatible with the "update energy only if physically valid" optimisation
Looks like I still missed some tests. Probably just some silliness.
@wlambooy Thank you for your efforts! As soon as it passes the tests, I will take a closer look at it. Please note that …
I missed some consequences of making the empty layout physically valid; hopefully everything passes now.
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #325      +/-   ##
==========================================
- Coverage   96.04%   96.04%   -0.01%
==========================================
  Files         102      102
  Lines       10075    10059      -16
==========================================
- Hits         9677     9661      -16
  Misses        398      398
I see what you mean, but this is not the best style. Using the keyword …
My assumption that the C++ compiler catches this turned out to be incorrect, as our example in which a dereferenced pointer to an object is changed shows. I made sure to verify the logical constness in a new commit.
While trying to improve the code coverage, I stumbled on a layout that generates no charge distributions for ExGS, but two for QuickExact. The test looks as follows:
TEMPLATE_TEST_CASE(
    "ExGS simulation of two SiDBs placed directly next to each other with non-realistic relative permittivity", "[exhaustive-ground-state-simulation]",
    (cell_level_layout<sidb_technology, clocked_layout<cartesian_layout<siqad::coord_t>>>),
    (charge_distribution_surface<cell_level_layout<sidb_technology, clocked_layout<cartesian_layout<siqad::coord_t>>>>))
{
    TestType lyt{};
    lyt.assign_cell_type({1, 3, 0}, TestType::cell_type::NORMAL);
    lyt.assign_cell_type({2, 3, 0}, TestType::cell_type::NORMAL);
    const quickexact_params<TestType> params{sidb_simulation_parameters{2, -0.32, 1.0e-3}};
    const auto simulation_results = quickexact<TestType>(lyt, params);
    CHECK(simulation_results.charge_distributions.empty());
}
A degenerate charge distribution is returned by QuickExact with a system energy of -3472.6959461051524. Perhaps QuickExact is not defined for such non-realistic relative permittivity values?
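The remark about logical constness can be illustrated with a minimal sketch. It assumes a type that keeps its state behind a pointer; `storage`, `surface`, `bump_index`, and `index_after_const_call` are hypothetical names chosen for illustration and are not taken from the PR. The point is only that the compiler accepts a `const` member function that modifies the pointed-to object, so logical constness has to be verified by hand.

```cpp
#include <cassert>
#include <memory>

// Hypothetical storage object that lives behind a pointer.
struct storage
{
    int charge_index = 0;
};

// Hypothetical layout-like type holding its state via a shared pointer.
class surface
{
  public:
    surface() : strg{std::make_shared<storage>()} {}

    // Compiles although it is declared `const`: constness applies to the
    // pointer member itself, not to the object it points to.
    void bump_index() const
    {
        ++strg->charge_index;
    }

    int index() const
    {
        return strg->charge_index;
    }

  private:
    std::shared_ptr<storage> strg;
};

// Calls a `const` member on a `const` object and reports the observed state.
inline int index_after_const_call()
{
    const surface s{};
    s.bump_index();  // the observable state changes through a const object
    return s.index();
}
```

Here `index_after_const_call()` yields 1 even though `s` is `const`, which is why a `const` specifier on such a function is only honest if the function is also logically const.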
Thank you for your efforts! 🙏
const quickexact_params<TestType> params{sidb_simulation_parameters{2, -0.32, 1.0e-3},
                                         quickexact_params<TestType>::automatic_base_number_detection::OFF};
…ysically valid" This reverts commit 356053f.
@wlambooy Thank you very much for your effort! Just click on …
LGTM! Thank you so much for your efforts! 🙏
st.additional_simulation_parameters.emplace_back("iteration_steps", ps.interation_steps);
st.additional_simulation_parameters.emplace_back("alpha", ps.alpha);
st.physical_parameters = ps.phys_params;

if (lyt.num_cells() == 0)
{
    return st;
}

st.charge_distributions.reserve(ps.interation_steps);
I know you didn't name this variable `st`, but now that changes have been made, it seems inconsistent with the other two ground state algorithms, where this variable is called `result`, a more fitting name in my opinion. May I ask you to adjust the naming? Many thanks!
@@ -93,7 +99,7 @@ sidb_simulation_result<Lyt> quicksim(const Lyt& lyt, const quicksim_params& ps =
 charge_lyt.assign_base_number(2);
 charge_lyt.assign_all_charge_states(sidb_charge_state::NEGATIVE);
 charge_lyt.update_after_charge_change(dependent_cell_mode::VARIABLE);
-const auto negative_sidb_indices = charge_lyt.negative_sidb_detection();
+const auto& negative_sidb_indices = charge_lyt.negative_sidb_detection();
Why the reference? Due to C++'s return value optimization, it should be equivalent performance-wise to have `const auto` instead of `const auto&`. However, `const auto` is safer because it doesn't fall prey to the object going out of scope and disappearing. I prefer using `const auto` unless there are good reasons against it.
@@ -105,14 +105,14 @@ void time_to_solution(Lyt& lyt, const quicksim_params& quicksim_params, const ti
sidb_simulation_result<Lyt> simulation_result{};
if (tts_params.engine == exhaustive_sidb_simulation_engine::QUICKEXACT)
{
    const quickexact_params<Lyt> params{quicksim_params.phys_params};
    st.algorithm = "QuickExact";
Should `st` also be `result` here?
const quickexact_params<Lyt> params{quicksim_params.phys_params};
st.algorithm = "QuickExact";
simulation_result = quickexact(lyt, params);
st.algorithm = "Exhaustive Ground State Simulation";
I believe we use `ExGS` as the algorithm name here. @Drewniok can you confirm this?
if (old_params.lat_a != params.lat_a || old_params.lat_b != params.lat_b || old_params.lat_c != params.lat_c)
{  // lattice changed
Either move the comment directly above the `if` statement (if it concerns the condition) or within the block (if it concerns the statements inside the condition).
}
else if (old_params.epsilon_r != params.epsilon_r || old_params.lambda_tf != params.lambda_tf)
{  // potential changed
Same here
@wlambooy Thanks again for your efforts! @marcelwa just asked me to check again whether your changes increase the simulation runtimes. When running our recently added benchmark script (fiction -> test -> benchmark -> simulation.cpp), a careful analysis of the results shows that the runtimes increase significantly (~10%), which is unacceptable. As far as I can see, this is not caused by the bugfix, but rather by the other changes.
"Unacceptable" is perhaps a bit harsh. We aim for the most performant code possible for any given scenario and we indeed take pride in our achievements. @wlambooy if the benchmark code @Drewniok mentioned helps you figure out where exactly the runtime increase comes from, we'd appreciate a fix. Many thanks!
Thank you for doing the analysis @Drewniok! I will try to uncover which changes cause this performance decrease.
Also, @Drewniok, can I ask what your benchmark analysis process looks like so that I can replicate it? I see the simulation benchmark file contains a comment with …
@wlambooy See https://github.com/catchorg/Catch2/blob/devel/docs/benchmarks.md for a definition of samples, iterations, etc. In our case, the crossing gate will be simulated 100 times due to 100 samples and only 1 iteration each.
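To make the relation between samples and iterations concrete, here is a minimal sketch using only `std::chrono`. It is not Catch2's implementation or API; `measure` and `runs_for` are hypothetical helpers. Each sample is one timed data point obtained by running the workload `iterations` times, so the workload executes `samples * iterations` times in total.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Catch2): takes `samples` measurements, each
// timing `iterations` back-to-back calls of `fn`, and returns the mean time
// per call in nanoseconds for every sample.
template <typename Fn>
std::vector<double> measure(const std::size_t samples, const std::size_t iterations, Fn&& fn)
{
    std::vector<double> mean_ns_per_sample{};
    mean_ns_per_sample.reserve(samples);

    for (std::size_t s = 0; s < samples; ++s)
    {
        const auto start = std::chrono::steady_clock::now();
        for (std::size_t i = 0; i < iterations; ++i)
        {
            fn();
        }
        const auto stop = std::chrono::steady_clock::now();

        const std::chrono::duration<double, std::nano> elapsed = stop - start;
        mean_ns_per_sample.push_back(elapsed.count() / static_cast<double>(iterations));
    }

    return mean_ns_per_sample;
}

// Counts how often the workload is executed for a given configuration.
inline std::size_t runs_for(const std::size_t samples, const std::size_t iterations)
{
    std::size_t calls = 0;
    const auto  timings = measure(samples, iterations, [&calls] { ++calls; });
    assert(timings.size() == samples);  // one data point per sample
    return calls;
}
```

With 100 samples and 1 iteration each, `runs_for(100, 1)` reports 100 executions, matching the "simulated 100 times" statement above.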
I think @wlambooy's question was where the output comes from. It will be automatically generated by the benchmark script. Simply pass …
I see, I was running the benchmarks from within CLion, so I was getting some XML output that didn't look like the output in the comment. I have to say that, on my system at least, I'm always getting quite a spread in the benchmark results, and usually the more often I run it without rebuilding, the better the times get. You can see this in the benchmark results below, which were performed directly in sequence (look at the …
It would be best to run the benchmarks from the command line with all other tasks and programs closed. In particular, your web browser and CLion run expensive background tasks that can impact the measurements.
After doing a couple of benchmark runs with pre-built binaries in an environment with less competing processor usage, I am able to confirm a performance difference with the following findings: QuickExact is slower (+5.4% runtime on average) while QuickSim is faster (-7.1% runtime on average). All benchmark runs were consistent in the differences: each QuickExact benchmark run of the main branch was faster than each QuickExact benchmark run of this branch, and each QuickSim benchmark run of the main branch was slower than each QuickSim benchmark run of this branch. I'll try to figure out why this is.
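The two statements in this comparison, an average relative runtime change and the observation that every run of one branch beat every run of the other, can be computed as sketched below. The helper names (`mean`, `relative_change_percent`, `strictly_faster`) are hypothetical, and the sketch assumes non-empty sets of runtimes in a common unit; it does not reproduce the actual measurements.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty set of runtimes.
inline double mean(const std::vector<double>& xs)
{
    return std::accumulate(xs.cbegin(), xs.cend(), 0.0) / static_cast<double>(xs.size());
}

// Relative change of the mean runtime in percent;
// positive values mean the candidate is slower than the baseline.
inline double relative_change_percent(const std::vector<double>& baseline, const std::vector<double>& candidate)
{
    return (mean(candidate) - mean(baseline)) / mean(baseline) * 100.0;
}

// True if every run in `faster` is quicker than every run in `slower`,
// i.e. the slowest run of one set still beats the fastest run of the other.
inline bool strictly_faster(const std::vector<double>& faster, const std::vector<double>& slower)
{
    return *std::max_element(faster.cbegin(), faster.cend()) < *std::min_element(slower.cbegin(), slower.cend());
}
```

For made-up runtimes of 100, 102, 98 on one branch and 105, 106, 104 on the other, the relative change is +5% and `strictly_faster` holds for the first set, which is the kind of consistency reported above.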
@wlambooy Thanks again for your efforts! I know, sometimes it is not so easy to find out why the code is slower. Take your time and don't hesitate to contact us.
I agree that it would be very helpful to have the bugfix available as quickly as possible.
I was working on more pressing things last week, so I haven't yet started investigating the code speed. I put the bugfixes in a new PR: #347
…#325 (#347)
* 🐛 reducing the charge index to a value with leading zeroes now gives correct behaviour (cherry picked from commit e1fe34a)
* 🐛 `bounding_box` does not update the dimension sizes for cubic coordinates (cherry picked from commit fc01d74)
* ⚡ optimized increase_charge_index_by_one (with and enum this time)
* 🎨 remove `const` specifier from functions that are not logically const
* ✅ add `const` specifier from functions that are logically const when considering a change of charge index as such
* ✅ add `const` specifier from functions that are logically const when considering a change of charge index as such
* ✅ remove `const` specifier from functions that are not logically const when considering a change of charge index as such
* 📝 implemented Jan's suggestions
* ✅ remove `const` specifier
* 🐛 The TTS algorithm now sets the simulator name correctly for ExGS (to `ExGS`)
---------
Co-authored-by: Marcel Walter <[email protected]>
This ran out of sync and doesn't seem to be necessary anymore according to @Drewniok.
Description
This PR introduces a collection of optimisations, mainly with regard to `charge_distribution_surface`. The commit messages describe the changes that are made.
Most prominently, we have:
* The `validity_check` function now returns as soon as a verdict can be made.
* … `const` apparently. This may be applicable in other files as well. Currently I applied the change in `charge_distribution_surface.hpp` and `cell_level_layout.hpp`.
* … `quicksim.hpp`. Also, the `update_after_charge_change()` call before calling the threads can be omitted, since in each thread we first make such a call before checking physical validity. Hence we also have a proper initialisation of the local potentials and system energy before calling `adjacent_search`.
Checklist:
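Regarding the first item of the description, the idea of returning as soon as a verdict can be made can be sketched generically. The criterion below (every value must not exceed a threshold) is a hypothetical stand-in and not the actual physical validity condition; `verdict`, `check_all`, and `check_early_exit` are illustrative names. Both variants reach the same verdict, but the early-exit variant stops at the first violation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result of a check together with the number of entries that were inspected.
struct verdict
{
    bool        valid;
    std::size_t inspected;
};

// Hypothetical criterion: every value must not exceed `threshold`.
// This variant always walks the whole range.
inline verdict check_all(const std::vector<double>& values, const double threshold)
{
    bool        valid     = true;
    std::size_t inspected = 0;
    for (const auto v : values)
    {
        ++inspected;
        if (v > threshold)
        {
            valid = false;
        }
    }
    return {valid, inspected};
}

// Same criterion, but returns as soon as a violation settles the verdict.
inline verdict check_early_exit(const std::vector<double>& values, const double threshold)
{
    std::size_t inspected = 0;
    for (const auto v : values)
    {
        ++inspected;
        if (v > threshold)
        {
            return {false, inspected};
        }
    }
    return {true, inspected};
}
```

For an input whose second entry violates the criterion, `check_all` inspects every entry while `check_early_exit` inspects two; for valid inputs both inspect everything, so the saving depends on how often invalid configurations occur.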