Is your feature request related to a problem? Please describe.
Based on a Discourse discussion here: https://itensor.discourse.group/t/evaluating-overlaps-of-mpss-in-parallel/451/
it seems that the tensor contraction backend, in this case called through the inner function, can generate a lot of "garbage", that is, perform a large number of allocations. In the user's case, this resulted either in a measurable slowdown of multithreaded performance or, with GC disabled (GC.enable(false)), in a spike in memory usage followed by a delay after GC was re-enabled.
It should be noted that the calculation performed by the user was itself rather demanding: something like a thousand inner products of MPS of length N=100, all evaluated at the same time. The overall speed of this was actually quite good, and the only issue here is how effectively the calculation can be parallelized by multithreading.
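For reference, a minimal sketch of the kind of workload involved, assuming ITensors.jl's `siteinds`, `randomMPS`, and `inner` (the `parallel_overlaps` driver and all sizes here are hypothetical, just to mirror the setup described in the forum thread):

```julia
using ITensors

# Hypothetical driver mirroring the user's workload: many MPS overlaps
# evaluated concurrently on Julia threads. Each `inner` call contracts
# two MPS, and the allocations made inside that contraction are what
# appear to limit multithreaded scaling.
function parallel_overlaps(psis::Vector{MPS}, phis::Vector{MPS})
  results = zeros(ComplexF64, length(psis))
  Threads.@threads for i in eachindex(psis)
    results[i] = inner(psis[i], phis[i])
  end
  return results
end

# Roughly the scale from the discussion: ~1000 overlaps of N = 100 MPS.
N = 100
s = siteinds("S=1/2", N)
psis = [randomMPS(s; linkdims=10) for _ in 1:1000]
phis = [randomMPS(s; linkdims=10) for _ in 1:1000]
@time parallel_overlaps(psis, phis)
```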
Describe the solution you'd like
This is more of a "placeholder" issue to remind us to investigate allocations in the contraction engine. (Unless the allocation is in the inner function itself, though I doubt that given the simplicity of that function.)
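One quick way to start that investigation, sketched below with only standard Julia tooling and hypothetical sizes: warm `inner` up once so compilation doesn't skew the count, then measure the per-call allocations with `@allocated`. A large number here would point at the contraction path rather than at the threading setup.

```julia
using ITensors

N = 100
s = siteinds("S=1/2", N)
psi = randomMPS(s; linkdims=10)
phi = randomMPS(s; linkdims=10)

inner(psi, phi)  # warm-up call to exclude compilation allocations
bytes = @allocated inner(psi, phi)
println("inner allocated ≈ $(bytes / 1e6) MB per call")
```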
Describe alternatives you've considered
Considered disabling GC or other Julia-language workarounds outside of ITensor, but my current best guess is that there are simply a lot of allocations happening at the contraction level.
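For completeness, the workaround pattern the user effectively tried, sketched here (reusing the hypothetical `parallel_overlaps` driver above); it trades GC contention for a memory spike and one long pause:

```julia
# Disable GC around the threaded hot region, then collect everything
# at once afterwards. Memory grows unchecked while GC is off.
GC.enable(false)
try
    results = parallel_overlaps(psis, phis)
finally
    GC.enable(true)
    GC.gc()  # one (possibly long) collection of the accumulated garbage
end
```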
Additional context
Forum discussion:
https://itensor.discourse.group/t/evaluating-overlaps-of-mpss-in-parallel/451