
Optimize Wigner Function and Improve make_grid with @njit for Better Performance and Memory Efficiency #504

Status: Open · wants to merge 11 commits into base: develop

Conversation

@JakeKitchen JakeKitchen commented Oct 15, 2024

User description

Optimized Wigner Function Iterative Computation

Context:

The _wigner_discretized_iterative function calculates the Wigner function iteratively,
which is essential for simulating quantum states in continuous variable quantum mechanics.
The original implementation had redundant operations, recalculations, and excessive memory
allocations, leading to performance bottlenecks. The make_grid function was also optimized
with @njit to further enhance performance. This update improves both mathematical efficiency
and memory management.

Description of the Change:

  1. Precompute Exponential Terms: The exponential grid is computed only once and reused to
    avoid redundant calculations in each iteration.
  2. Precompute Square Roots: All square roots required for the computation are calculated at
    the beginning and stored in an array. This reduces overhead inside nested loops.
  3. Efficient Matrix Swapping: Instead of creating new matrices in every loop iteration, the
    function reuses memory by swapping matrices, which reduces memory allocations.
  4. Optimized make_grid Function:
    • The make_grid function was updated to use @njit to accelerate its performance.
    • Memory allocations for the Q and P coordinate matrices are optimized to reduce overhead.
    • A precomputed sqrt_factor is used to avoid redundant calculations inside nested loops.
    • By leveraging @njit, the function is now compatible with Numba’s JIT compilation, allowing
      faster and more efficient execution.
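The optimizations described above can be sketched as follows. This is a minimal illustration of the pattern, not the library's exact code; `make_grid_sketch` is a hypothetical name, and the `njit` decorator falls back to a no-op so the sketch runs even without Numba installed:

```python
import numpy as np

try:
    from numba import njit  # JIT-compile when Numba is available
except ImportError:  # fall back to plain Python so the sketch stays runnable
    def njit(func):
        return func

@njit
def make_grid_sketch(q_vec, p_vec, hbar):
    """Build Q/P coordinate matrices and the complex grid (Q + iP)/sqrt(2*hbar)."""
    n_q, n_p = q_vec.size, p_vec.size
    Q = np.empty((n_q, n_p), dtype=np.float64)
    P = np.empty((n_q, n_p), dtype=np.float64)
    sqrt_factor = 1.0 / np.sqrt(2.0 * hbar)  # precomputed once, reused below
    for i in range(n_q):
        for j in range(n_p):
            Q[i, j] = q_vec[i]
            P[i, j] = p_vec[j]
    grid = (Q + 1j * P) * sqrt_factor
    return Q, P, grid
```

The precomputed `sqrt_factor` mirrors the description above: the scalar is hoisted out of the nested loops so it is computed once per call.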

Benefits:

  1. Improved Mathematical Efficiency:

    • The function minimizes the number of floating-point operations inside loops by precomputing
      values such as exponential terms and square roots.
    • The iterative Wigner calculation avoids recalculating these values multiple times, making
      the function more computationally efficient.
    • Memory-intensive operations are reduced by swapping matrices rather than allocating new ones,
      ensuring cache-friendlier execution.
  2. Better Performance:

    • Memory Reuse: By reusing the same memory blocks with matrix swapping, the function avoids
      costly memory allocations, improving speed.
    • Optimized make_grid: Using @njit in make_grid improves both speed and memory efficiency,
      enabling seamless integration with the rest of the optimized functions.
  3. Numba Compliance:

    • The optimized function ensures type safety and alignment with Numba’s @njit constraints,
      avoiding common type-related issues.
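The matrix-swapping technique described in the benefits above can be illustrated with a toy recurrence. `recurrence_with_swap` is a hypothetical helper, not the library's `_wigner_discretized_iterative`; it only demonstrates reusing two preallocated buffers by swapping references instead of allocating a fresh matrix per iteration:

```python
import numpy as np

def recurrence_with_swap(coeffs, grid):
    """Toy two-buffer recurrence (illustrative only).

    Demonstrates reusing preallocated buffers by swapping references,
    the same memory-reuse strategy applied to the two rows of Wmat.
    """
    prev = np.ones_like(grid)       # holds row n-1 of the recurrence
    curr = np.empty_like(grid)      # scratch buffer for row n
    acc = coeffs[0] * prev          # accumulated weighted sum
    for n in range(1, len(coeffs)):
        np.multiply(grid, prev, out=curr)  # next row, written in place
        curr /= np.sqrt(n)
        acc += coeffs[n] * curr
        prev, curr = curr, prev     # swap buffers: no new allocation
    return acc
```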

Possible Drawbacks:

  • Increased Memory Usage at Startup: Precomputing the square roots and exponential grid slightly
    increases memory usage at the beginning, but this is offset by performance gains during execution.

Related GitHub Issues:

NA


PR Type

enhancement, bug_fix


Description

  • Optimized the Wire class in tensors.py by using field for attribute initialization and improved dictionary comprehensions for better performance.
  • Simplified the shape calculation process by removing unnecessary sorting functions and directly combining shapes.
  • Enhanced the make_grid function in wigner.py by preallocating arrays and optimizing the computation of coordinate matrices.
  • Improved the _wigner_discretized_iterative function by precomputing exponential terms and square roots, reducing redundant calculations, and reusing matrices to enhance memory efficiency.

Changes walkthrough 📝

Relevant files: Enhancement

tensors.py (mrmustard/math/tensor_networks/tensors.py): Optimize tensor network wire and shape handling
  • Improved initialization of Wire class attributes using field.
  • Optimized wire dictionary initialization with dictionary comprehensions.
  • Simplified shape calculation by removing an unnecessary helper function.
  • Converted keys to lists in AdjointView and DualView initializations.
  • +31/-44

wigner.py (mrmustard/physics/wigner.py): Enhance Wigner function computation efficiency
  • Optimized the make_grid function with preallocated arrays.
  • Precomputed exponential terms and square roots in _wigner_discretized_iterative.
  • Improved memory efficiency by reusing matrices.
  • +34/-20


    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🏅 Score: 92
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Performance Optimization
    Verify that the optimizations in the _wigner_discretized_iterative function, including precomputation of exponential terms and square roots, and matrix reuse, actually improve performance without introducing numerical instability.

    Code Refactoring
    Review the changes in the shape method of the Tensor class to ensure that the new implementation correctly handles all possible input scenarios and maintains the expected behavior.

    codiumai-pr-agent-pro bot commented Oct 15, 2024


    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category: Performance
    Use NumPy's vectorized operations to improve performance in grid creation

    Consider using NumPy's vectorized operations instead of nested loops in the
    make_grid function to improve performance. This can be achieved by using np.meshgrid
    to create the coordinate matrices.

    mrmustard/physics/wigner.py [29-48]

     @njit
     def make_grid(q_vec: np.ndarray, p_vec: np.ndarray, hbar: float):
         r"""Returns two coordinate matrices `Q` and `P` from coordinate vectors
         `q_vec` and `p_vec`, along with the grid over which Wigner functions can be
         discretized.
         """
    -    n_q = q_vec.size
    -    n_p = p_vec.size
    -    Q = np.empty((n_q, n_p), dtype=np.float64)
    -    P = np.empty((n_q, n_p), dtype=np.float64)
    +    Q, P = np.meshgrid(q_vec, p_vec, indexing='ij')
         sqrt_factor = 1.0 / np.sqrt(2.0 * hbar)
    -
    -    for i in range(n_q):
    -        q = q_vec[i]
    -        for j in range(n_p):
    -            p = p_vec[j]
    -            Q[i, j] = q
    -            P[i, j] = p
    -
         grid = (Q + 1j * P) * sqrt_factor
         return Q, P, grid
    Suggestion importance[1-10]: 9

    Why: The suggestion to use NumPy's meshgrid function instead of nested loops significantly improves performance by leveraging vectorized operations, which is a well-known optimization technique in numerical computing.
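As a quick sanity check with illustrative values, `np.meshgrid(..., indexing='ij')` builds the same matrices as the nested loops. One caveat worth verifying before applying the suggestion: `np.meshgrid` may not be supported inside Numba's nopython mode, so the `@njit` decorator on `make_grid` might need to be dropped:

```python
import numpy as np

q_vec = np.array([0.0, 1.0, 2.0])
p_vec = np.array([-1.0, 1.0])
hbar = 2.0

# Vectorized construction: Q[i, j] = q_vec[i], P[i, j] = p_vec[j]
Q, P = np.meshgrid(q_vec, p_vec, indexing="ij")
grid = (Q + 1j * P) / np.sqrt(2.0 * hbar)

# Same matrices the nested loops would build, element by element
Q_loop = np.empty((q_vec.size, p_vec.size))
P_loop = np.empty_like(Q_loop)
for i in range(q_vec.size):
    for j in range(p_vec.size):
        Q_loop[i, j] = q_vec[i]
        P_loop[i, j] = p_vec[j]

assert np.array_equal(Q, Q_loop) and np.array_equal(P, P_loop)
```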

    Use in-place operations and preallocate arrays to reduce memory allocations and improve performance

    Consider using in-place operations and preallocating arrays to reduce memory
    allocations and improve performance in the _wigner_discretized_iterative function.

    mrmustard/physics/wigner.py [155-169]

     @njit
     def _wigner_discretized_iterative(rho, q_vec, p_vec, hbar):  # pragma: no cover
         """Optimized iterative computation of the Wigner function."""
         cutoff = len(rho)
         Q, P, grid = make_grid(q_vec, p_vec, hbar)
         Wmat = np.zeros((2, cutoff) + grid.shape, dtype=np.complex128)
     
         # Precompute the exponential term to avoid recalculating it.
    -    exp_grid = np.exp(-2.0 * np.abs(grid) ** 2) / np.pi
    +    np.exp(-2.0 * np.abs(grid) ** 2, out=Wmat[0, 0])
    +    Wmat[0, 0] /= np.pi
     
    -    # Initialize Wmat and W with the |0><0| component.
    -    Wmat[0, 0] = exp_grid
    -    W = rho[0, 0].real * Wmat[0, 0].real
    +    # Initialize W with the |0><0| component.
    +    W = np.real(rho[0, 0] * Wmat[0, 0])
     
         # Precompute square roots to avoid repetitive calculations.
    -    sqrt_n = np.array([np.sqrt(n) for n in range(cutoff)], dtype=np.float64)
    +    sqrt_n = np.sqrt(np.arange(cutoff, dtype=np.float64))
    Suggestion importance[1-10]: 9

    Why: The suggestion to use in-place operations and preallocate arrays is a strong optimization strategy that reduces memory allocations and enhances performance, particularly in computationally intensive functions like _wigner_discretized_iterative.
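Both replacements in this suggestion can be checked in isolation with small, illustrative arrays:

```python
import numpy as np

# Vectorized square-root table: identical to the list-comprehension version.
cutoff = 6
sqrt_n_loop = np.array([np.sqrt(n) for n in range(cutoff)], dtype=np.float64)
sqrt_n_vec = np.sqrt(np.arange(cutoff, dtype=np.float64))
assert np.allclose(sqrt_n_loop, sqrt_n_vec)

# In-place exponential: np.exp writes into a preallocated buffer via `out=`,
# avoiding one temporary array (buffer dtype chosen as float64 here).
grid = np.linspace(0.0, 1.0, 4) + 1j * np.linspace(0.0, 1.0, 4)
buf = np.empty(grid.shape, dtype=np.float64)
np.exp(-2.0 * np.abs(grid) ** 2, out=buf)
buf /= np.pi
assert np.allclose(buf, np.exp(-2.0 * np.abs(grid) ** 2) / np.pi)
```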

    Utilize NumPy's broadcasting to simplify computations and improve performance

    Consider using NumPy's broadcasting capabilities to simplify the computation of Wmat
    elements and improve performance in the _wigner_discretized_iterative function.

    mrmustard/physics/wigner.py [172-183]

    -for n in range(1, cutoff):
    -    Wmat[0, n] = (2.0 * grid * Wmat[0, n - 1]) / sqrt_n[n]
    -    W += 2 * np.real(rho[0, n] * Wmat[0, n])
    +# Compute Wmat[0] for all n at once
    +Wmat[0, 1:] = (2.0 * grid * Wmat[0, :-1].T / sqrt_n[1:]).T
    +W += 2 * np.real(np.sum(rho[0, 1:] * Wmat[0, 1:], axis=0))
     
     # Compute the remaining coefficients and accumulate the Wigner function.
     for m in range(1, cutoff):
    -    Wmat[1, m] = (2 * np.conj(grid) * Wmat[0, m] - sqrt_n[m] * Wmat[0, m - 1]) / sqrt_n[m]
    -    W += rho[m, m].real * Wmat[1, m].real
    +    Wmat[1, m:] = ((2 * np.conj(grid) * Wmat[0, m:].T - sqrt_n[m] * Wmat[0, m-1:].T) / sqrt_n[m:]).T
    +    W += rho[m, m].real * Wmat[1, m].real + 2 * np.real(np.sum(rho[m, m+1:] * Wmat[1, m+1:], axis=0))
     
    -    for n in range(m + 1, cutoff):
    -        Wmat[1, n] = (2 * grid * Wmat[1, n - 1] - sqrt_n[m] * Wmat[0, n - 1]) / sqrt_n[n]
    -        W += 2 * np.real(rho[m, n] * Wmat[1, n])
    -
    Suggestion importance[1-10]: 8

    Why: This suggestion effectively uses NumPy's broadcasting capabilities to simplify and optimize the computation of Wmat elements, which can lead to performance improvements in the _wigner_discretized_iterative function.

    ✅ Use tuples instead of lists for storing modes to improve performance and reduce memory usage
    Suggestion Impact: The commit changed the initialization of modes in the AdjointView class to use keys() directly, which returns a lightweight view object, instead of converting them to lists. This aligns with the suggestion to use more efficient data structures.

    code diff:

    +            modes_in_ket=self._original.input.bra.keys(),
    +            modes_out_ket=self._original.output.bra.keys(),
    +            modes_in_bra=self._original.input.ket.keys(),
    +            modes_out_bra=self._original.output.ket.keys(),
             )
     
         def value(self, shape: tuple[int]):
    @@ -397,7 +405,12 @@
                 ComplexTensor: the unitary matrix in Fock representation
             """
             # converting the given shape into a shape for the original tensor
    -        shape_in_ket, shape_out_ket, shape_in_bra, shape_out_bra = self._original.unpack_shape(shape)
    +        (
    +            shape_in_ket,
    +            shape_out_ket,
    +            shape_in_bra,
    +            shape_out_bra,
    +        ) = self._original.unpack_shape(shape)
             shape_ret = shape_in_bra + shape_out_bra + shape_in_ket + shape_out_ket
     
             ret = math.conj(math.astensor(self._original.value(shape_ret)))
    @@ -413,10 +426,10 @@
             self._original = tensor
             super().__init__(
                 name=self._original.name,
    -            modes_in_ket=list(self._original.output.ket.keys()),
    -            modes_out_ket=list(self._original.input.ket.keys()),
    -            modes_in_bra=list(self._original.output.bra.keys()),
    -            modes_out_bra=list(self._original.input.bra.keys()),
    +            modes_in_ket=self._original.output.ket.keys(),
    +            modes_out_ket=self._original.input.ket.keys(),
    +            modes_in_bra=self._original.output.bra.keys(),
    +            modes_out_bra=self._original.input.bra.keys(),

    Consider using a more efficient data structure, such as a tuple or a frozen set, for
    storing the modes in the Tensor class initialization to improve performance and
    reduce memory usage.

    mrmustard/math/tensor_networks/tensors.py [186-191]

     def __init__(
         self,
         name: str,
    -    modes_in_ket: list[int] | None = None,
    -    modes_out_ket: list[int] | None = None,
    -    modes_in_bra: list[int] | None = None,
    -    modes_out_bra: list[int] | None = None,
    +    modes_in_ket: tuple[int, ...] | None = None,
    +    modes_out_ket: tuple[int, ...] | None = None,
    +    modes_in_bra: tuple[int, ...] | None = None,
    +    modes_out_bra: tuple[int, ...] | None = None,
     ):
         self._name = name
         self._update_modes(modes_in_ket, modes_out_ket, modes_in_bra, modes_out_bra)
     
     def _update_modes(
         self,
    -    modes_in_ket: list[int] | None = None,
    -    modes_out_ket: list[int] | None = None,
    -    modes_in_bra: list[int] | None = None,
    -    modes_out_bra: list[int] | None = None,
    +    modes_in_ket: tuple[int, ...] | None = None,
    +    modes_out_ket: tuple[int, ...] | None = None,
    +    modes_in_bra: tuple[int, ...] | None = None,
    +    modes_out_bra: tuple[int, ...] | None = None,
     ):
     
    -    self._modes_in_ket = modes_in_ket if modes_in_ket else []
    -    self._modes_out_ket = modes_out_ket if modes_out_ket else []
    -    self._modes_in_bra = modes_in_bra if modes_in_bra else []
    -    self._modes_out_bra = modes_out_bra if modes_out_bra else []
    +    self._modes_in_ket = modes_in_ket if modes_in_ket else ()
    +    self._modes_out_ket = modes_out_ket if modes_out_ket else ()
    +    self._modes_in_bra = modes_in_bra if modes_in_bra else ()
    +    self._modes_out_bra = modes_out_bra if modes_out_bra else ()
    Suggestion importance[1-10]: 7

    Why: Changing from lists to tuples for storing modes can improve performance and reduce memory usage due to the immutability and fixed size of tuples, making this a reasonable suggestion for optimization. However, the impact might be less significant compared to other suggestions.
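A small illustration of why tuples are the cheaper container here (object sizes are CPython-specific, so the comparison is indicative rather than guaranteed; `update_modes` is a stand-in for the `_update_modes` defaulting pattern in the suggestion):

```python
import sys

# Hypothetical mode containers, mirroring the modes_in_ket-style arguments.
modes_list = [0, 1, 2, 3]
modes_tuple = (0, 1, 2, 3)

# Tuples carry no over-allocation slack, so they are at least as compact.
assert sys.getsizeof(modes_tuple) <= sys.getsizeof(modes_list)

# Being immutable, tuples are hashable (usable as dict keys); lists are not.
assert hash(modes_tuple) is not None

# The defaulting pattern from the suggestion: fall back to an empty tuple.
def update_modes(modes_in_ket=None):
    return modes_in_ket if modes_in_ket else ()
```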



    @apchytr added the no changelog label (Pull request does not require a CHANGELOG entry) on Oct 31, 2024

    codecov bot commented Oct 31, 2024

    Codecov Report

    Attention: Patch coverage is 7.14286% with 13 lines in your changes missing coverage. Please review.

    Project coverage is 89.61%. Comparing base (3a1d8bb) to head (e55355f).

    Files with missing lines Patch % Lines
    mrmustard/physics/wigner.py 7.14% 13 Missing ⚠️
    Additional details and impacted files
    @@             Coverage Diff             @@
    ##           develop     #504      +/-   ##
    ===========================================
    - Coverage    89.77%   89.61%   -0.16%     
    ===========================================
      Files           92       92              
      Lines         6072     6087      +15     
    ===========================================
    + Hits          5451     5455       +4     
    - Misses         621      632      +11     
    Files with missing lines Coverage Δ
    mrmustard/math/tensor_networks/tensors.py 94.11% <ø> (ø)
    mrmustard/physics/wigner.py 50.00% <7.14%> (-50.00%) ⬇️

    ... and 1 file with indirect coverage changes


    Continue to review full report in Codecov by Sentry.

    Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
    Powered by Codecov. Last update 3a1d8bb...e55355f.

    Labels: bug_fix · enhancement (New feature or request) · no changelog (Pull request does not require a CHANGELOG entry) · Review effort [1-5]: 3

    3 participants