v3.7.0 #387

Merged
merged 25 commits into from
Sep 22, 2023

@TysonRayJones TysonRayJones commented Sep 21, 2023

Overview

This release integrates a cuQuantum backend, optimises distributed communication, and improves the unit tests.

New features

  • QuEST gained a new backend which integrates cuQuantum and Thrust for optimised simulation on modern NVIDIA GPUs. This is compiled with cmake argument -DUSE_CUQUANTUM=1, as detailed in the compile doc. Unlike QuEST's other backends, this does require prior installation of cuQuantum, outlined here. This deployment mode should run much faster than QuEST's custom GPU backend, and will soon enable multi-GPU simulation. The entirety of QuEST's API is supported! 🎉

Other changes

  • QuEST's distributed communication has been optimised when exchanging states via many maximum-size messages, thanks to the work of Jakub Adamski as per this manuscript.
  • Functions like multiQubitUnitary() and mixMultiQubitKrausMap() have relaxed the precision of their unitarity and CPTP checks, so they will complain less about user matrices. Now, for example, a unitary matrix U is deemed valid only if every element of U*dagger(U) has a Euclidean distance of at most REAL_EPS from its expected identity-matrix element.
  • Unit tests now check that their initial register states are as expected before testing an operator. This ensures that some tests do not accidentally pass when they should be failing (like when run with an incorrectly specified GPU compute capability) due to an unexpected all-zero initial state.
  • Unit tests now use an improved and numerically stable function for generating random unitaries and Kraus maps, so should trigger fewer precision errors and false test failures.
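
The relaxed unitarity check described above can be sketched in plain C. This is a minimal illustration, not QuEST's actual validation code: the function name `is_approx_unitary`, the row-major split real/imaginary layout, and passing the tolerance as a parameter (standing in for REAL_EPS) are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical helper, not part of QuEST's API.
 * Returns 1 if every element of U * dagger(U) lies within eps (Euclidean
 * distance in the complex plane) of the matching identity-matrix element,
 * else 0. U is dim x dim, row-major, with separate real and imaginary arrays. */
int is_approx_unitary(size_t dim, const double *re, const double *im, double eps) {
    for (size_t r = 0; r < dim; r++) {
        for (size_t c = 0; c < dim; c++) {
            /* (U * dagger(U))[r][c] = sum_k U[r][k] * conj(U[c][k]) */
            double sumRe = 0.0, sumIm = 0.0;
            for (size_t k = 0; k < dim; k++) {
                double aRe = re[r * dim + k], aIm = im[r * dim + k];
                double bRe = re[c * dim + k], bIm = im[c * dim + k];
                sumRe += aRe * bRe + aIm * bIm;
                sumIm += aIm * bRe - aRe * bIm;
            }
            /* difference from the expected identity element */
            double dRe = sumRe - (r == c ? 1.0 : 0.0);
            double dIm = sumIm;
            /* compare squared distances to avoid needing a square root */
            if (dRe * dRe + dIm * dIm > eps * eps)
                return 0;
        }
    }
    return 1;
}
```

Because the test is applied per element rather than to an aggregate norm, a matrix entered with rounded entries (such as a Hadamard with 1/sqrt(2) truncated to double precision) passes as long as each product element stays within the tolerance.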

TysonRayJones and others added 25 commits August 17, 2023 18:43
Note that this forces our cuQuantum backend to require its users to have a stream-ordered memory pool compatible GPU (which seems fair enough)
for converting between QuEST's interface and backend types (like Complex, ComplexMatrixN, and bitmasks) and cuQuantum's
Added all operators (like unitaries, sub-diagonal gates) which can be directly mapped to cuQuantum calls.

The cuQuantum calls are:
- custatevecApplyMatrix
- custatevecApplyPauliRotation
- custatevecSwapIndexBits
- custatevecApplyGeneralizedPermutationMatrix

It appears that the remainder of QuEST's operators (decoherence channels, full-state diagonals, and phase functions) will need bespoke kernels
Before each unit test, the initial state of the registers (assumed to be in the result of initDebugState) is now explicitly checked.

This prevents passing tests when initDebugState() itself is failing, and (for example) yielding an all-zero state which sneakily satisfies some unit tests.

This will likely noticeably increase the total unit-test runtime, but will guarantee that tests visibly and instantly fail when (for example) the GPU configuration is wrong and produces all-zero states
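
As a rough sketch of such a pre-test check, the snippet below verifies a plain amplitude array against the debug-state pattern. It assumes the debug state has amplitudes amp[n] = 2n/10 + i(2n+1)/10, which is our understanding of what initDebugState() produces; the function name `matches_debug_state` and the array layout are hypothetical and do not reflect the actual test utilities.

```c
#include <stddef.h>

/* Hypothetical check, not QuEST's test code.
 * Assumes the debug state is amp[n] = 2n/10 + i(2n+1)/10.
 * Returns 1 if every amplitude is within eps of that pattern, else 0. */
int matches_debug_state(const double *re, const double *im, size_t numAmps, double eps) {
    for (size_t n = 0; n < numAmps; n++) {
        double dRe = re[n] - (2.0 * n) / 10.0;
        double dIm = im[n] - (2.0 * n + 1.0) / 10.0;
        /* squared Euclidean distance between actual and expected amplitude */
        if (dRe * dRe + dIm * dIm > eps * eps)
            return 0;
    }
    return 1;
}
```

An all-zero register fails at the very first amplitude, since the expected imaginary part there is 0.1, so a misconfigured GPU that silently returns zeros is caught before the operator under test runs.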
since documentation is now generated by Github Actions and published on Github Pages without repo caching
which were unavailable in the API, and were not used internally nor in tests. Furthermore, some of them (`initStateOfSingleQubit`) did something *very* different to what its comments suggested - and inefficiently!
although we are missing imports to avoid git conflict:

# include <thrust/sequence.h>
# include <thrust/iterator/zip_iterator.h>
# include <thrust/for_each.h>
Added all decoherence channels which can be directly mapped (without unacceptable performance damage) to a cuQuantum call.

The cuQuantum calls are:
- custatevecApplyMatrix
- custatevecApplyGeneralizedPermutationMatrix

and are called with matrices (some, diagonal) describing the channel superoperators.

The remaining decoherence channels require linearly combining device vectors (may use Thrust), bespoke GPU kernels, or a clever decomposition of the channel (e.g. 2 qubit depolarising) into a sequence of cuStateVec calls
Changed several operators represented by diagonal matrices but previously effected as one-qubit general matrices, to instead be effected as diagonals (duh)
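
To illustrate why the diagonal route is cheaper, here is a minimal CPU sketch of applying a one-qubit diagonal operator diag(d0, d1) to a state vector. It is not the cuQuantum-backed code from this commit; the function name `apply_one_qubit_diagonal` and the split real/imaginary arrays are assumptions for the example.

```c
#include <stddef.h>

/* Hypothetical CPU illustration, not the GPU backend's implementation.
 * Applies diag(d0, d1) to qubit `target` of a state vector: each amplitude is
 * scaled by d0 or d1 depending on the target bit of its index. */
void apply_one_qubit_diagonal(double *re, double *im, size_t numAmps, unsigned target,
                              double d0Re, double d0Im, double d1Re, double d1Im) {
    for (size_t n = 0; n < numAmps; n++) {
        int bit = (int)((n >> target) & 1u);
        double fRe = bit ? d1Re : d0Re;
        double fIm = bit ? d1Im : d0Im;
        double aRe = re[n], aIm = im[n];
        /* complex multiplication: amp[n] *= f */
        re[n] = aRe * fRe - aIm * fIm;
        im[n] = aRe * fIm + aIm * fRe;
    }
}
```

Each amplitude is updated independently with a single complex multiplication, whereas a general one-qubit matrix must read and combine pairs of amplitudes whose indices differ in the target bit.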
integrated a new cuQuantum and Thrust GPU backend
Previously, an ad-hoc measure of distance from unitarity (or CPTP) was used.

Now, a unitary matrix U is deemed valid only if every element of U*dagger(U) has a Euclidean distance of at most REAL_EPS from the corresponding identity-matrix element.

A similar scheme for CPTP Kraus channels is used.

This effectively loosens the precision required of unitaries and Kraus maps passed to functions like multiQubitUnitary and mixMultiQubitKrausMap
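
The analogous element-wise test for a Kraus map might look like the sketch below, which checks that the sum over operators of dagger(K)*K is within a tolerance of the identity. This is an assumption about what "a similar scheme" means, not a copy of QuEST's validation; the name `is_approx_cptp`, the contiguous row-major storage of the operators, and the tolerance parameter (standing in for REAL_EPS) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, not part of QuEST's API.
 * Kraus operators are stored back to back: operator j occupies elements
 * [j*dim*dim, (j+1)*dim*dim) of re/im, each in row-major order.
 * Returns 1 if every element of sum_j dagger(K_j) * K_j lies within eps of
 * the matching identity-matrix element, else 0. */
int is_approx_cptp(size_t numOps, size_t dim, const double *re, const double *im, double eps) {
    for (size_t r = 0; r < dim; r++) {
        for (size_t c = 0; c < dim; c++) {
            double sumRe = 0.0, sumIm = 0.0;
            for (size_t j = 0; j < numOps; j++) {
                const double *kRe = re + j * dim * dim;
                const double *kIm = im + j * dim * dim;
                /* (dagger(K) * K)[r][c] = sum_k conj(K[k][r]) * K[k][c] */
                for (size_t k = 0; k < dim; k++) {
                    double aRe = kRe[k * dim + r], aIm = kIm[k * dim + r];
                    double bRe = kRe[k * dim + c], bIm = kIm[k * dim + c];
                    sumRe += aRe * bRe + aIm * bIm;
                    sumIm += aRe * bIm - aIm * bRe;
                }
            }
            double dRe = sumRe - (r == c ? 1.0 : 0.0);
            double dIm = sumIm;
            if (dRe * dRe + dIm * dIm > eps * eps)
                return 0;
        }
    }
    return 1;
}
```

For instance, a one-qubit dephasing-style map with K0 = 0.8*I and K1 = 0.6*Z satisfies the check, while K0 on its own does not, since 0.64 is far from 1.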
@TysonRayJones TysonRayJones merged commit d4f75f7 into master Sep 22, 2023
12 checks passed