v3.7.0 #387

TysonRayJones · 2023-09-21T05:19:46Z

Overview

This release integrates a cuQuantum backend, optimises distributed communication, and improves the unit tests.

New features

QuEST gained a new backend which integrates cuQuantum and Thrust for optimised simulation on modern NVIDIA GPUs. It is enabled by compiling with the cmake argument -DUSE_CUQUANTUM=1, as detailed in the compile doc. Unlike QuEST's other backends, it requires prior installation of cuQuantum, outlined here. This deployment mode should run much faster than QuEST's custom GPU backend, and will soon enable multi-GPU simulation. The entirety of QuEST's API is supported! 🎉

Other changes

QuEST's distributed communication has been optimised when exchanging states via many maximum-size messages, thanks to the work of Jakub Adamski as per this manuscript.
Functions like multiQubitUnitary() and mixMultiQubitKrausMap() have relaxed the precision of their unitarity and CPTP checks, so they will complain less about user matrices. Now, for example, a unitary matrix U is deemed valid only if every element of U*dagger(U) has a Euclidean distance of at most REAL_EPS from its expected identity-matrix element.
Unit tests now check that their initial register states are as expected before testing an operator. This ensures that some tests do not accidentally pass when they should be failing (like when run with an incorrectly specified GPU compute capability) due to an unexpected all-zero initial state.
Unit tests now use an improved and numerically stable function for generating random unitaries and Kraus maps, so should trigger fewer precision errors and false test failures.

Added all operators (like unitaries and sub-diagonal gates) which can be directly mapped to cuQuantum calls. The cuQuantum calls are:
- custatevecApplyMatrix
- custatevecApplyPauliRotation
- custatevecSwapIndexBits
- custatevecApplyGeneralizedPermutationMatrix

It appears that the remainder of QuEST's operators (decoherence channels, full-state diagonals, and phase functions) will need bespoke kernels.

Before each unit test, the initial state of the registers (assumed to be the result of initDebugState) is now explicitly checked. This prevents tests from passing when initDebugState() itself is failing and (for example) yielding an all-zero state which sneakily satisfies some unit tests. This will likely noticeably increase the total unit-test runtime, but will guarantee that tests visibly and instantly fail when (for example) the GPU configuration is wrong and produces all-zero states.
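The initial-state check described above can be sketched in plain C. This is only an illustration, not the test suite's actual code: `fillDebugState` and `isDebugState` are hypothetical helpers, plain arrays stand in for a register, and the pattern assumed is the documented initDebugState one, where amplitude n is 2n/10 + i(2n+1)/10.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: fills plain arrays (standing in for a register's
 * real and imaginary amplitudes) with the debug pattern, where
 * amplitude n = 2n/10 + i(2n+1)/10. */
void fillDebugState(double *re, double *im, size_t numAmps) {
    for (size_t n = 0; n < numAmps; n++) {
        re[n] = (2.0 * n) / 10.0;
        im[n] = (2.0 * n + 1.0) / 10.0;
    }
}

/* Hypothetical helper: returns true only if every amplitude matches the
 * debug pattern to within tol. An all-zero state is rejected, because
 * even amplitude 0 should have imaginary part 0.1. */
bool isDebugState(const double *re, const double *im, size_t numAmps, double tol) {
    for (size_t n = 0; n < numAmps; n++) {
        double dRe = re[n] - (2.0 * n) / 10.0;
        double dIm = im[n] - (2.0 * n + 1.0) / 10.0;
        if (dRe < -tol || dRe > tol || dIm < -tol || dIm > tol)
            return false;
    }
    return true;
}
```

A test would call such a check right after initialisation and fail immediately if it returns false, rather than proceeding to test an operator on a corrupt state.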

Added all decoherence channels which can be directly mapped (without unacceptable performance damage) to a cuQuantum call. The cuQuantum calls are:
- custatevecApplyMatrix
- custatevecApplyGeneralizedPermutationMatrix

These are called with matrices (some of them diagonal) describing the channel superoperators. The remaining decoherence channels require linearly combining device vectors (which may use Thrust), bespoke GPU kernels, or a clever decomposition of the channel (e.g. 2 qubit depolarising) into a sequence of cuStateVec calls.
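As a small worked example of a diagonal channel superoperator, consider one-qubit dephasing, rho -> (1-p) rho + p Z rho Z. Assuming the density matrix is stored as a vectorised state (so the channel acts on one "ket" bit and one "bra" bit), the superoperator is (1-p) I(x)I + p conj(Z)(x)Z, which is diagonal. The sketch below only computes those four diagonal entries; `getDephasingSuperopDiag` is a hypothetical name, and this is not claimed to be how the backend builds its matrices.

```c
#include <stddef.h>

/* Hypothetical helper: writes the four diagonal entries of the one-qubit
 * dephasing superoperator (1-p) I(x)I + p conj(Z)(x)Z.
 * Entries are indexed by (braBit << 1) | ketBit. */
void getDephasingSuperopDiag(double prob, double diag[4]) {
    /* Z is diagonal with real entries +1 and -1, so conj(Z) = Z */
    const double z[2] = {1.0, -1.0};
    for (int bra = 0; bra < 2; bra++)
        for (int ket = 0; ket < 2; ket++)
            diag[(bra << 1) | ket] = (1.0 - prob) + prob * z[bra] * z[ket];
}
```

The entries where the bra and ket bits agree are 1 (populations are untouched), while the others are 1-2p (coherences are damped), so the whole channel reduces to an element-wise scaling.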

Previously, an ad-hoc measure of distance from unitarity (or CPTP) was used. Now, a unitary U is deemed valid only if every element of U*dagger(U) has a Euclidean distance of at most REAL_EPS from the corresponding identity matrix element. A similar scheme is used for CPTP Kraus channels. This effectively loosens the precision required of unitaries and Kraus maps passed to functions like multiQubitUnitary and mixMultiQubitKrausMap.
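The element-wise criterion can be sketched as follows. This is a minimal illustration under stated assumptions, not QuEST's validation code: `isApproxUnitary` is a hypothetical name, the matrix is a flat row-major array of C99 complex numbers rather than a ComplexMatrixN, and `EPS` is a stand-in for the precision-dependent REAL_EPS.

```c
#include <complex.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for QuEST's precision-dependent REAL_EPS (value is an assumption) */
#define EPS 1e-13

/* Hypothetical helper: returns true if every element of U * dagger(U) lies
 * within EPS (Euclidean distance in the complex plane) of the matching
 * identity-matrix element. u is a dim-by-dim matrix stored row-major. */
bool isApproxUnitary(const double complex *u, size_t dim) {
    for (size_t r = 0; r < dim; r++) {
        for (size_t c = 0; c < dim; c++) {
            /* (U U^dagger)[r][c] = sum_k U[r][k] * conj(U[c][k]) */
            double complex elem = 0;
            for (size_t k = 0; k < dim; k++)
                elem += u[r * dim + k] * conj(u[c * dim + k]);

            double complex diff = elem - ((r == c) ? 1.0 : 0.0);

            /* compare squared distances to avoid needing a square root */
            double distSq = creal(diff) * creal(diff) + cimag(diff) * cimag(diff);
            if (distSq > EPS * EPS)
                return false;
        }
    }
    return true;
}
```

Because each element is compared on its own, the tolerance does not tighten or loosen with the matrix size the way an accumulated norm would.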

TysonRayJones and others added 25 commits August 17, 2023 18:43

scaffolded cuQuantum build

a23f518

prevent CI on working branches (#371)

9e2f3ce

proposing cuQuantum memory design

40f936f

setup automatic workspace memory

240d5ab

Note that this forces our cuQuantum backend to require its users to have a stream-ordered memory pool compatible GPU (which seems fair enough)

added type adapters

ec08912

for converting between QuEST's interface and backend types (like Complex, ComplexMatrixN, and bitmasks) and cuQuantum's

added Thrust

7b7c197

removed obsolete doc folder

d34dc21

since documentation is now generated by Github Actions and published on Github Pages without repo caching

removed defunct debugging routines

2c465c2

which were unavailable in the API, and were not used internally nor in tests. Furthermore, some of them (`initStateOfSingleQubit`) did something *very* different to what its comments suggested - and inefficiently!

added state initialisers

564851e

although we are missing imports to avoid git conflict:
#include <thrust/sequence.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/for_each.h>

added bespoke decoherence

d616481

optimised diagonal operators

b047972

Changed several operators represented by diagonal matrices but previously effected as one-qubit general matrices, to instead be effected as diagonals (duh)
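To illustrate why effecting a diagonal operator as a diagonal is cheaper, here is a CPU-side sketch in plain C (not the backend's GPU code). Both function names are hypothetical. The general 2x2 routine must pair up amplitudes and perform four complex multiplications per pair, whereas the diagonal routine scales each amplitude independently.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper: applies diag(d0, d1) to qubit `target` by scaling
 * each amplitude according to the value of its target bit. */
void applyDiagonalGate(double complex *amps, size_t numAmps, unsigned target,
                       double complex d0, double complex d1) {
    assert(((size_t)1 << target) < numAmps);
    for (size_t i = 0; i < numAmps; i++)
        amps[i] *= ((i >> target) & 1) ? d1 : d0;
}

/* Hypothetical helper: applies a general 2x2 matrix m (flat, row-major)
 * to qubit `target`, updating amplitudes in pairs that differ only in
 * the target bit. */
void applyGeneralGate(double complex *amps, size_t numAmps, unsigned target,
                      const double complex *m) {
    size_t stride = (size_t)1 << target;
    assert(stride < numAmps);
    for (size_t i = 0; i < numAmps; i++) {
        if (i & stride)
            continue; /* visit each pair once, via its target-bit-0 member */
        double complex a0 = amps[i];
        double complex a1 = amps[i | stride];
        amps[i]          = m[0] * a0 + m[1] * a1;
        amps[i | stride] = m[2] * a0 + m[3] * a1;
    }
}
```

For a gate such as Pauli Z, both routines produce the same state, but the diagonal version does a quarter of the multiplications and needs no pairing of amplitudes.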

added calculations

14b0a59

added phase functions

7e2362a

fixed automatic workspace memory

f7c6ee6

moved reportStateToScreen to common

c8d9d77

added cuQuantum to doc

f7515b0

Merge pull request #386 from QuEST-Kit/cuquantum

c63a0f8

integrated a new cuQuantum and Thrust GPU backend

Made MPI calls in exchangeStateVectors non-blocking

aa5ccd1

added Jakub Adamski to authorlist

8415f64

for PR #380

improved test getRandomUnitary

d3022a6

adding random unitary failsafe

3d8b6b7

TysonRayJones merged commit d4f75f7 into master Sep 22, 2023
12 checks passed
