The user guide for LRE in Mitiq is currently under construction.
# What is the theory behind LRE?

Layerwise Richardson Extrapolation (LRE), an error mitigation technique introduced in
{cite}`Russo_2024_LRE`, extends the ideas found in ZNE by allowing users to create multiple
noise-scaled variations of the input circuit such that the noiseless expectation value is
extrapolated from the execution of each noisy circuit.

Similar to [ZNE](zne.md), this process works in two steps:

- **Step 1:** Intentionally create multiple noise-scaled but logically equivalent circuits by scaling each layer or chunk of the input circuit through unitary folding.

- **Step 2:** Extrapolate to the noiseless limit using multivariate Richardson extrapolation, as sketched below.
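
A minimal sketch of this workflow is shown below, assuming the `execute_with_lre` entry point in `mitiq.lre`; the example circuit, the depolarizing executor, and the keyword values are illustrative only.

```python
import cirq
from mitiq.lre import execute_with_lre

q0, q1 = cirq.LineQubit.range(2)
circuit = cirq.Circuit(cirq.H(q0), cirq.CNOT(q0, q1), cirq.X(q1))

def executor(circuit: cirq.Circuit) -> float:
    """Return an expectation value under a simple depolarizing noise model."""
    noisy = circuit.with_noise(cirq.depolarize(p=0.01))
    rho = cirq.DensityMatrixSimulator().simulate(noisy).final_density_matrix
    return rho[0, 0].real  # probability of the all-zeros state

# `degree` bounds the total degree d of the multivariate polynomial and
# `fold_multiplier` sets how strongly each layer is noise scaled
# (keyword names assumed from the mitiq.lre API).
mitigated = execute_with_lre(circuit, executor, degree=2, fold_multiplier=2)
```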

The number of terms in the monomial basis depends on the number of noise-scaled layers $l$ and the chosen degree $d$ of the multivariate extrapolating polynomial:

$$
\text{number of terms in the monomial basis with total degree } d = \binom{d + l - 1}{d}
$$
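
As a quick check of this count, the exponent tuples can be enumerated directly; a short sketch with illustrative helper names:

```python
from itertools import product
from math import comb

def monomials_of_degree(l: int, d: int) -> list[tuple[int, ...]]:
    """Exponent tuples (k_1, ..., k_l) with k_1 + ... + k_l == d."""
    return [k for k in product(range(d + 1), repeat=l) if sum(k) == d]

l, d = 3, 2  # e.g. a three-layer circuit with a degree-2 polynomial
assert len(monomials_of_degree(l, d)) == comb(d + l - 1, d)  # 6 terms
```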

These monomial terms define the rows of the square sample matrix as shown below:

$$
\mathbf{A}(\Lambda, d) =
\begin{bmatrix}
M_1(λ_1, d) & M_2(λ_1, d) & \cdots & M_N(λ_1, d) \\
M_1(λ_2, d) & M_2(λ_2, d) & \cdots & M_N(λ_2, d) \\
\vdots & \vdots & \ddots & \vdots \\
M_1(λ_N, d) & M_2(λ_N, d) & \cdots & M_N(λ_N, d)
\end{bmatrix}
$$
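
Below is a minimal NumPy sketch of this construction, assuming the basis collects all monomials of total degree at most $d$ with the constant monomial $M_1$ ordered first; the helper names and scale-factor values are illustrative.

```python
from itertools import product
import numpy as np

def monomial_basis(l: int, d: int) -> list[tuple[int, ...]]:
    """Exponent tuples of total degree <= d, constant monomial M_1 first."""
    exps = [k for k in product(range(d + 1), repeat=l) if sum(k) <= d]
    return sorted(exps, key=sum)

def sample_matrix(scale_vectors: np.ndarray, d: int) -> np.ndarray:
    """A[i, j] = M_j(lambda_i, d), one row per scale-factor vector."""
    basis = monomial_basis(scale_vectors.shape[1], d)
    return np.array(
        [[np.prod(lam ** np.array(k)) for k in basis] for lam in scale_vectors]
    )

# Two layers, degree 1: the basis {1, x_2, x_1} has N = 3 terms, so we
# need three scale-factor vectors for a square matrix.
lambdas = np.array([[1.0, 1.0], [3.0, 1.0], [1.0, 3.0]])
A = sample_matrix(lambdas, d=1)  # 3 x 3
```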

Each monomial term in the sample matrix $\mathbf{A}$ is evaluated using the values in the scale factor vectors. In Step 2, we aim to define $O_{\mathrm{LRE}}$ as a linear combination of the noisy expectation values.

Finding the coefficients in the linear combination becomes a problem solvable through a system of linear equations $\mathbf{A} c = z$, where $c$ is the coefficients vector $(\eta_1, \eta_2, \ldots, \eta_N)^T$, $z$ is the vector of the noisy expectation values, and $\mathbf{A}$ is the sample matrix defined above.
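
Continuing the sketch above with made-up expectation values, the system reduces to one call to a dense linear solver:

```python
# Noisy expectation values for the three scale-factor vectors above;
# the numbers are made up for illustration.
z = np.array([0.92, 0.79, 0.81])

c = np.linalg.solve(A, z)  # solve A c = z
# M_1 is the constant monomial and every other basis monomial vanishes at
# the zero-noise point, so evaluating the fit at zero noise reads off c[0].
zero_noise_estimate = c[0]
```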

## Step 2: Extrapolate to the noiseless limit

Each noise-scaled circuit $C_{λ_i}$ has an expectation value $\langle O(λ_i) \rangle$ associated with it, so we can define a vector of the noisy expectation values $z = (\langle O(λ_1) \rangle, \langle O(λ_2) \rangle, \ldots, \langle O(λ_N)\rangle)^T$. Each expectation value has a linear-combination coefficient $\eta_i$ associated with it, as shown below:

$$
O_{\mathrm{LRE}} = \sum_{i=1}^{N} \eta_i \langle O(λ_i) \rangle.
$$

The system of linear equations is used to find the coefficients $\eta_i$ in vector $c$. As we only need the noiseless expectation value, we do not need to calculate the full vector of linear combination coefficients if we use the [Lagrange interpolation formula](https://files.eric.ed.gov/fulltext/EJ1231189.pdf):

$$
O_{\mathrm{LRE}} = \sum_{i=1}^{N} \langle O (λ_i)\rangle \frac{\det \left(\mathbf{B}_i (\boldsymbol{0}) \right)}{\det \left(\mathbf{A}\right)}.
$$

To get the matrix $\mathbf{B}_i(\mathbf{0})$, replace the $i$-th row of the sample matrix $\mathbf{A}$ by $\mathbf{e}_1=(1, 0, \ldots, 0)^T$.
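
A short continuation of the earlier NumPy sketches, computing each $\eta_i$ as a ratio of determinants and checking the result against the direct solve:

```python
# B_i(0) is the sample matrix A with its i-th row replaced by e_1.
N = A.shape[0]
e1 = np.zeros(N)
e1[0] = 1.0

eta = np.empty(N)
for i in range(N):
    B_i = A.copy()
    B_i[i] = e1  # replace the i-th row by (1, 0, ..., 0)
    eta[i] = np.linalg.det(B_i) / np.linalg.det(A)

o_lre = eta @ z
# By Cramer's rule this matches the zero-noise value from the direct solve.
assert np.isclose(o_lre, np.linalg.solve(A, z)[0])
```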
