From 88a5a2b0e3a754c59096275677388b87f5fb19d4 Mon Sep 17 00:00:00 2001 From: Phil Solimine <15682144+doctor-phil@users.noreply.github.com> Date: Tue, 11 Oct 2022 09:26:10 -0800 Subject: [PATCH 1/3] Update applied_linalg.md --- lectures/scientific/applied_linalg.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lectures/scientific/applied_linalg.md b/lectures/scientific/applied_linalg.md index 44e1e2fa..e039b4c3 100644 --- a/lectures/scientific/applied_linalg.md +++ b/lectures/scientific/applied_linalg.md @@ -704,8 +704,8 @@ plot_simulation(x0, A, 5000) The convergence of this system is a property determined by the matrix $A$. -The long-run distribution of employed and unemployed workers is equal to the [left-eigenvector](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) -of $A'$, corresponding to the eigenvalue equal to 1. +The long-run distribution of employed and unemployed workers is equal to the largest [eigenvector](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) +of $A'$, corresponding to the eigenvalue equal to 1. An eigenvalue of $A'$ is also known as a "left-eigenvector" of A. Let's have numpy compute the eigenvalues and eigenvectors and compare the results to our simulated results above: From ac5c12d958f2c6c06c0cfa051a3f4c30f06f0921 Mon Sep 17 00:00:00 2001 From: Phil <> Date: Tue, 11 Oct 2022 10:33:20 -0700 Subject: [PATCH 2/3] Update LA lecture --- lectures/scientific/applied_linalg.md | 1580 ++++++++++++------------- lectures/scientific/index.md | 90 +- lectures/scientific/numpy_arrays.md | 1102 ++++++++--------- lectures/scientific/optimization.md | 928 +++++++-------- lectures/scientific/plotting.md | 412 +++---- lectures/scientific/randomness.md | 1282 ++++++++++---------- 6 files changed, 2697 insertions(+), 2697 deletions(-) diff --git a/lectures/scientific/applied_linalg.md b/lectures/scientific/applied_linalg.md index e039b4c3..a87e2d75 100644 --- a/lectures/scientific/applied_linalg.md +++ b/lectures/scientific/applied_linalg.md @@ -1,790 +1,790 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# {index}`Applied Linear Algebra ` - -**Prerequisites** - -- {doc}`Introduction to Numpy ` - -**Outcomes** - -- Refresh some important linear algebra concepts -- Apply concepts to understanding unemployment and pricing portfolios -- Use `numpy` to do linear algebra operations - - -```{literalinclude} ../_static/colab_light.raw -``` - -```{code-cell} python -# import numpy to prepare for code below -import numpy as np -import matplotlib.pyplot as plt - -%matplotlib inline -``` - -## Vectors and Matrices - -### Vectors - -A (N-element) vector is $N$ numbers stored together. - -We typically write a vector as $x = \begin{bmatrix} x_1 \\ x_2 \\ \dots \\ x_N \end{bmatrix}$. - -In numpy terms, a vector is a 1-dimensional array. - -We often think of 2-element vectors as directional lines in the XY axes. - -This image, from the [QuantEcon Python lecture](https://python.quantecon.org/linear_algebra.html) -is an example of what this might look like for the vectors `(-4, 3.5)`, `(-3, 3)`, and `(2, 4)`. - -```{figure} ../_static/vector.png -:alt: vector.png -``` - -In a previous lecture, we saw some types of operations that can be done on -vectors, such as - -```{code-cell} python -x = np.array([1, 2, 3]) -y = np.array([4, 5, 6]) -``` - -**Element-wise operations**: Let $z = x ? 
y$ for some operation $?$, one of -the standard *binary* operations ($+, -, \times, \div$). Then we can write -$z = \begin{bmatrix} x_1 ? y_1 & x_2 ? y_2 \end{bmatrix}$. Element-wise operations require -that $x$ and $y$ have the same size. - -```{code-cell} python -print("Element-wise Addition", x + y) -print("Element-wise Subtraction", x - y) -print("Element-wise Multiplication", x * y) -print("Element-wise Division", x / y) -``` - -**Scalar operations**: Let $w = a ? x$ for some operation $?$, one of the -standard *binary* operations ($+, -, \times, \div$). Then we can write -$w = \begin{bmatrix} a ? x_1 & a ? x_2 \end{bmatrix}$. - -```{code-cell} python -print("Scalar Addition", 3 + x) -print("Scalar Subtraction", 3 - x) -print("Scalar Multiplication", 3 * x) -print("Scalar Division", 3 / x) -``` - -Another operation very frequently used in data science is the **dot product**. - -The dot between $x$ and $y$ is written $x \cdot y$ and is -equal to $\sum_{i=1}^N x_i y_i$. - -```{code-cell} python -print("Dot product", np.dot(x, y)) -``` - -We can also use `@` to denote dot products (and matrix multiplication which we'll see soon!). - -```{code-cell} python -print("Dot product with @", x @ y) -``` - -````{admonition} Exercise -:name: dir3-3-1 - -See exercise 1 in the {ref}`exercise list `. -```` - -```{code-cell} python ---- -tags: [hide-output] ---- -nA = 100 -nB = 50 -nassets = np.array([nA, nB]) - -i = 0.05 -durationA = 6 -durationB = 4 - -# Do your computations here - -# Compute price - -# uncomment below to see a message! -# if condition: -# print("Alice can retire") -# else: -# print("Alice cannot retire yet") -``` - -### Matrices - -An $N \times M$ matrix can be thought of as a collection of M -N-element vectors stacked side-by-side as columns. - -We write a matrix as - -$$ -\begin{bmatrix} x_{11} & x_{12} & \dots & x_{1M} \\ - x_{21} & \dots & \dots & x_{2M} \\ - \vdots & \vdots & \vdots & \vdots \\ - x_{N1} & x_{N2} & \dots & x_{NM} -\end{bmatrix} -$$ - -In numpy terms, a matrix is a 2-dimensional array. - -We can create a matrix by passing a list of lists to the `np.array` function. - -```{code-cell} python -x = np.array([[1, 2, 3], [4, 5, 6]]) -y = np.ones((2, 3)) -z = np.array([[1, 2], [3, 4], [5, 6]]) -``` - -We can perform element-wise and scalar operations as we did with vectors. In fact, we can do -these two operations on arrays of any dimension. - -```{code-cell} python -print("Element-wise Addition\n", x + y) -print("Element-wise Subtraction\n", x - y) -print("Element-wise Multiplication\n", x * y) -print("Element-wise Division\n", x / y) - -print("Scalar Addition\n", 3 + x) -print("Scalar Subtraction\n", 3 - x) -print("Scalar Multiplication\n", 3 * x) -print("Scalar Division\n", 3 / x) -``` - -Similar to how we combine vectors with a dot product, matrices can do what we'll call *matrix -multiplication*. - -Matrix multiplication is effectively a generalization of dot products. - -**Matrix multiplication**: Let $v = x \cdot y$ then we can write -$v_{ij} = \sum_{k=1}^N x_{ik} y_{kj}$ where $x_{ij}$ is notation that denotes the -element found in the ith row and jth column of the matrix $x$. 
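To make the definition concrete, here is a minimal sketch that builds the product element by element from the sum above and then checks it against numpy's built-in matrix multiplication. The two small matrices are made up purely for illustration, and are named `a` and `b` so they don't clobber the `x` and `y` defined earlier.

```{code-cell} python
# check v_ij = sum_k a_ik * b_kj on two small, made-up matrices
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])      # shape (3, 2)
b = np.array([[10.0, 20.0, 30.0], [40.0, 50.0, 60.0]])  # shape (2, 3)

N, M = a.shape
K = b.shape[1]

v = np.zeros((N, K))
for i in range(N):
    for j in range(K):
        # element (i, j) is the dot product of row i of a with column j of b
        v[i, j] = sum(a[i, k] * b[k, j] for k in range(M))

print(v)
print(np.allclose(v, np.matmul(a, b)))  # the loop reproduces numpy's result
```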
- -The image below from [Wikipedia](https://commons.wikimedia.org/wiki/File:Matrix_multiplication_diagram.svg), -by Bilou, shows how matrix multiplication simplifies to a series of dot products: - -```{figure} ../_static/mat_mult_wiki_bilou.png -:alt: matmult.png -``` - -After looking at the math and image above, you might have realized that matrix -multiplication requires very specific matrix shapes! - -For two matrices $x, y$ to be multiplied, $x$ -must have the same number of columns as $y$ has rows. - -Formally, we require that for some integer numbers, $M, N,$ and $K$ -that if $x$ is $N \times M$ then $y$ must be $M \times -K$. - -If we think of a vector as a $1 \times M$ or $M \times 1$ matrix, we can even do -matrix multiplication between a matrix and a vector! - -Let's see some examples of this. - -```{code-cell} python -x1 = np.reshape(np.arange(6), (3, 2)) -x2 = np.array([[1, 2], [3, 4], [5, 6], [7, 8]]) -x3 = np.array([[2, 5, 2], [1, 2, 1]]) -x4 = np.ones((2, 3)) - -y1 = np.array([1, 2, 3]) -y2 = np.array([0.5, 0.5]) -``` - -Numpy allows us to do matrix multiplication in three ways. - -```{code-cell} python -print("Using the matmul function for two matrices") -print(np.matmul(x1, x4)) -print("Using the dot function for two matrices") -print(np.dot(x1, x4)) -print("Using @ for two matrices") -print(x1 @ x4) -``` - -```{code-cell} python -print("Using the matmul function for vec and mat") -print(np.matmul(y1, x1)) -print("Using the dot function for vec and mat") -print(np.dot(y1, x1)) -print("Using @ for vec and mat") -print(y1 @ x1) -``` - -Despite our options, we stick to using `@` because -it is simplest to read and write. - - -````{admonition} Exercise -:name: dir3-3-2 - -See exercise 2 in the {ref}`exercise list `. -```` - - -### Other Linear Algebra Concepts - -#### Transpose - -A matrix transpose is an operation that flips all elements of a matrix along the diagonal. - -More formally, the $(i, j)$ element of $x$ becomes the $(j, i)$ element of -$x^T$. - -In particular, let $x$ be given by - -$$ -x = \begin{bmatrix} 1 & 2 & 3 \\ - 4 & 5 & 6 \\ - 7 & 8 & 9 \\ - \end{bmatrix} -$$ - -then $x$ transpose, written as $x'$, is given by - -$$ -x = \begin{bmatrix} 1 & 4 & 7 \\ - 2 & 5 & 8 \\ - 3 & 6 & 9 \\ - \end{bmatrix} -$$ - -In Python, we do this by - -```{code-cell} python -x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) - -print("x transpose is") -print(x.transpose()) -``` - -#### Identity Matrix - -In linear algebra, one particular matrix acts very similarly to how 1 behaves for scalar numbers. - -This matrix is known as the *identity matrix* and is given by - -$$ -I = \begin{bmatrix} 1 & 0 & 0 & \dots & 0 \\ - 0 & 1 & 0 & \dots & 0 \\ - \vdots & \vdots & \ddots & \vdots & \vdots \\ - 0 & 0 & 0 & \dots & 1 - \end{bmatrix} -$$ - -As seen above, it has 1s on the diagonal and 0s everywhere else. - -When we multiply any matrix or vector by the identity matrix, we get the original matrix or vector -back! - -Let's see some examples. - -```{code-cell} python -I = np.eye(3) -x = np.reshape(np.arange(9), (3, 3)) -y = np.array([1, 2, 3]) - -print("I @ x", "\n", I @ x) -print("x @ I", "\n", x @ I) -print("I @ y", "\n", I @ y) -print("y @ I", "\n", y @ I) -``` - -#### Inverse - -If you recall, you learned in your primary education about solving equations for certain variables. - -For example, you might have been given the equation - -$$ -3x + 7 = 16 -$$ - -and then asked to solve for $x$. - -You probably did this by subtracting 7 and then dividing by 3. 
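In other words, you moved the 7 to the other side and then multiplied both sides by the inverse of 3:

$$
3x = 16 - 7 = 9 \quad \Rightarrow \quad x = 3^{-1} \cdot 9 = 3
$$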
- -Now let's write an equation that contains matrices and vectors. - -$$ -\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix} -$$ - -How would we solve for $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$? - -Unfortunately, there is no "matrix divide" operation that does the opposite of matrix multiplication. - -Instead, we first have to do what's known as finding the inverse. We must multiply both sides by this inverse to solve. - -Consider some matrix $A$. - -The inverse of $A$, given by $A^{-1}$, is a matrix such that $A A^{-1} = I$ -where $I$ is our identity matrix. - -Notice in our equation above, if we can find the inverse of -$\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}$ then we can multiply both sides by the inverse -to get - -$$ -\begin{align*} -\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 3 \\ 4 \end{bmatrix} \\ -I \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 4 \end{bmatrix} \\ - \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 4 \end{bmatrix} -\end{align*} -$$ - -Computing the inverse requires that a matrix be square and satisfy some other conditions -(non-singularity) that are beyond the scope of this lecture. - -We also skip the exact details of how this inverse is computed, but, if you are interested, -you can visit the -[QuantEcon Linear Algebra lecture](https://python.quantecon.org/linear_algebra.html) -for more details. - -We demonstrate how to compute the inverse with numpy below. - -```{code-cell} python -# This is a square (N x N) non-singular matrix -A = np.array([[1, 2, 0], [3, 1, 0], [0, 1, 2]]) - -print("This is A inverse") - -print(np.linalg.inv(A)) - -print("Check that A @ A inverse is I") -print(np.linalg.inv(A) @ A) -``` - -## Portfolios - -In {doc}`control flow <../python_fundamentals/control_flow>`, we learned to value a stream of payoffs from a single -asset. - -In this section, we generalize this to value a portfolio of multiple assets, or an asset -that has easily separable components. - -Vectors and inner products give us a convenient way to organize and calculate these payoffs. - -### Static Payoffs - -As an example, consider a portfolio with 4 units of asset A, 2.5 units of asset B, and 8 units of -asset C. - -At a particular point in time, the assets pay $3$/unit of asset A, $5$/unit of B, and -$1.10$/unit of C. - -First, calculate the value of this portfolio directly with a sum. - -```{code-cell} python -4.0 * 3.0 + 2.5 * 5.0 + 8 * 1.1 -``` - -We can make this more convenient and general by using arrays for accounting, and then sum then in a -loop. - -```{code-cell} python -import numpy as np -x = np.array([4.0, 2.5, 8.0]) # portfolio units -y = np.array([3.0, 5.0, 1.1]) # payoffs -n = len(x) -p = 0.0 -for i in range(n): # i.e. 0, 1, 2 - p = p + x[i] * y[i] - -p -``` - -The above would have worked with `x` and `y` as `list` rather than `np.array`. - -Note that the general pattern above is the sum. - -$$ -p = \sum_{i=0}^{n-1} x_i y_i = x \cdot y -$$ - -This is an inner product as implemented by the `np.dot` function - -```{code-cell} python -np.dot(x, y) -``` - -This approach allows us to simultaneously price different portfolios by stacking them in a matrix and using the dot product. 
- -```{code-cell} python -y = np.array([3.0, 5.0, 1.1]) # payoffs -x1 = np.array([4.0, 2.5, 8.0]) # portfolio 1 -x2 = np.array([2.0, 1.5, 0.0]) # portfolio 2 -X = np.array((x1, x2)) - -# calculate with inner products -p1 = np.dot(X[0,:], y) -p2 = np.dot(X[1,:], y) -print("Calculating separately") -print([p1, p2]) - -# or with a matrix multiplication -print("Calculating with matrices") -P = X @ y -print(P) -``` - -### NPV of a Portfolio - -If a set of assets has payoffs over time, we can calculate the NPV of that portfolio in a similar way to the calculation in -{ref}`npv `. - -First, consider an example with an asset with claims to multiple streams of payoffs which are easily -separated. - -You are considering purchasing an oilfield with 2 oil wells, named `A` and `B` where - -- Both oilfields have a finite lifetime of 20 years. -- In oilfield `A`, you can extract 5 units in the first year, and production in each subsequent year - decreases by $20\%$ of the previous year so that - $x^A_0 = 5, x^A_1 = 0.8 \times 5, x^A_2 = 0.8^2 \times 5, \ldots$ -- In oilfield `B`, you can extract 2 units in the first year, but production only drops by - $10\%$ each year (i.e. $x^B_0 = 2, x^B_1 = 0.9 \times 2, x^B_2 = 0.9^2 \times 2, \ldots$ -- Future cash flows are discounted at a rate of $r = 0.05$ each year. -- The price for oil in both wells are normalized as $p_A = p_B = 1$. - -These traits can be separated so that the price you would be willing to pay is the sum of the two, where -we define $\gamma_A = 0.8, \gamma_B = 0.9$. - -$$ -\begin{aligned} -V_A &= \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t p_A y^A_t = \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t (p_A \, x_{A0}\, \gamma_A^t)\\ -V_B &= \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t p_B y^B_t = \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t (p_B \, x_{B0}\, \gamma_B^t)\\ -V &= V_A + V_B -\end{aligned} -$$ - -Let's compute the value of each of these assets using the dot product. - -The first question to ask yourself is: "For which two vectors should I compute the dot product?" - -It turns out that this depends on which two vectors you'd like to create. - -One reasonable choice is presented in the code below. - -```{code-cell} python -# Depreciation of production rates -gamma_A = 0.80 -gamma_B = 0.90 - -# Interest rate discounting -r = 0.05 -discount = np.array([(1 / (1+r))**t for t in range(20)]) - -# Let's first create arrays that have the production of each oilfield -oil_A = 5 * np.array([gamma_A**t for t in range(20)]) -oil_B = 2 * np.array([gamma_B**t for t in range(20)]) -oilfields = np.array([oil_A, oil_B]) - -# Use matrix multiplication to get discounted sum of oilfield values and then sum -# the two values -Vs = oilfields @ discount - -print(f"The npv of oilfields is {Vs.sum()}") -``` - -Now consider the approximation where instead of the oilfields having a finite lifetime of 20 years, -we let them produce forever, i.e. $T = \infty$. - -With a little algebra, - -$$ -V_A = p_A \sum_{t=0}^{\infty}\left(\frac{1}{1 + r}\right)^t (x_{A0} \gamma_A^t) = x_{A0}\sum_{t=0}^{\infty}\left(\frac{\gamma_A}{1 + r}\right)^t -$$ - -And, using the infinite sum formula from {doc}`Control Flow <../python_fundamentals/control_flow>` (i.e. $\sum_{t=0}^{\infty}\beta^t = (1 - \beta)^{-1}$) - -$$ -= \frac{p_A x_{A0}}{1 - \left(\gamma_A\frac{1}{1 + r} \right)} -$$ - -The $V_B$ is defined symmetrically. - -How different is this infinite horizon approximation from the $T = 20$ version, and why? 
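As a quick check, plugging the numbers from above into this formula (and its analogue for $V_B$) gives

$$
V_A = \frac{5}{1 - \frac{0.8}{1.05}} = 21, \qquad
V_B = \frac{2}{1 - \frac{0.9}{1.05}} = 14, \qquad
V = V_A + V_B = 35
$$

so the infinite horizon value is only slightly above the 20-year value computed earlier. The gap is the discounted production after year 20, and it is relatively larger for oilfield `B` because its output decays more slowly.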
- -Now, let's compute the $T = \infty$ version of the net present value and make a graph to help -us see how many periods are needed to approach the infinite horizon value. - -```{code-cell} python -# Depreciation of production rates -gamma_A = 0.80 -gamma_B = 0.90 - -# Interest rate discounting -r = 0.05 - - -def infhor_NPV_oilfield(starting_output, gamma, r): - beta = gamma / (1 + r) - return starting_output / (1 - beta) - - -def compute_NPV_oilfield(starting_output, gamma, r, T): - outputs = starting_output * np.array([gamma**t for t in range(T)]) - discount = np.array([(1 / (1+r))**t for t in range(T)]) - - npv = np.dot(outputs, discount) - - return npv - -Ts = np.arange(2, 75) - -NPVs_A = np.array([compute_NPV_oilfield(5, gamma_A, r, t) for t in Ts]) -NPVs_B = np.array([compute_NPV_oilfield(2, gamma_B, r, t) for t in Ts]) - -NPVs_T = NPVs_A + NPVs_B -NPV_oo = infhor_NPV_oilfield(5, gamma_A, r) + infhor_NPV_oilfield(2, gamma_B, r) - -fig, ax = plt.subplots() - -ax.set_title("NPV with Varying T") -ax.set_ylabel("NPV") - -ax.plot(Ts, NPVs_A + NPVs_B) -ax.hlines(NPV_oo, Ts[0], Ts[-1], color="k", linestyle="--") # Plot infinite horizon value - -ax.spines["right"].set_visible(False) -ax.spines["top"].set_visible(False) -``` - -It is also worth noting that the computation of the infinite horizon net present value can be -simplified even further by using matrix multiplication. That is, the formula given above is -equivalent to - -$$ -V = \begin{bmatrix}p_A & p_B \end{bmatrix} \cdot \sum_{t=0}^{\infty} \left(\left(\frac{1}{1 + r}\right)^t \begin{bmatrix} \gamma_A & 0 \\ 0 & \gamma_B \end{bmatrix}^t \cdot x_0\right) -$$ - -and where $x_0 = \begin{bmatrix} x_{A0} \\ x_{B0} \end{bmatrix}$. - -We recognize that this equation is of the form - -$$ -V = G \sum_{t=0}^{\infty} \left(\frac{1}{1 + r}\right)^t A^t x_0 -$$ - -Without proof, and given important assumptions on $\frac{1}{1 + r}$ and $A$, this -equation reduces to - -```{math} -:label: eq_deterministic_asset_pricing - -V = G \left(I - \frac{1}{1+r} A\right)^{-1} x_0 -``` - -Using the matrix inverse, where `I` is the identity matrix. - -```{code-cell} python -p_A = 1.0 -p_B = 1.0 -G = np.array([p_A, p_B]) - -r = 0.05 -beta = 1 / (1 + r) - -gamma_A = 0.80 -gamma_B = 0.90 -A = np.array([[gamma_A, 0], [0, gamma_B]]) - -x_0 = np.array([5, 2]) - -# Compute with matrix formula -NPV_mf = G @ np.linalg.inv(np.eye(2) - beta*A) @ x_0 - -print(NPV_mf) -``` - -Note: While our matrix above was very simple, this approach works for much more -complicated `A` matrices as long as we can write $x_t$ using $A$ and $x_0$ as -$x_t = A^t x_0$ (For an advanced description of this topic, adding randomness, read about -linear state-space models with Python ). - -### Unemployment Dynamics - -Consider an economy where in any given year, $\alpha = 5\%$ of workers lose their jobs and -$\phi = 10\%$ of unemployed workers find jobs. - -Define the vector $x_0 = \begin{bmatrix} 900,000 & 100,000 \end{bmatrix}$ as the number of -employed and unemployed workers (respectively) at time $0$ in the economy. - -Our goal is to determine the dynamics of unemployment in this economy. - -First, let's define the matrix. - -$$ -A = \begin{bmatrix} 1 - \alpha & \alpha \\ \phi & 1 - \phi \end{bmatrix} -$$ - -Note that with this definition, we can describe the evolution of employment and unemployment -from $x_0$ to $x_1$ using linear algebra. 
- -$$ -x_1 = \begin{bmatrix} (1 - \alpha) 900,000 + \phi 100,000 \\ \alpha 900,000 + (1-\phi) 100,000\end{bmatrix} = A' x_0 -$$ - -However, since the transitions do not change over time, we can use this to describe the evolution -from any arbitrary time $t$, so that - -$$ -x_{t+1} = A' x_t -$$ - -Let's code up a python function that will let us track the evolution of unemployment over time. - -```{code-cell} python -phi = 0.1 -alpha = 0.05 - -x0 = np.array([900_000, 100_000]) - -A = np.array([[1-alpha, alpha], [phi, 1-phi]]) - -def simulate(x0, A, T=10): - """ - Simulate the dynamics of unemployment for T periods starting from x0 - and using values of A for probabilities of moving between employment - and unemployment - """ - nX = x0.shape[0] - out = np.zeros((T, nX)) - out[0, :] = x0 - - for t in range(1, T): - out[t, :] = A.T @ out[t-1, :] - - return out -``` - -Let's use this function to plot unemployment and employment levels for 10 periods. - -```{code-cell} python -def plot_simulation(x0, A, T=100): - X = simulate(x0, A, T) - fig, ax = plt.subplots() - ax.plot(X[:, 0]) - ax.plot(X[:, 1]) - ax.set_xlabel("t") - ax.legend(["Employed", "Unemployed"]) - return ax - -plot_simulation(x0, A, 50) -``` - -Notice that the levels of unemployed an employed workers seem to be heading to constant numbers. - -We refer to this phenomenon as *convergence* because the values appear to converge to a constant -number. - -Let's check that the values are permanently converging. - -```{code-cell} python -plot_simulation(x0, A, 5000) -``` - -The convergence of this system is a property determined by the matrix $A$. - -The long-run distribution of employed and unemployed workers is equal to the largest [eigenvector](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) -of $A'$, corresponding to the eigenvalue equal to 1. An eigenvalue of $A'$ is also known as a "left-eigenvector" of A. - -Let's have numpy compute the eigenvalues and eigenvectors and compare the results to our simulated results above: - -```{code-cell} python -eigvals, eigvecs = np.linalg.eig(A.T) -for i in range(len(eigvals)): - if eigvals[i] == 1: - which_eig = i - break - -print(f"We are looking for eigenvalue {which_eig}") -``` - -Now let's look at the corresponding eigenvector: - -```{code-cell} python -dist = eigvecs[:, which_eig] - -# need to divide by sum so it adds to 1 -dist /= dist.sum() - -print(f"The distribution of workers is given by {dist}") -``` - - -````{admonition} Exercise -:name: dir3-3-3 - -See exercise 3 in the {ref}`exercise list `. -```` - -(ex3-3)= -## Exercises - -### Exercise 1 - -Alice is a stock broker who owns two types of assets: A and B. She owns 100 -units of asset A and 50 units of asset B. The current interest rate is 5%. -Each of the A assets have a remaining duration of 6 years and pay -\$1500 each year, while each of the B assets have a remaining duration -of 4 years and pay \$500 each year. Alice would like to retire if she -can sell her assets for more than \$500,000. Use vector addition, scalar -multiplication, and dot products to determine whether she can retire. - -({ref}`back to text `) - -### Exercise 2 - -Which of the following operations will work and which will -create errors because of size issues? 
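One way to build intuition before running anything is to look at each array's shape. The sketch below assumes the arrays `x1` through `y2` defined in the matrix multiplication section are still in scope.

```{code-cell} python
# print each array's shape so you can check conformability by hand
for name, arr in [("x1", x1), ("x2", x2), ("x3", x3), ("x4", x4), ("y1", y1), ("y2", y2)]:
    print(name, arr.shape)
```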
- -Test out your intuitions in the code cell below - -```{code-block} python -x1 @ x2 -x2 @ x1 -x2 @ x3 -x3 @ x2 -x1 @ x3 -x4 @ y1 -x4 @ y2 -y1 @ x4 -y2 @ x4 -``` - -```{code-cell} python -# testing area -``` - -({ref}`back to text `) - -### Exercise 3 - -Compare the distribution above to the final values of a long simulation. - -If you multiply the distribution by 1,000,000 (the number of workers), do you get (roughly) the same number as the simulation? - -```{code-cell} python -# your code here -``` - -({ref}`back to text `) +--- +jupytext: + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# {index}`Applied Linear Algebra ` + +**Prerequisites** + +- {doc}`Introduction to Numpy ` + +**Outcomes** + +- Refresh some important linear algebra concepts +- Apply concepts to understanding unemployment and pricing portfolios +- Use `numpy` to do linear algebra operations + + +```{literalinclude} ../_static/colab_light.raw +``` + +```{code-cell} python +# import numpy to prepare for code below +import numpy as np +import matplotlib.pyplot as plt + +%matplotlib inline +``` + +## Vectors and Matrices + +### Vectors + +A (N-element) vector is $N$ numbers stored together. + +We typically write a vector as $x = \begin{bmatrix} x_1 \\ x_2 \\ \dots \\ x_N \end{bmatrix}$. + +In numpy terms, a vector is a 1-dimensional array. + +We often think of 2-element vectors as directional lines in the XY axes. + +This image, from the [QuantEcon Python lecture](https://python.quantecon.org/linear_algebra.html) +is an example of what this might look like for the vectors `(-4, 3.5)`, `(-3, 3)`, and `(2, 4)`. + +```{figure} ../_static/vector.png +:alt: vector.png +``` + +In a previous lecture, we saw some types of operations that can be done on +vectors, such as + +```{code-cell} python +x = np.array([1, 2, 3]) +y = np.array([4, 5, 6]) +``` + +**Element-wise operations**: Let $z = x ? y$ for some operation $?$, one of +the standard *binary* operations ($+, -, \times, \div$). Then we can write +$z = \begin{bmatrix} x_1 ? y_1 & x_2 ? y_2 \end{bmatrix}$. Element-wise operations require +that $x$ and $y$ have the same size. + +```{code-cell} python +print("Element-wise Addition", x + y) +print("Element-wise Subtraction", x - y) +print("Element-wise Multiplication", x * y) +print("Element-wise Division", x / y) +``` + +**Scalar operations**: Let $w = a ? x$ for some operation $?$, one of the +standard *binary* operations ($+, -, \times, \div$). Then we can write +$w = \begin{bmatrix} a ? x_1 & a ? x_2 \end{bmatrix}$. + +```{code-cell} python +print("Scalar Addition", 3 + x) +print("Scalar Subtraction", 3 - x) +print("Scalar Multiplication", 3 * x) +print("Scalar Division", 3 / x) +``` + +Another operation very frequently used in data science is the **dot product**. + +The dot between $x$ and $y$ is written $x \cdot y$ and is +equal to $\sum_{i=1}^N x_i y_i$. + +```{code-cell} python +print("Dot product", np.dot(x, y)) +``` + +We can also use `@` to denote dot products (and matrix multiplication which we'll see soon!). + +```{code-cell} python +print("Dot product with @", x @ y) +``` + +````{admonition} Exercise +:name: dir3-3-1 + +See exercise 1 in the {ref}`exercise list `. +```` + +```{code-cell} python +--- +tags: [hide-output] +--- +nA = 100 +nB = 50 +nassets = np.array([nA, nB]) + +i = 0.05 +durationA = 6 +durationB = 4 + +# Do your computations here + +# Compute price + +# uncomment below to see a message! 
+# if condition: +# print("Alice can retire") +# else: +# print("Alice cannot retire yet") +``` + +### Matrices + +An $N \times M$ matrix can be thought of as a collection of M +N-element vectors stacked side-by-side as columns. + +We write a matrix as + +$$ +\begin{bmatrix} x_{11} & x_{12} & \dots & x_{1M} \\ + x_{21} & \dots & \dots & x_{2M} \\ + \vdots & \vdots & \vdots & \vdots \\ + x_{N1} & x_{N2} & \dots & x_{NM} +\end{bmatrix} +$$ + +In numpy terms, a matrix is a 2-dimensional array. + +We can create a matrix by passing a list of lists to the `np.array` function. + +```{code-cell} python +x = np.array([[1, 2, 3], [4, 5, 6]]) +y = np.ones((2, 3)) +z = np.array([[1, 2], [3, 4], [5, 6]]) +``` + +We can perform element-wise and scalar operations as we did with vectors. In fact, we can do +these two operations on arrays of any dimension. + +```{code-cell} python +print("Element-wise Addition\n", x + y) +print("Element-wise Subtraction\n", x - y) +print("Element-wise Multiplication\n", x * y) +print("Element-wise Division\n", x / y) + +print("Scalar Addition\n", 3 + x) +print("Scalar Subtraction\n", 3 - x) +print("Scalar Multiplication\n", 3 * x) +print("Scalar Division\n", 3 / x) +``` + +Similar to how we combine vectors with a dot product, matrices can do what we'll call *matrix +multiplication*. + +Matrix multiplication is effectively a generalization of dot products. + +**Matrix multiplication**: Let $v = x \cdot y$ then we can write +$v_{ij} = \sum_{k=1}^N x_{ik} y_{kj}$ where $x_{ij}$ is notation that denotes the +element found in the ith row and jth column of the matrix $x$. + +The image below from [Wikipedia](https://commons.wikimedia.org/wiki/File:Matrix_multiplication_diagram.svg), +by Bilou, shows how matrix multiplication simplifies to a series of dot products: + +```{figure} ../_static/mat_mult_wiki_bilou.png +:alt: matmult.png +``` + +After looking at the math and image above, you might have realized that matrix +multiplication requires very specific matrix shapes! + +For two matrices $x, y$ to be multiplied, $x$ +must have the same number of columns as $y$ has rows. + +Formally, we require that for some integer numbers, $M, N,$ and $K$ +that if $x$ is $N \times M$ then $y$ must be $M \times +K$. + +If we think of a vector as a $1 \times M$ or $M \times 1$ matrix, we can even do +matrix multiplication between a matrix and a vector! + +Let's see some examples of this. + +```{code-cell} python +x1 = np.reshape(np.arange(6), (3, 2)) +x2 = np.array([[1, 2], [3, 4], [5, 6], [7, 8]]) +x3 = np.array([[2, 5, 2], [1, 2, 1]]) +x4 = np.ones((2, 3)) + +y1 = np.array([1, 2, 3]) +y2 = np.array([0.5, 0.5]) +``` + +Numpy allows us to do matrix multiplication in three ways. + +```{code-cell} python +print("Using the matmul function for two matrices") +print(np.matmul(x1, x4)) +print("Using the dot function for two matrices") +print(np.dot(x1, x4)) +print("Using @ for two matrices") +print(x1 @ x4) +``` + +```{code-cell} python +print("Using the matmul function for vec and mat") +print(np.matmul(y1, x1)) +print("Using the dot function for vec and mat") +print(np.dot(y1, x1)) +print("Using @ for vec and mat") +print(y1 @ x1) +``` + +Despite our options, we stick to using `@` because +it is simplest to read and write. + + +````{admonition} Exercise +:name: dir3-3-2 + +See exercise 2 in the {ref}`exercise list `. +```` + + +### Other Linear Algebra Concepts + +#### Transpose + +A matrix transpose is an operation that flips all elements of a matrix along the diagonal. 
+ +More formally, the $(i, j)$ element of $x$ becomes the $(j, i)$ element of +$x^T$. + +In particular, let $x$ be given by + +$$ +x = \begin{bmatrix} 1 & 2 & 3 \\ + 4 & 5 & 6 \\ + 7 & 8 & 9 \\ + \end{bmatrix} +$$ + +then $x$ transpose, written as $x'$, is given by + +$$ +x = \begin{bmatrix} 1 & 4 & 7 \\ + 2 & 5 & 8 \\ + 3 & 6 & 9 \\ + \end{bmatrix} +$$ + +In Python, we do this by + +```{code-cell} python +x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) + +print("x transpose is") +print(x.transpose()) +``` + +#### Identity Matrix + +In linear algebra, one particular matrix acts very similarly to how 1 behaves for scalar numbers. + +This matrix is known as the *identity matrix* and is given by + +$$ +I = \begin{bmatrix} 1 & 0 & 0 & \dots & 0 \\ + 0 & 1 & 0 & \dots & 0 \\ + \vdots & \vdots & \ddots & \vdots & \vdots \\ + 0 & 0 & 0 & \dots & 1 + \end{bmatrix} +$$ + +As seen above, it has 1s on the diagonal and 0s everywhere else. + +When we multiply any matrix or vector by the identity matrix, we get the original matrix or vector +back! + +Let's see some examples. + +```{code-cell} python +I = np.eye(3) +x = np.reshape(np.arange(9), (3, 3)) +y = np.array([1, 2, 3]) + +print("I @ x", "\n", I @ x) +print("x @ I", "\n", x @ I) +print("I @ y", "\n", I @ y) +print("y @ I", "\n", y @ I) +``` + +#### Inverse + +If you recall, you learned in your primary education about solving equations for certain variables. + +For example, you might have been given the equation + +$$ +3x + 7 = 16 +$$ + +and then asked to solve for $x$. + +You probably did this by subtracting 7 and then dividing by 3. + +Now let's write an equation that contains matrices and vectors. + +$$ +\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix} +$$ + +How would we solve for $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$? + +Unfortunately, there is no "matrix divide" operation that does the opposite of matrix multiplication. + +Instead, we first have to do what's known as finding the inverse. We must multiply both sides by this inverse to solve. + +Consider some matrix $A$. + +The inverse of $A$, given by $A^{-1}$, is a matrix such that $A A^{-1} = I$ +where $I$ is our identity matrix. + +Notice in our equation above, if we can find the inverse of +$\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}$ then we can multiply both sides by the inverse +to get + +$$ +\begin{align*} +\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 3 \\ 4 \end{bmatrix} \\ +I \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 4 \end{bmatrix} \\ + \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 4 \end{bmatrix} +\end{align*} +$$ + +Computing the inverse requires that a matrix be square and satisfy some other conditions +(non-singularity) that are beyond the scope of this lecture. + +We also skip the exact details of how this inverse is computed, but, if you are interested, +you can visit the +[QuantEcon Linear Algebra lecture](https://python.quantecon.org/linear_algebra.html) +for more details. + +We demonstrate how to compute the inverse with numpy below. 
+ +```{code-cell} python +# This is a square (N x N) non-singular matrix +A = np.array([[1, 2, 0], [3, 1, 0], [0, 1, 2]]) + +print("This is A inverse") + +print(np.linalg.inv(A)) + +print("Check that A @ A inverse is I") +print(np.linalg.inv(A) @ A) +``` + +## Portfolios + +In {doc}`control flow <../python_fundamentals/control_flow>`, we learned to value a stream of payoffs from a single +asset. + +In this section, we generalize this to value a portfolio of multiple assets, or an asset +that has easily separable components. + +Vectors and inner products give us a convenient way to organize and calculate these payoffs. + +### Static Payoffs + +As an example, consider a portfolio with 4 units of asset A, 2.5 units of asset B, and 8 units of +asset C. + +At a particular point in time, the assets pay $3$/unit of asset A, $5$/unit of B, and +$1.10$/unit of C. + +First, calculate the value of this portfolio directly with a sum. + +```{code-cell} python +4.0 * 3.0 + 2.5 * 5.0 + 8 * 1.1 +``` + +We can make this more convenient and general by using arrays for accounting, and then sum then in a +loop. + +```{code-cell} python +import numpy as np +x = np.array([4.0, 2.5, 8.0]) # portfolio units +y = np.array([3.0, 5.0, 1.1]) # payoffs +n = len(x) +p = 0.0 +for i in range(n): # i.e. 0, 1, 2 + p = p + x[i] * y[i] + +p +``` + +The above would have worked with `x` and `y` as `list` rather than `np.array`. + +Note that the general pattern above is the sum. + +$$ +p = \sum_{i=0}^{n-1} x_i y_i = x \cdot y +$$ + +This is an inner product as implemented by the `np.dot` function + +```{code-cell} python +np.dot(x, y) +``` + +This approach allows us to simultaneously price different portfolios by stacking them in a matrix and using the dot product. + +```{code-cell} python +y = np.array([3.0, 5.0, 1.1]) # payoffs +x1 = np.array([4.0, 2.5, 8.0]) # portfolio 1 +x2 = np.array([2.0, 1.5, 0.0]) # portfolio 2 +X = np.array((x1, x2)) + +# calculate with inner products +p1 = np.dot(X[0,:], y) +p2 = np.dot(X[1,:], y) +print("Calculating separately") +print([p1, p2]) + +# or with a matrix multiplication +print("Calculating with matrices") +P = X @ y +print(P) +``` + +### NPV of a Portfolio + +If a set of assets has payoffs over time, we can calculate the NPV of that portfolio in a similar way to the calculation in +{ref}`npv `. + +First, consider an example with an asset with claims to multiple streams of payoffs which are easily +separated. + +You are considering purchasing an oilfield with 2 oil wells, named `A` and `B` where + +- Both oilfields have a finite lifetime of 20 years. +- In oilfield `A`, you can extract 5 units in the first year, and production in each subsequent year + decreases by $20\%$ of the previous year so that + $x^A_0 = 5, x^A_1 = 0.8 \times 5, x^A_2 = 0.8^2 \times 5, \ldots$ +- In oilfield `B`, you can extract 2 units in the first year, but production only drops by + $10\%$ each year (i.e. $x^B_0 = 2, x^B_1 = 0.9 \times 2, x^B_2 = 0.9^2 \times 2, \ldots$ +- Future cash flows are discounted at a rate of $r = 0.05$ each year. +- The price for oil in both wells are normalized as $p_A = p_B = 1$. + +These traits can be separated so that the price you would be willing to pay is the sum of the two, where +we define $\gamma_A = 0.8, \gamma_B = 0.9$. 
+ +$$ +\begin{aligned} +V_A &= \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t p_A y^A_t = \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t (p_A \, x_{A0}\, \gamma_A^t)\\ +V_B &= \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t p_B y^B_t = \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t (p_B \, x_{B0}\, \gamma_B^t)\\ +V &= V_A + V_B +\end{aligned} +$$ + +Let's compute the value of each of these assets using the dot product. + +The first question to ask yourself is: "For which two vectors should I compute the dot product?" + +It turns out that this depends on which two vectors you'd like to create. + +One reasonable choice is presented in the code below. + +```{code-cell} python +# Depreciation of production rates +gamma_A = 0.80 +gamma_B = 0.90 + +# Interest rate discounting +r = 0.05 +discount = np.array([(1 / (1+r))**t for t in range(20)]) + +# Let's first create arrays that have the production of each oilfield +oil_A = 5 * np.array([gamma_A**t for t in range(20)]) +oil_B = 2 * np.array([gamma_B**t for t in range(20)]) +oilfields = np.array([oil_A, oil_B]) + +# Use matrix multiplication to get discounted sum of oilfield values and then sum +# the two values +Vs = oilfields @ discount + +print(f"The npv of oilfields is {Vs.sum()}") +``` + +Now consider the approximation where instead of the oilfields having a finite lifetime of 20 years, +we let them produce forever, i.e. $T = \infty$. + +With a little algebra, + +$$ +V_A = p_A \sum_{t=0}^{\infty}\left(\frac{1}{1 + r}\right)^t (x_{A0} \gamma_A^t) = x_{A0}\sum_{t=0}^{\infty}\left(\frac{\gamma_A}{1 + r}\right)^t +$$ + +And, using the infinite sum formula from {doc}`Control Flow <../python_fundamentals/control_flow>` (i.e. $\sum_{t=0}^{\infty}\beta^t = (1 - \beta)^{-1}$) + +$$ += \frac{p_A x_{A0}}{1 - \left(\gamma_A\frac{1}{1 + r} \right)} +$$ + +The $V_B$ is defined symmetrically. + +How different is this infinite horizon approximation from the $T = 20$ version, and why? + +Now, let's compute the $T = \infty$ version of the net present value and make a graph to help +us see how many periods are needed to approach the infinite horizon value. + +```{code-cell} python +# Depreciation of production rates +gamma_A = 0.80 +gamma_B = 0.90 + +# Interest rate discounting +r = 0.05 + + +def infhor_NPV_oilfield(starting_output, gamma, r): + beta = gamma / (1 + r) + return starting_output / (1 - beta) + + +def compute_NPV_oilfield(starting_output, gamma, r, T): + outputs = starting_output * np.array([gamma**t for t in range(T)]) + discount = np.array([(1 / (1+r))**t for t in range(T)]) + + npv = np.dot(outputs, discount) + + return npv + +Ts = np.arange(2, 75) + +NPVs_A = np.array([compute_NPV_oilfield(5, gamma_A, r, t) for t in Ts]) +NPVs_B = np.array([compute_NPV_oilfield(2, gamma_B, r, t) for t in Ts]) + +NPVs_T = NPVs_A + NPVs_B +NPV_oo = infhor_NPV_oilfield(5, gamma_A, r) + infhor_NPV_oilfield(2, gamma_B, r) + +fig, ax = plt.subplots() + +ax.set_title("NPV with Varying T") +ax.set_ylabel("NPV") + +ax.plot(Ts, NPVs_A + NPVs_B) +ax.hlines(NPV_oo, Ts[0], Ts[-1], color="k", linestyle="--") # Plot infinite horizon value + +ax.spines["right"].set_visible(False) +ax.spines["top"].set_visible(False) +``` + +It is also worth noting that the computation of the infinite horizon net present value can be +simplified even further by using matrix multiplication. 
That is, the formula given above is +equivalent to + +$$ +V = \begin{bmatrix}p_A & p_B \end{bmatrix} \cdot \sum_{t=0}^{\infty} \left(\left(\frac{1}{1 + r}\right)^t \begin{bmatrix} \gamma_A & 0 \\ 0 & \gamma_B \end{bmatrix}^t \cdot x_0\right) +$$ + +and where $x_0 = \begin{bmatrix} x_{A0} \\ x_{B0} \end{bmatrix}$. + +We recognize that this equation is of the form + +$$ +V = G \sum_{t=0}^{\infty} \left(\frac{1}{1 + r}\right)^t A^t x_0 +$$ + +Without proof, and given important assumptions on $\frac{1}{1 + r}$ and $A$, this +equation reduces to + +```{math} +:label: eq_deterministic_asset_pricing + +V = G \left(I - \frac{1}{1+r} A\right)^{-1} x_0 +``` + +Using the matrix inverse, where `I` is the identity matrix. + +```{code-cell} python +p_A = 1.0 +p_B = 1.0 +G = np.array([p_A, p_B]) + +r = 0.05 +beta = 1 / (1 + r) + +gamma_A = 0.80 +gamma_B = 0.90 +A = np.array([[gamma_A, 0], [0, gamma_B]]) + +x_0 = np.array([5, 2]) + +# Compute with matrix formula +NPV_mf = G @ np.linalg.inv(np.eye(2) - beta*A) @ x_0 + +print(NPV_mf) +``` + +Note: While our matrix above was very simple, this approach works for much more +complicated `A` matrices as long as we can write $x_t$ using $A$ and $x_0$ as +$x_t = A^t x_0$ (For an advanced description of this topic, adding randomness, read about +linear state-space models with Python ). + +### Unemployment Dynamics + +Consider an economy where in any given year, $\alpha = 5\%$ of workers lose their jobs and +$\phi = 10\%$ of unemployed workers find jobs. + +Define the vector $x_0 = \begin{bmatrix} 900,000 & 100,000 \end{bmatrix}$ as the number of +employed and unemployed workers (respectively) at time $0$ in the economy. + +Our goal is to determine the dynamics of unemployment in this economy. + +First, let's define the matrix. + +$$ +A = \begin{bmatrix} 1 - \alpha & \alpha \\ \phi & 1 - \phi \end{bmatrix} +$$ + +Note that with this definition, we can describe the evolution of employment and unemployment +from $x_0$ to $x_1$ using linear algebra. + +$$ +x_1 = \begin{bmatrix} (1 - \alpha) 900,000 + \phi 100,000 \\ \alpha 900,000 + (1-\phi) 100,000\end{bmatrix} = A' x_0 +$$ + +However, since the transitions do not change over time, we can use this to describe the evolution +from any arbitrary time $t$, so that + +$$ +x_{t+1} = A' x_t +$$ + +Let's code up a python function that will let us track the evolution of unemployment over time. + +```{code-cell} python +phi = 0.1 +alpha = 0.05 + +x0 = np.array([900_000, 100_000]) + +A = np.array([[1-alpha, alpha], [phi, 1-phi]]) + +def simulate(x0, A, T=10): + """ + Simulate the dynamics of unemployment for T periods starting from x0 + and using values of A for probabilities of moving between employment + and unemployment + """ + nX = x0.shape[0] + out = np.zeros((T, nX)) + out[0, :] = x0 + + for t in range(1, T): + out[t, :] = A.T @ out[t-1, :] + + return out +``` + +Let's use this function to plot unemployment and employment levels for 10 periods. + +```{code-cell} python +def plot_simulation(x0, A, T=100): + X = simulate(x0, A, T) + fig, ax = plt.subplots() + ax.plot(X[:, 0]) + ax.plot(X[:, 1]) + ax.set_xlabel("t") + ax.legend(["Employed", "Unemployed"]) + return ax + +plot_simulation(x0, A, 50) +``` + +Notice that the levels of unemployed an employed workers seem to be heading to constant numbers. + +We refer to this phenomenon as *convergence* because the values appear to converge to a constant +number. + +Let's check that the values are permanently converging. 
+ +```{code-cell} python +plot_simulation(x0, A, 5000) +``` + +The convergence of this system is a property determined by the matrix $A$. + +The long-run distribution of employed and unemployed workers is equal to the largest [eigenvector](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) +of $A'$, corresponding to the eigenvalue equal to 1. An eigenvalue of $A'$ is also known as a "left-eigenvector" of A. + +Let's have numpy compute the eigenvalues and eigenvectors and compare the results to our simulated results above: + +```{code-cell} python +eigvals, eigvecs = np.linalg.eig(A.T) +for i in range(len(eigvals)): + if eigvals[i] == 1: + which_eig = i + break + +print(f"We are looking for eigenvalue {which_eig}") +``` + +Now let's look at the corresponding eigenvector: + +```{code-cell} python +dist = eigvecs[:, which_eig] + +# need to divide by sum so it adds to 1 +dist /= dist.sum() + +print(f"The distribution of workers is given by {dist}") +``` + + +````{admonition} Exercise +:name: dir3-3-3 + +See exercise 3 in the {ref}`exercise list `. +```` + +(ex3-3)= +## Exercises + +### Exercise 1 + +Alice is a stock broker who owns two types of assets: A and B. She owns 100 +units of asset A and 50 units of asset B. The current interest rate is 5%. +Each of the A assets have a remaining duration of 6 years and pay +\$1500 each year, while each of the B assets have a remaining duration +of 4 years and pay \$500 each year. Alice would like to retire if she +can sell her assets for more than \$500,000. Use vector addition, scalar +multiplication, and dot products to determine whether she can retire. + +({ref}`back to text `) + +### Exercise 2 + +Which of the following operations will work and which will +create errors because of size issues? + +Test out your intuitions in the code cell below + +```{code-block} python +x1 @ x2 +x2 @ x1 +x2 @ x3 +x3 @ x2 +x1 @ x3 +x4 @ y1 +x4 @ y2 +y1 @ x4 +y2 @ x4 +``` + +```{code-cell} python +# testing area +``` + +({ref}`back to text `) + +### Exercise 3 + +Compare the distribution above to the final values of a long simulation. + +If you multiply the distribution by 1,000,000 (the number of workers), do you get (roughly) the same number as the simulation? + +```{code-cell} python +# your code here +``` + +({ref}`back to text `) diff --git a/lectures/scientific/index.md b/lectures/scientific/index.md index 2cf55671..d341f041 100644 --- a/lectures/scientific/index.md +++ b/lectures/scientific/index.md @@ -1,46 +1,46 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# Scientific Computing - -This section discusses several key aspects of scientific computing that enable modern economics, data science, and statistics. - -As the size of our data and the complexity of our models have increased (and continue doing so), we have become more reliant on computers to perform computations that we simply cannot do by hand. - -In this section, we will cover - -- Python's main numerical library numpy and how to work with its array type. -- A basic introduction to visualizing data with matplotlib. -- A refresher on some key linear algebra concepts. -- A review of basic probability concepts and how to use simulation in learning economics. -- Using a computer to perform optimization. - -Many of the tools learned in this section will continue to show up throughout the -{doc}`pandas <../pandas/index>` and {doc}`applications <../applications/index>` sections. 
- -```{warning} -This section has more formal math than the previous material (and there will be more -math as you cover certain methods). - -We expect that students' mathematical backgrounds will range widely, so for those who have slightly less preparation, please don't let this scare you. - -We have found that although understanding these tools will require some extra effort, it will give you a leg up in almost any career you might consider. -``` - -## [Introduction to Numpy](../scientific/numpy_arrays.md) - -## [Plotting](../scientific/plotting.md) - -## [Applied Linear Algebra](../scientific/applied_linalg.md) - -## [Randomness](../scientific/randomness.md) - +--- +jupytext: + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# Scientific Computing + +This section discusses several key aspects of scientific computing that enable modern economics, data science, and statistics. + +As the size of our data and the complexity of our models have increased (and continue doing so), we have become more reliant on computers to perform computations that we simply cannot do by hand. + +In this section, we will cover + +- Python's main numerical library numpy and how to work with its array type. +- A basic introduction to visualizing data with matplotlib. +- A refresher on some key linear algebra concepts. +- A review of basic probability concepts and how to use simulation in learning economics. +- Using a computer to perform optimization. + +Many of the tools learned in this section will continue to show up throughout the +{doc}`pandas <../pandas/index>` and {doc}`applications <../applications/index>` sections. + +```{warning} +This section has more formal math than the previous material (and there will be more +math as you cover certain methods). + +We expect that students' mathematical backgrounds will range widely, so for those who have slightly less preparation, please don't let this scare you. + +We have found that although understanding these tools will require some extra effort, it will give you a leg up in almost any career you might consider. +``` + +## [Introduction to Numpy](../scientific/numpy_arrays.md) + +## [Plotting](../scientific/plotting.md) + +## [Applied Linear Algebra](../scientific/applied_linalg.md) + +## [Randomness](../scientific/randomness.md) + ## [Optimization](../scientific/optimization.md) \ No newline at end of file diff --git a/lectures/scientific/numpy_arrays.md b/lectures/scientific/numpy_arrays.md index a01addc6..65883f80 100644 --- a/lectures/scientific/numpy_arrays.md +++ b/lectures/scientific/numpy_arrays.md @@ -1,552 +1,552 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# Introduction to Numpy - -**Prerequisites** - -- {doc}`Python Fundamentals <../python_fundamentals/index>` - -**Outcomes** - -- Understand basics about numpy arrays -- Index into multi-dimensional arrays -- Use universal functions/broadcasting to do element-wise operations on arrays - - -## Numpy Arrays - -Now that we have learned the fundamentals of programming in Python, we will learn how we can use Python -to perform the computations required in data science and economics. We call these the "scientific Python tools". - -The foundational library that helps us perform these computations is known as `numpy` (numerical -Python). - -Numpy's core contribution is a new data-type called an *array*. 
- -An array is similar to a list, but numpy imposes some additional restrictions on how the data inside is organized. - -These restrictions allow numpy to - -1. Be more efficient in performing mathematical and scientific computations. -1. Expose functions that allow numpy to do the necessary linear algebra for machine learning and statistics. - -Before we get started, please note that the convention for importing the numpy package is to use the -nickname `np`: - -```{code-cell} python -import numpy as np -``` - -### What is an Array? - -An array is a multi-dimensional grid of values. - -What does this mean? It is easier to demonstrate than to explain. - -In this block of code, we build a 1-dimensional array. - -```{code-cell} python -# create an array from a list -x_1d = np.array([1, 2, 3]) -print(x_1d) -``` - -You can think of a 1-dimensional array as a list of numbers. - -```{code-cell} python -# We can index like we did with lists -print(x_1d[0]) -print(x_1d[0:2]) -``` - -Note that the range of indices does not include the end-point, that -is - -```{code-cell} python -print(x_1d[0:3] == x_1d[:]) -print(x_1d[0:2]) -``` - -The differences emerge as we move into higher dimensions. - -Next, we define a 2-dimensional array (a matrix) - -```{code-cell} python -x_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) -print(x_2d) -``` - -Notice that the data is no longer represented as something flat, but rather, -as three rows and three columns of numbers. - -The first question that you might ask yourself is: "how do I access the values in this array?" - -You access each element by specifying a row first and then a column. For -example, if we wanted to access the `6`, we would ask for the (1, 2) element. - -```{code-cell} python -print(x_2d[1, 2]) # Indexing into two dimensions! -``` - -Or to get the top left corner... - -```{code-cell} python -print(x_2d[0, 0]) # Indexing into two dimensions! -``` - -To get the first, and then second rows... - -```{code-cell} python -print(x_2d[0, :]) -print(x_2d[1, :]) -``` - -Or the columns... - -```{code-cell} python -print(x_2d[:, 0]) -print(x_2d[:, 1]) -``` - -This continues to generalize, since numpy gives us as many dimensions as we want in an array. - -For example, we build a 3-dimensional array below. - -```{code-cell} python -x_3d_list = [[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]] -x_3d = np.array(x_3d_list) -print(x_3d) -``` - -#### Array Indexing - -Now that there are multiple dimensions, indexing might feel somewhat non-obvious. - -Do the rows or columns come first? In higher dimensions, what is the order of -the index? - -Notice that the array is built using a list of lists (you could also use tuples!). - -Indexing into the array will correspond to choosing elements from each list. - -First, notice that the dimensions give two stacked matrices, which we can access with - -```{code-cell} python -print(x_3d[0]) -print(x_3d[1]) -``` - -In the case of the first, it is synonymous with - -```{code-cell} python -print(x_3d[0, :, :]) -``` - -Let's work through another example to further clarify this concept with our -3-dimensional array. - -Our goal will be to find the index that retrieves the `4` out of `x_3d`. - -Recall that when we created `x_3d`, we used the list `[[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]]`. - -Notice that the 0 element of that list is `[[1, 2, 3], [4, 5, 6]]`. This is the -list that contains the `4` so the first index we would use is a 0. 
- -```{code-cell} python -print(f"The 0 element is {x_3d_list[0]}") -print(f"The 1 element is {x_3d_list[1]}") -``` - -We then move to the next lists which were the 0 element of the inner-most dimension. Notice that -the two lists at this level `[1, 2, 3]` and `[3, 4, 5]`. - -The 4 is in the second 1 element (index `1`), so the second index we would choose is 1. - -```{code-cell} python -print(f"The 0 element of the 0 element is {x_3d_list[0][0]}") -print(f"The 1 element of the 0 element is {x_3d_list[0][1]}") -``` - -Finally, we move to the outer-most dimension, which has a list of numbers -`[4, 5, 6]`. - -The 4 is element 0 of this list, so the third, or outer-most index, would be `0`. - -```{code-cell} python -print(f"The 0 element of the 1 element of the 0 element is {x_3d_list[0][1][0]}") -``` - -Now we can use these same indices to index into the array. With an array, we can index using a single operation rather than repeated indexing as we did with the list `x_3d_list[0][1][0]`. - -Let's test it to see whether we did it correctly! - -```{code-cell} python -print(x_3d[0, 1, 0]) -``` - -Success! - -````{admonition} Exercise -:name: dir3-1-1 - -See exercise 1 in the {ref}`exercise list `. -```` - -````{admonition} Exercise -:name: dir3-1-2 - -See exercise 2 in the {ref}`exercise list `. -```` - -We can also select multiple elements at a time -- this is called slicing. - -If we wanted to have an array with just `[1, 2, 3]` then we would do - -```{code-cell} python -print(x_3d[0, 0, :]) -``` - -Notice that we put a `:` on the dimension where we want to select all of the elements. We can also -slice out subsets of the elements by doing `start:stop+1`. - -Notice how the following arrays differ. - -```{code-cell} python -print(x_3d[:, 0, :]) -print(x_3d[:, 0, 0:2]) -print(x_3d[:, 0, :2]) # the 0 in 0:2 is optional -``` - -````{admonition} Exercise -:name: dir3-1-3 - -See exercise 3 in the {ref}`exercise list `. -```` - - -### Array Functionality - -#### Array Properties - -All numpy arrays have various useful properties. - -Properties are similar to methods in that they're accessed through -the "dot notation." However, they aren't a function so we don't need parentheses. - -The two most frequently used properties are `shape` and `dtype`. - -`shape` tells us how many elements are in each array dimension. - -`dtype` tells us the types of an array's elements. - -Let's do some examples to see these properties in action. - -```{code-cell} python -x = np.array([[1, 2, 3], [4, 5, 6]]) -print(x.shape) -print(x.dtype) -``` - -We'll use this to practice unpacking a tuple, like `x.shape`, directly into variables. - -```{code-cell} python -rows, columns = x.shape -print(f"rows = {rows}, columns = {columns}") -``` - -```{code-cell} python -x = np.array([True, False, True]) -print(x.shape) -print(x.dtype) -``` - -Note that in the above, the `(3,)` represents a tuple of length 1, distinct from a scalar integer `3`. - -```{code-cell} python -x = np.array([ - [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], - [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]] -]) -print(x.shape) -print(x.dtype) -``` - -#### Creating Arrays - -It's usually impractical to define arrays by hand as we have done so far. - -We'll often need to create an array with default values and then fill it -with other values. - -We can create arrays with the functions `np.zeros` and `np.ones`. - -Both functions take a tuple that denotes the shape of an array and creates an -array filled with 0s or 1s respectively. 
- -```{code-cell} python -sizes = (2, 3, 4) -x = np.zeros(sizes) # note, a tuple! -x -``` - -```{code-cell} python -y = np.ones((4)) -y -``` - -#### Broadcasting Operations - -Two types of operations that will be useful for arrays of any dimension are: - -1. Operations between an array and a single number. -1. Operations between two arrays of the same shape. - -When we perform operations on an array by using a single number, we simply apply that operation to every element of the array. - -```{code-cell} python -# Using np.ones to create an array -x = np.ones((2, 2)) -print("x = ", x) -print("2 + x = ", 2 + x) -print("2 - x = ", 2 - x) -print("2 * x = ", 2 * x) -print("x / 2 = ", x / 2) -``` - - -````{admonition} Exercise -:name: dir3-1-4 - -See exercise 4 in the {ref}`exercise list `. -```` - -Operations between two arrays of the same size, in this case `(2, 2)`, simply apply the operation -element-wise between the arrays. - -```{code-cell} python -x = np.array([[1.0, 2.0], [3.0, 4.0]]) -y = np.ones((2, 2)) -print("x = ", x) -print("y = ", y) -print("x + y = ", x + y) -print("x - y", x - y) -print("(elementwise) x * y = ", x * y) -print("(elementwise) x / y = ", x / y) -``` - -### Universal Functions - -We will often need to transform data by applying a function to every element of an array. - -Numpy has good support for these operations, called *universal functions* or ufuncs for short. - -The -[numpy documentation](https://docs.scipy.org/doc/numpy/reference/ufuncs.html?highlight=ufunc#available-ufuncs) -has a list of all available ufuncs. - -```{note} -You should think of operations between a single number and an array, as we -just saw, as a ufunc. -``` - -Below, we will create an array that contains 10 points between 0 and 25. - -```{code-cell} python -# This is similar to range -- but spits out 50 evenly spaced points from 0.5 -# to 25. -x = np.linspace(0.5, 25, 10) -``` - -We will experiment with some ufuncs below: - -```{code-cell} python -# Applies the sin function to each element of x -np.sin(x) -``` - -Of course, we could do the same thing with a comprehension, but -the code would be both less readable and less efficient. - -```{code-cell} python -np.array([np.sin(xval) for xval in x]) -``` - -You can use the inspector or the docstrings with `np.` to see other available functions, such as - -```{code-cell} python -# Takes log of each element of x -np.log(x) -``` - -A benefit of using the numpy arrays is that numpy has succinct code for combining vectorized operations. - -```{code-cell} python -# Calculate log(z) * z elementwise -z = np.array([1,2,3]) -np.log(z) * z -``` - -````{admonition} Exercise -:name: dir3-1-5 - -See exercise 5 in the {ref}`exercise list `. -```` - -### Other Useful Array Operations - -We have barely scratched the surface of what is possible using numpy arrays. - -We hope you will experiment with other functions from numpy and see how they -work. - -Below, we demonstrate a few more array operations that we find most useful -- just to give you an idea -of what else you might find. - -When you're attempting to do an operation that you feel should be common, the numpy library probably has it. - -Use Google and tab completion to check this. - -```{code-cell} python -x = np.linspace(0, 25, 10) -``` - -```{code-cell} python -np.mean(x) -``` - -```{code-cell} python -np.std(x) -``` - -```{code-cell} python -# np.min, np.median, etc... 
are also defined -np.max(x) -``` - -```{code-cell} python -np.diff(x) -``` - -```{code-cell} python -np.reshape(x, (5, 2)) -``` - -Note that many of these operations can be called as methods on `x`: - -```{code-cell} python -print(x.mean()) -print(x.std()) -print(x.max()) -# print(x.diff()) # this one is not a method... -print(x.reshape((5, 2))) -``` - -Finally, `np.vectorize` can be conveniently used with numpy broadcasting and any functions. - -```{code-cell} python -np.random.seed(42) -x = np.random.rand(10) -print(x) - -def f(val): - if val < 0.3: - return "low" - else: - return "high" - -print(f(0.1)) # scalar, no problem -# f(x) # array, fails since f() is scalar -f_vec = np.vectorize(f) -print(f_vec(x)) -``` - -Caution: `np.vectorize` is convenient for numpy broadcasting with any function -but is not intended to be high performance. - -When speed matters, directly write a `f` function to work on arrays. - -(ex3-1)= -## Exercises - -### Exercise 1 - -Try indexing into another element of your choice from the -3-dimensional array. - -Building an understanding of indexing means working through this -type of operation several times -- without skipping steps! - -({ref}`back to text `) - -### Exercise 2 - -Look at the 2-dimensional array `x_2d`. - -Does the inner-most index correspond to rows or columns? What does the -outer-most index correspond to? - -Write your thoughts. - -({ref}`back to text `) - -### Exercise 3 - -What would you do to extract the array `[[5, 6], [50, 60]]`? - -({ref}`back to text `) - -### Exercise 4 - -Do you recall what multiplication by an integer did for lists? - -How does this differ? - -({ref}`back to text `) - -### Exercise 5 - -Let's revisit a bond pricing example we saw in {doc}`Control flow <../python_fundamentals/control_flow>`. - -Recall that the equation for pricing a bond with coupon payment $C$, -face value $M$, yield to maturity $i$, and periods to maturity -$N$ is - -$$ -\begin{align*} - P &= \left(\sum_{n=1}^N \frac{C}{(i+1)^n}\right) + \frac{M}{(1+i)^N} \\ - &= C \left(\frac{1 - (1+i)^{-N}}{i} \right) + M(1+i)^{-N} -\end{align*} -$$ - -In the code cell below, we have defined variables for `i`, `M` and `C`. - -You have two tasks: - -1. Define a numpy array `N` that contains all maturities between 1 and 10 - - ```{hint} - look at the `np.arange` function. - ``` - -1. Using the equation above, determine the bond prices of all maturity levels in your array. - -```{code-cell} python -i = 0.03 -M = 100 -C = 5 - -# Define array here - -# price bonds here -``` - +--- +jupytext: + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# Introduction to Numpy + +**Prerequisites** + +- {doc}`Python Fundamentals <../python_fundamentals/index>` + +**Outcomes** + +- Understand basics about numpy arrays +- Index into multi-dimensional arrays +- Use universal functions/broadcasting to do element-wise operations on arrays + + +## Numpy Arrays + +Now that we have learned the fundamentals of programming in Python, we will learn how we can use Python +to perform the computations required in data science and economics. We call these the "scientific Python tools". + +The foundational library that helps us perform these computations is known as `numpy` (numerical +Python). + +Numpy's core contribution is a new data-type called an *array*. + +An array is similar to a list, but numpy imposes some additional restrictions on how the data inside is organized. 
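+
+For example, one of those restrictions is that every element of an array must share a single data type, while a plain Python list will happily mix integers, strings, and floats. The short illustration below is an aside (we import numpy here just for this peek ahead; the import convention is introduced formally in a moment):
+
+```{code-cell} python
+import numpy as np
+
+mixed_list = [1, "two", 3.0]        # a list can mix types freely
+print([type(item) for item in mixed_list])
+
+coerced = np.array([1, 2, 3.5])     # numpy converts everything to one common type
+print(coerced.dtype)                # float64 -- the integers were upcast to floats
+```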
+ +These restrictions allow numpy to + +1. Be more efficient in performing mathematical and scientific computations. +1. Expose functions that allow numpy to do the necessary linear algebra for machine learning and statistics. + +Before we get started, please note that the convention for importing the numpy package is to use the +nickname `np`: + +```{code-cell} python +import numpy as np +``` + +### What is an Array? + +An array is a multi-dimensional grid of values. + +What does this mean? It is easier to demonstrate than to explain. + +In this block of code, we build a 1-dimensional array. + +```{code-cell} python +# create an array from a list +x_1d = np.array([1, 2, 3]) +print(x_1d) +``` + +You can think of a 1-dimensional array as a list of numbers. + +```{code-cell} python +# We can index like we did with lists +print(x_1d[0]) +print(x_1d[0:2]) +``` + +Note that the range of indices does not include the end-point, that +is + +```{code-cell} python +print(x_1d[0:3] == x_1d[:]) +print(x_1d[0:2]) +``` + +The differences emerge as we move into higher dimensions. + +Next, we define a 2-dimensional array (a matrix) + +```{code-cell} python +x_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) +print(x_2d) +``` + +Notice that the data is no longer represented as something flat, but rather, +as three rows and three columns of numbers. + +The first question that you might ask yourself is: "how do I access the values in this array?" + +You access each element by specifying a row first and then a column. For +example, if we wanted to access the `6`, we would ask for the (1, 2) element. + +```{code-cell} python +print(x_2d[1, 2]) # Indexing into two dimensions! +``` + +Or to get the top left corner... + +```{code-cell} python +print(x_2d[0, 0]) # Indexing into two dimensions! +``` + +To get the first, and then second rows... + +```{code-cell} python +print(x_2d[0, :]) +print(x_2d[1, :]) +``` + +Or the columns... + +```{code-cell} python +print(x_2d[:, 0]) +print(x_2d[:, 1]) +``` + +This continues to generalize, since numpy gives us as many dimensions as we want in an array. + +For example, we build a 3-dimensional array below. + +```{code-cell} python +x_3d_list = [[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]] +x_3d = np.array(x_3d_list) +print(x_3d) +``` + +#### Array Indexing + +Now that there are multiple dimensions, indexing might feel somewhat non-obvious. + +Do the rows or columns come first? In higher dimensions, what is the order of +the index? + +Notice that the array is built using a list of lists (you could also use tuples!). + +Indexing into the array will correspond to choosing elements from each list. + +First, notice that the dimensions give two stacked matrices, which we can access with + +```{code-cell} python +print(x_3d[0]) +print(x_3d[1]) +``` + +In the case of the first, it is synonymous with + +```{code-cell} python +print(x_3d[0, :, :]) +``` + +Let's work through another example to further clarify this concept with our +3-dimensional array. + +Our goal will be to find the index that retrieves the `4` out of `x_3d`. + +Recall that when we created `x_3d`, we used the list `[[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]]`. + +Notice that the 0 element of that list is `[[1, 2, 3], [4, 5, 6]]`. This is the +list that contains the `4` so the first index we would use is a 0. 
+
+```{code-cell} python
+print(f"The 0 element is {x_3d_list[0]}")
+print(f"The 1 element is {x_3d_list[1]}")
+```
+
+We then move one level deeper, to the two lists inside the element we just chose. Notice that
+the two lists at this level are `[1, 2, 3]` and `[4, 5, 6]`.
+
+The `4` is in the second of these lists (index `1`), so the second index we would choose is 1.
+
+```{code-cell} python
+print(f"The 0 element of the 0 element is {x_3d_list[0][0]}")
+print(f"The 1 element of the 0 element is {x_3d_list[0][1]}")
+```
+
+Finally, we move to the inner-most dimension, which is the list of numbers
+`[4, 5, 6]`.
+
+The `4` is element 0 of this list, so the third, or inner-most, index would be `0`.
+
+```{code-cell} python
+print(f"The 0 element of the 1 element of the 0 element is {x_3d_list[0][1][0]}")
+```
+
+Now we can use these same indices to index into the array. With an array, we can index using a single operation rather than repeated indexing as we did with the list `x_3d_list[0][1][0]`.
+
+Let's test it to see whether we did it correctly!
+
+```{code-cell} python
+print(x_3d[0, 1, 0])
+```
+
+Success!
+
+````{admonition} Exercise
+:name: dir3-1-1
+
+See exercise 1 in the {ref}`exercise list `.
+````
+
+````{admonition} Exercise
+:name: dir3-1-2
+
+See exercise 2 in the {ref}`exercise list `.
+````
+
+We can also select multiple elements at a time -- this is called slicing.
+
+If we wanted to have an array with just `[1, 2, 3]` then we would do
+
+```{code-cell} python
+print(x_3d[0, 0, :])
+```
+
+Notice that we put a `:` on the dimension where we want to select all of the elements. We can also
+slice out subsets of the elements by doing `start:stop+1`.
+
+Notice how the following arrays differ.
+
+```{code-cell} python
+print(x_3d[:, 0, :])
+print(x_3d[:, 0, 0:2])
+print(x_3d[:, 0, :2]) # the 0 in 0:2 is optional
+```
+
+````{admonition} Exercise
+:name: dir3-1-3
+
+See exercise 3 in the {ref}`exercise list `.
+````
+
+
+### Array Functionality
+
+#### Array Properties
+
+All numpy arrays have various useful properties.
+
+Properties are similar to methods in that they're accessed through
+the "dot notation." However, they aren't functions, so we don't need parentheses.
+
+The two most frequently used properties are `shape` and `dtype`.
+
+`shape` tells us how many elements are in each array dimension.
+
+`dtype` tells us the types of an array's elements.
+
+Let's do some examples to see these properties in action.
+
+```{code-cell} python
+x = np.array([[1, 2, 3], [4, 5, 6]])
+print(x.shape)
+print(x.dtype)
+```
+
+We'll use this to practice unpacking a tuple, like `x.shape`, directly into variables.
+
+```{code-cell} python
+rows, columns = x.shape
+print(f"rows = {rows}, columns = {columns}")
+```
+
+```{code-cell} python
+x = np.array([True, False, True])
+print(x.shape)
+print(x.dtype)
+```
+
+Note that in the above, the `(3,)` represents a tuple of length 1, distinct from a scalar integer `3`.
+
+```{code-cell} python
+x = np.array([
+    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
+    [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
+])
+print(x.shape)
+print(x.dtype)
+```
+
+#### Creating Arrays
+
+It's usually impractical to define arrays by hand as we have done so far.
+
+We'll often need to create an array with default values and then fill it
+with other values.
+
+We can create arrays with the functions `np.zeros` and `np.ones`.
+
+Both functions take a tuple that denotes the shape of an array and create an
+array filled with 0s or 1s, respectively.
+ +```{code-cell} python +sizes = (2, 3, 4) +x = np.zeros(sizes) # note, a tuple! +x +``` + +```{code-cell} python +y = np.ones((4)) +y +``` + +#### Broadcasting Operations + +Two types of operations that will be useful for arrays of any dimension are: + +1. Operations between an array and a single number. +1. Operations between two arrays of the same shape. + +When we perform operations on an array by using a single number, we simply apply that operation to every element of the array. + +```{code-cell} python +# Using np.ones to create an array +x = np.ones((2, 2)) +print("x = ", x) +print("2 + x = ", 2 + x) +print("2 - x = ", 2 - x) +print("2 * x = ", 2 * x) +print("x / 2 = ", x / 2) +``` + + +````{admonition} Exercise +:name: dir3-1-4 + +See exercise 4 in the {ref}`exercise list `. +```` + +Operations between two arrays of the same size, in this case `(2, 2)`, simply apply the operation +element-wise between the arrays. + +```{code-cell} python +x = np.array([[1.0, 2.0], [3.0, 4.0]]) +y = np.ones((2, 2)) +print("x = ", x) +print("y = ", y) +print("x + y = ", x + y) +print("x - y", x - y) +print("(elementwise) x * y = ", x * y) +print("(elementwise) x / y = ", x / y) +``` + +### Universal Functions + +We will often need to transform data by applying a function to every element of an array. + +Numpy has good support for these operations, called *universal functions* or ufuncs for short. + +The +[numpy documentation](https://docs.scipy.org/doc/numpy/reference/ufuncs.html?highlight=ufunc#available-ufuncs) +has a list of all available ufuncs. + +```{note} +You should think of operations between a single number and an array, as we +just saw, as a ufunc. +``` + +Below, we will create an array that contains 10 points between 0 and 25. + +```{code-cell} python +# This is similar to range -- but spits out 50 evenly spaced points from 0.5 +# to 25. +x = np.linspace(0.5, 25, 10) +``` + +We will experiment with some ufuncs below: + +```{code-cell} python +# Applies the sin function to each element of x +np.sin(x) +``` + +Of course, we could do the same thing with a comprehension, but +the code would be both less readable and less efficient. + +```{code-cell} python +np.array([np.sin(xval) for xval in x]) +``` + +You can use the inspector or the docstrings with `np.` to see other available functions, such as + +```{code-cell} python +# Takes log of each element of x +np.log(x) +``` + +A benefit of using the numpy arrays is that numpy has succinct code for combining vectorized operations. + +```{code-cell} python +# Calculate log(z) * z elementwise +z = np.array([1,2,3]) +np.log(z) * z +``` + +````{admonition} Exercise +:name: dir3-1-5 + +See exercise 5 in the {ref}`exercise list `. +```` + +### Other Useful Array Operations + +We have barely scratched the surface of what is possible using numpy arrays. + +We hope you will experiment with other functions from numpy and see how they +work. + +Below, we demonstrate a few more array operations that we find most useful -- just to give you an idea +of what else you might find. + +When you're attempting to do an operation that you feel should be common, the numpy library probably has it. + +Use Google and tab completion to check this. + +```{code-cell} python +x = np.linspace(0, 25, 10) +``` + +```{code-cell} python +np.mean(x) +``` + +```{code-cell} python +np.std(x) +``` + +```{code-cell} python +# np.min, np.median, etc... 
are also defined +np.max(x) +``` + +```{code-cell} python +np.diff(x) +``` + +```{code-cell} python +np.reshape(x, (5, 2)) +``` + +Note that many of these operations can be called as methods on `x`: + +```{code-cell} python +print(x.mean()) +print(x.std()) +print(x.max()) +# print(x.diff()) # this one is not a method... +print(x.reshape((5, 2))) +``` + +Finally, `np.vectorize` can be conveniently used with numpy broadcasting and any functions. + +```{code-cell} python +np.random.seed(42) +x = np.random.rand(10) +print(x) + +def f(val): + if val < 0.3: + return "low" + else: + return "high" + +print(f(0.1)) # scalar, no problem +# f(x) # array, fails since f() is scalar +f_vec = np.vectorize(f) +print(f_vec(x)) +``` + +Caution: `np.vectorize` is convenient for numpy broadcasting with any function +but is not intended to be high performance. + +When speed matters, directly write a `f` function to work on arrays. + +(ex3-1)= +## Exercises + +### Exercise 1 + +Try indexing into another element of your choice from the +3-dimensional array. + +Building an understanding of indexing means working through this +type of operation several times -- without skipping steps! + +({ref}`back to text `) + +### Exercise 2 + +Look at the 2-dimensional array `x_2d`. + +Does the inner-most index correspond to rows or columns? What does the +outer-most index correspond to? + +Write your thoughts. + +({ref}`back to text `) + +### Exercise 3 + +What would you do to extract the array `[[5, 6], [50, 60]]`? + +({ref}`back to text `) + +### Exercise 4 + +Do you recall what multiplication by an integer did for lists? + +How does this differ? + +({ref}`back to text `) + +### Exercise 5 + +Let's revisit a bond pricing example we saw in {doc}`Control flow <../python_fundamentals/control_flow>`. + +Recall that the equation for pricing a bond with coupon payment $C$, +face value $M$, yield to maturity $i$, and periods to maturity +$N$ is + +$$ +\begin{align*} + P &= \left(\sum_{n=1}^N \frac{C}{(i+1)^n}\right) + \frac{M}{(1+i)^N} \\ + &= C \left(\frac{1 - (1+i)^{-N}}{i} \right) + M(1+i)^{-N} +\end{align*} +$$ + +In the code cell below, we have defined variables for `i`, `M` and `C`. + +You have two tasks: + +1. Define a numpy array `N` that contains all maturities between 1 and 10 + + ```{hint} + look at the `np.arange` function. + ``` + +1. Using the equation above, determine the bond prices of all maturity levels in your array. + +```{code-cell} python +i = 0.03 +M = 100 +C = 5 + +# Define array here + +# price bonds here +``` + ({ref}`back to text `) \ No newline at end of file diff --git a/lectures/scientific/optimization.md b/lectures/scientific/optimization.md index 18f1016d..47a73ff7 100644 --- a/lectures/scientific/optimization.md +++ b/lectures/scientific/optimization.md @@ -1,464 +1,464 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# Optimization - -**Prerequisites** - -- {doc}`Introduction to Numpy ` -- {doc}`Applied Linear Algebra ` - -**Outcomes** - -- Perform optimization by hand using derivatives -- Understand ideas from gradient descent - - -```{literalinclude} ../_static/colab_light.raw -``` - -```{code-cell} python -# imports for later -import numpy as np -import matplotlib.pyplot as plt -%matplotlib inline -``` - -## What is Optimization? - -Optimization is the branch of mathematics focused on finding extreme values (max or min) of -functions. 
- -Optimization tools will appear in many places throughout this course, including: - -- Building economic models in which individuals make decisions that maximize their utility. -- Building statistical models and maximizing the fit of these models by optimizing certain fit - functions. - -In this lecture, we will focus mostly on the first to limit the moving pieces, but in other lectures, we'll discuss the second in detail. - -### Derivatives and Optima - -Here, we revisit some of the theory that you have already learned in your calculus class. - -Consider function $f(x)$ which maps a number into another number. We can say that any point -where $f'(x) = 0$ is a local extremum of $f$. - -Let's work through an example. Consider the function - -$$ -f(x) = x^4 - 3 x^2 -$$ - -Its derivative is given by - -$$ -\frac{\partial f}{\partial x} = 4 x^3 - 6 x -$$ - -Let's plot the function and its derivative to pick out the local extremum by hand. - -```{code-cell} python -def f(x): - return x**4 - 3*x**2 - - -def fp(x): - return 4*x**3 - 6*x - -# Create 100 evenly spaced points between -2 and 2 -x = np.linspace(-2., 2., 100) - -# Evaluate the functions at x values -fx = f(x) -fpx = fp(x) - -# Create plot -fig, ax = plt.subplots(1, 2) - -ax[0].plot(x, fx) -ax[0].set_title("Function") - -ax[1].plot(x, fpx) -ax[1].hlines(0.0, -2.5, 2.5, color="k", linestyle="--") -ax[1].set_title("Derivative") - -for _ax in ax: - _ax.spines["right"].set_visible(False) - _ax.spines["top"].set_visible(False) -``` - -If you stare at this picture, you can probably determine the the local maximum is at -$x = 0$ and the local minima at $x \approx -1$ and $x \approx 1$. - -To properly determine the minima and maxima, we find the solutions to $f'(x) = 0$ below: - -$$ -f'(x) = 4 x^3 - 6 x = 0 -$$ - -$$ -\rightarrow x = \left\{0, \frac{\sqrt{6}}{2}, \frac{-\sqrt{6}}{2} \right\} -$$ - -Let's check whether we can get the same answers with Python! To do this, we import a new -package that we haven't seen yet. - -```{code-cell} python -import scipy.optimize as opt -``` - -Then using the function definitions from earlier, we search for the minimum and maximum values. - -```{code-cell} python -# For a scalar problem, we give it the function and the bounds between -# which we want to search -neg_min = opt.minimize_scalar(f, [-2, -0.5]) -pos_min = opt.minimize_scalar(f, [0.5, 2.0]) -print("The negative minimum is: \n", neg_min) -print("The positive minimum is: \n", pos_min) -``` - -The scipy optimize package only has functions that find minimums... You might be wondering, then, how we -will verify our maximum value. - -It turns out that finding the maximum is equivalent to simply finding the minimum of the negative function. - -```{code-cell} python -# Create a function that evaluates to negative f -def neg_f(x): - return -f(x) - -max_out = opt.minimize_scalar(neg_f, [-0.35, 0.35]) -print("The maximum is: \n", max_out) -``` - -We won't dive into the details of optimization algorithms in this lecture, but we'll impart some brief -intuition to help you understand the types of problems these algorithms are good at solving and -the types of problems they will struggle with: - -The general intuition is that when you're finding a maximum, an algorithm takes a step -in the direction of the derivative... (Conversely, to find a minimum, the algorithm takes a step opposite the direction of the derivative.) -This requires the function to be relatively smooth and continuous. 
The algorithm also has an easier time if there is only one (or very few) extremum to be found... - -For minimization, you can imagine the algorithm as a marble in a bowl. - -The marble will keep rolling down the slope of the bowl until it finds the bottom. - -It may overshoot, but once it hits the slope on the other side, it will continue to roll back -and forth until it comes to rest. - -Thus, when deciding whether numerical optimization is an effective method for a -particular problem, you could try visualizing the function to determine whether a marble -would be able to come to rest at the extreme values you are looking for. - -### Application: Consumer Theory - -A common use of maximization in economics is to model -optimal consumption decisions . - -#### Preferences and Utility Functions - -To summarize introductory economics, take a set of -[preferences](https://en.wikipedia.org/wiki/Preference_%28economics%29) of consumers over "bundles" -of goods (e.g. 2 apples and 3 oranges is preferred to 3 apples and 2 oranges, or a 100% chance to -win $1$ dollar is preferred to a 50% chance to win $2.10$ dollars). - -Under certain assumptions, you rationalize the preferences as a utility function over the different -goods (always remembering that the utility is simply a tool to order preferences and the numbers are -usually not meaningful themselves). - -For example, consider a utility function over bundles of bananas (B) and apples (A) - -$$ -U(B, A) = B^{\alpha}A^{1-\alpha} -$$ - -Where $\alpha \in [0,1]$. - -First, let's take a look at this particular utility function. - -```{code-cell} python -def U(A, B, alpha=1/3): - return B**alpha * A**(1-alpha) - -fig, ax = plt.subplots() -B = 1.5 -A = np.linspace(1, 10, 100) -ax.plot(A, U(A, B)) -ax.set_xlabel("A") -ax.set_ylabel("U(B=1.5, A)") -``` - -We note that - -- $U(B,1)$ is always higher with more B, hence, consuming more bananas has a -: positive marginal utility i.e. $\frac{d U(B,1)}{d B} > 0$. -- The more bananas we consume, the smaller the change in marginal utility, i.e. - $\frac{d^2 U(B,1)}{d B^2} < 0$. - -If we plot both the $B$ and the $A$, we can see how the utility changes with different -bundles. - -```{code-cell} python -fig, ax = plt.subplots() -B = np.linspace(1, 20, 100).reshape((100, 1)) -contours = ax.contourf(A, B.flatten(), U(A, B)) -fig.colorbar(contours) -ax.set_xlabel("A") -ax.set_ylabel("B") -ax.set_title("U(A,B)") -``` - -We can find the bundles between which the consumer would be indifferent by fixing a -utility $\bar{U}$ and by determining all combinations of $A$ and $B$ where -$\bar{U} = U(B, A)$. - -In this example, we can implement this calculation by letting $B$ be the variable on the -x-axis and solving for $A(\bar{U}, B)$ - -$$ -A(B, \bar{U}) = U^{\frac{1}{1-\alpha}}B^{\frac{-\alpha}{1-\alpha}} -$$ - -```{code-cell} python -def A_indifference(B, ubar, alpha=1/3): - return ubar**(1/(1-alpha)) * B**(-alpha/(1-alpha)) - -def plot_indifference_curves(ax, alpha=1/3): - ubar = np.arange(1, 11, 2) - ax.plot(B, A_indifference(B, ubar, alpha)) - ax.legend([r"$\bar{U}$" + " = {}".format(i) for i in ubar]) - ax.set_xlabel("B") - ax.set_ylabel(r"$A(B, \bar{U}$)") - -fig, ax = plt.subplots() -plot_indifference_curves(ax) -``` - -Note that in every case, if you increase either the number of apples or bananas (holding the other -fixed), you reach a higher indifference curve. - -Consequently, in a world without scarcity or budgets, consumers would consume -an arbitrarily high number of both to maximize their utility. 
- -#### Budget Constraints - -While the above example plots consumer preferences, it says nothing about what the consumers can afford. - -The simplest sort of constraint is a budget constraint where bananas and apples both have a price -and the consumer has a limited amount of funds. - -If the prices per banana and per apple are identical, no matter how many you consume, then the -affordable bundles are simply all pairs of apples and bananas below the line. -$p_a A + p_b B \leq W$. - -For example, if consumer has a budget of $W$, the price of apples is $p_A = 2$ dollars per -apple, and the price of bananas is normalized to be $p_B = 1$ dollar per banana, then the consumer -can afford anything below the line. - -$$ -2 A + B \leq W -$$ - -Or, letting $W = 20$ and plotting - -```{code-cell} python -def A_bc(B, W=20, pa=2): - "Given B, W, and pa return the max amount of A our consumer can afford" - return (W - B) / pa - -def plot_budget_constraint(ax, W=20, pa=2): - B_bc = np.array([0, W]) - A = A_bc(B_bc, W, pa) - ax.plot(B_bc, A) - ax.fill_between(B_bc, 0, A, alpha=0.2) - ax.set_xlabel("B") - ax.set_ylabel("A") - return ax - -fig, ax = plt.subplots() -plot_budget_constraint(ax, 20, 2) -``` - -While the consumer can afford any of the bundles in that area, most will not be optimal. - -#### Optimal Choice - -Putting the budget constraints and the utility functions together lets us visualize the optimal -decision of a consumer. Choose the bundle with the highest possible indifference curve within its -budget set. - -```{code-cell} python -fig, ax = plt.subplots() -plot_indifference_curves(ax) -plot_budget_constraint(ax) -``` - -We have several ways to find the particular point $A, B$ of maximum utility, such as -finding the point where the indifference curve and the budget constraint have the same slope, but a -simple approach is to just solve the direct maximization problem. - -$$ -\begin{aligned} -\max_{A, B} & B^{\alpha}A^{1-\alpha}\\ -\text{s.t. } & p_A A + B \leq W -\end{aligned} -$$ - -Solving this problem directly requires solving a multi-dimensional constrained optimization problem, -where scipy -has several options. - -For this particular problem, we notice two things: (1) The utility function is increasing in both -$A$ and $B$, and (2) there are only 2 goods. - -This allows us 1) to assume that the budget constraint holds at equality, $p_a A + B = W$, 2) to -form a new function $A(B) = (W - B) / p_a$ by rearranging the budget constraint at equality, and -3) to substitute that function directly to form: - -$$ -\max_{B} B^{\alpha}A(B)^{1-\alpha} -$$ - -Compared to before, this problem has been turned into an unconstrained univariate optimization -problem. - -To implement this in code, notice that the $A(B)$ function is what we defined before -as `A_bc`. - -We will solve this by using the function `scipy.optimize.minimize_scalar`, which takes a function -`f(x)` and returns the value of `x` that minimizes `f`. 
- -```{code-cell} python -from scipy.optimize import minimize_scalar - -def objective(B, W=20, pa=2): - """ - Return value of -U for a given B, when we consume as much A as possible - - Note that we return -U because scipy wants to minimize functions, - and the value of B that minimizes -U will maximize U - """ - A = A_bc(B, W, pa) - return -U(A, B) - -result = minimize_scalar(objective) -optimal_B = result.x -optimal_A = A_bc(optimal_B, 20, 2) -optimal_U = U(optimal_A, optimal_B) - -print("The optimal U is ", optimal_U) -print("and was found at (A,B) =", (optimal_A, optimal_B)) -``` - -This allows us to do experiments, such as examining how consumption patterns change as prices or -wealth levels change. - -```{code-cell} python -# Create various prices -n_pa = 50 -prices_A = np.linspace(0.5, 5.0, n_pa) -W = 20 - -# Create lists to store the results of the optimal A and B calculation -optimal_As = [] -optimal_Bs = [] -for pa in prices_A: - result = minimize_scalar(objective, args=(W, pa)) - opt_B_val = result.x - - optimal_Bs.append(opt_B_val) - optimal_As.append(A_bc(opt_B_val, W, pa)) - -fig, ax = plt.subplots() - -ax.plot(prices_A, optimal_As, label="Purchased Apples") -ax.plot(prices_A, optimal_Bs, label="Purchased Bananas") -ax.set_xlabel("Price of Apples") -ax.legend() -``` - -````{admonition} Exercise -:name: dir3-5-1 - -See exercise 1 in the {ref}`exercise list `. -```` - -#### Satiation Point - -The above example is a particular utility function where consumers prefer to "eat" as much as -possible of every good available, but that may not be the case for all preferences. - -When an optimum exists for the unconstrained problem (e.g. with an infinite budget), it is called a -bliss point, or satiation. - -Instead of bananas and apples, consider a utility function for potato chips (`P`) and chocolate -bars (`C`). - -$$ -U(P, C) = -(P - 20)^2 - 2 * (C - 1)^2 -$$ - -To numerically calculate the maximum (which you can probably see through inspection), one must directly solve the constrained maximization problem. - - -````{admonition} Exercise -:name: dir3-5-2 - -See exercise 2 in the {ref}`exercise list `. -```` - -(ex3-5)= -## Exercises - -### Exercise 1 - -Try solving the constrained maximization problem by hand via the Lagrangian method. - -Is it surprising that the demand for bananas is unaffected by the change in apple prices? - -Why might this be? - -({ref}`back to text `) - -### Exercise 2 - -Using a similar approach to that of the apples/bananas example above, solve for the optimal -basket of potato chips and chocolate bars when `W = 10`, `p_P = 1`, and `p_C = 2`. - -```{code-cell} python -W = 10 -p_P = 1 -p_C = 2 - -# Your code here -``` - -What is the optimal basket if we expand the budget constraint to have `W = 50`? - -```{code-cell} python -# Your code here -``` - -What is the optimal basket if we expand the budget constraint to have `W = 150`? - -```{code-cell} python -# Your code here -``` - -```{hint} -You can no longer assume that the `A_bc` function is always binding, as we did before, and will need to check results more carefully. - -While not required, you can take this opportunity to play around with other scipy functions such as Scipy optimize . 
-``` - -({ref}`back to text `) +--- +jupytext: + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# Optimization + +**Prerequisites** + +- {doc}`Introduction to Numpy ` +- {doc}`Applied Linear Algebra ` + +**Outcomes** + +- Perform optimization by hand using derivatives +- Understand ideas from gradient descent + + +```{literalinclude} ../_static/colab_light.raw +``` + +```{code-cell} python +# imports for later +import numpy as np +import matplotlib.pyplot as plt +%matplotlib inline +``` + +## What is Optimization? + +Optimization is the branch of mathematics focused on finding extreme values (max or min) of +functions. + +Optimization tools will appear in many places throughout this course, including: + +- Building economic models in which individuals make decisions that maximize their utility. +- Building statistical models and maximizing the fit of these models by optimizing certain fit + functions. + +In this lecture, we will focus mostly on the first to limit the moving pieces, but in other lectures, we'll discuss the second in detail. + +### Derivatives and Optima + +Here, we revisit some of the theory that you have already learned in your calculus class. + +Consider function $f(x)$ which maps a number into another number. We can say that any point +where $f'(x) = 0$ is a local extremum of $f$. + +Let's work through an example. Consider the function + +$$ +f(x) = x^4 - 3 x^2 +$$ + +Its derivative is given by + +$$ +\frac{\partial f}{\partial x} = 4 x^3 - 6 x +$$ + +Let's plot the function and its derivative to pick out the local extremum by hand. + +```{code-cell} python +def f(x): + return x**4 - 3*x**2 + + +def fp(x): + return 4*x**3 - 6*x + +# Create 100 evenly spaced points between -2 and 2 +x = np.linspace(-2., 2., 100) + +# Evaluate the functions at x values +fx = f(x) +fpx = fp(x) + +# Create plot +fig, ax = plt.subplots(1, 2) + +ax[0].plot(x, fx) +ax[0].set_title("Function") + +ax[1].plot(x, fpx) +ax[1].hlines(0.0, -2.5, 2.5, color="k", linestyle="--") +ax[1].set_title("Derivative") + +for _ax in ax: + _ax.spines["right"].set_visible(False) + _ax.spines["top"].set_visible(False) +``` + +If you stare at this picture, you can probably determine the the local maximum is at +$x = 0$ and the local minima at $x \approx -1$ and $x \approx 1$. + +To properly determine the minima and maxima, we find the solutions to $f'(x) = 0$ below: + +$$ +f'(x) = 4 x^3 - 6 x = 0 +$$ + +$$ +\rightarrow x = \left\{0, \frac{\sqrt{6}}{2}, \frac{-\sqrt{6}}{2} \right\} +$$ + +Let's check whether we can get the same answers with Python! To do this, we import a new +package that we haven't seen yet. + +```{code-cell} python +import scipy.optimize as opt +``` + +Then using the function definitions from earlier, we search for the minimum and maximum values. + +```{code-cell} python +# For a scalar problem, we give it the function and the bounds between +# which we want to search +neg_min = opt.minimize_scalar(f, [-2, -0.5]) +pos_min = opt.minimize_scalar(f, [0.5, 2.0]) +print("The negative minimum is: \n", neg_min) +print("The positive minimum is: \n", pos_min) +``` + +The scipy optimize package only has functions that find minimums... You might be wondering, then, how we +will verify our maximum value. + +It turns out that finding the maximum is equivalent to simply finding the minimum of the negative function. 
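+
+In symbols, the identity being used is
+
+$$
+\max_x f(x) = -\min_x \left[-f(x)\right], \qquad \arg\max_x f(x) = \arg\min_x \left[-f(x)\right]
+$$
+
+so we can define the negated function, hand it to the minimizer, and read the maximizer off the result.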
+ +```{code-cell} python +# Create a function that evaluates to negative f +def neg_f(x): + return -f(x) + +max_out = opt.minimize_scalar(neg_f, [-0.35, 0.35]) +print("The maximum is: \n", max_out) +``` + +We won't dive into the details of optimization algorithms in this lecture, but we'll impart some brief +intuition to help you understand the types of problems these algorithms are good at solving and +the types of problems they will struggle with: + +The general intuition is that when you're finding a maximum, an algorithm takes a step +in the direction of the derivative... (Conversely, to find a minimum, the algorithm takes a step opposite the direction of the derivative.) +This requires the function to be relatively smooth and continuous. The algorithm also has an easier time if there is only one (or very few) extremum to be found... + +For minimization, you can imagine the algorithm as a marble in a bowl. + +The marble will keep rolling down the slope of the bowl until it finds the bottom. + +It may overshoot, but once it hits the slope on the other side, it will continue to roll back +and forth until it comes to rest. + +Thus, when deciding whether numerical optimization is an effective method for a +particular problem, you could try visualizing the function to determine whether a marble +would be able to come to rest at the extreme values you are looking for. + +### Application: Consumer Theory + +A common use of maximization in economics is to model +optimal consumption decisions . + +#### Preferences and Utility Functions + +To summarize introductory economics, take a set of +[preferences](https://en.wikipedia.org/wiki/Preference_%28economics%29) of consumers over "bundles" +of goods (e.g. 2 apples and 3 oranges is preferred to 3 apples and 2 oranges, or a 100% chance to +win $1$ dollar is preferred to a 50% chance to win $2.10$ dollars). + +Under certain assumptions, you rationalize the preferences as a utility function over the different +goods (always remembering that the utility is simply a tool to order preferences and the numbers are +usually not meaningful themselves). + +For example, consider a utility function over bundles of bananas (B) and apples (A) + +$$ +U(B, A) = B^{\alpha}A^{1-\alpha} +$$ + +Where $\alpha \in [0,1]$. + +First, let's take a look at this particular utility function. + +```{code-cell} python +def U(A, B, alpha=1/3): + return B**alpha * A**(1-alpha) + +fig, ax = plt.subplots() +B = 1.5 +A = np.linspace(1, 10, 100) +ax.plot(A, U(A, B)) +ax.set_xlabel("A") +ax.set_ylabel("U(B=1.5, A)") +``` + +We note that + +- $U(B,1)$ is always higher with more B, hence, consuming more bananas has a +: positive marginal utility i.e. $\frac{d U(B,1)}{d B} > 0$. +- The more bananas we consume, the smaller the change in marginal utility, i.e. + $\frac{d^2 U(B,1)}{d B^2} < 0$. + +If we plot both the $B$ and the $A$, we can see how the utility changes with different +bundles. + +```{code-cell} python +fig, ax = plt.subplots() +B = np.linspace(1, 20, 100).reshape((100, 1)) +contours = ax.contourf(A, B.flatten(), U(A, B)) +fig.colorbar(contours) +ax.set_xlabel("A") +ax.set_ylabel("B") +ax.set_title("U(A,B)") +``` + +We can find the bundles between which the consumer would be indifferent by fixing a +utility $\bar{U}$ and by determining all combinations of $A$ and $B$ where +$\bar{U} = U(B, A)$. 
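+
+(Where does the expression below come from? Rearranging $\bar{U} = B^{\alpha}A^{1-\alpha}$ gives $A^{1-\alpha} = \bar{U} B^{-\alpha}$, and raising both sides to the power $\frac{1}{1-\alpha}$ isolates $A$.)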
+ +In this example, we can implement this calculation by letting $B$ be the variable on the +x-axis and solving for $A(\bar{U}, B)$ + +$$ +A(B, \bar{U}) = U^{\frac{1}{1-\alpha}}B^{\frac{-\alpha}{1-\alpha}} +$$ + +```{code-cell} python +def A_indifference(B, ubar, alpha=1/3): + return ubar**(1/(1-alpha)) * B**(-alpha/(1-alpha)) + +def plot_indifference_curves(ax, alpha=1/3): + ubar = np.arange(1, 11, 2) + ax.plot(B, A_indifference(B, ubar, alpha)) + ax.legend([r"$\bar{U}$" + " = {}".format(i) for i in ubar]) + ax.set_xlabel("B") + ax.set_ylabel(r"$A(B, \bar{U}$)") + +fig, ax = plt.subplots() +plot_indifference_curves(ax) +``` + +Note that in every case, if you increase either the number of apples or bananas (holding the other +fixed), you reach a higher indifference curve. + +Consequently, in a world without scarcity or budgets, consumers would consume +an arbitrarily high number of both to maximize their utility. + +#### Budget Constraints + +While the above example plots consumer preferences, it says nothing about what the consumers can afford. + +The simplest sort of constraint is a budget constraint where bananas and apples both have a price +and the consumer has a limited amount of funds. + +If the prices per banana and per apple are identical, no matter how many you consume, then the +affordable bundles are simply all pairs of apples and bananas below the line. +$p_a A + p_b B \leq W$. + +For example, if consumer has a budget of $W$, the price of apples is $p_A = 2$ dollars per +apple, and the price of bananas is normalized to be $p_B = 1$ dollar per banana, then the consumer +can afford anything below the line. + +$$ +2 A + B \leq W +$$ + +Or, letting $W = 20$ and plotting + +```{code-cell} python +def A_bc(B, W=20, pa=2): + "Given B, W, and pa return the max amount of A our consumer can afford" + return (W - B) / pa + +def plot_budget_constraint(ax, W=20, pa=2): + B_bc = np.array([0, W]) + A = A_bc(B_bc, W, pa) + ax.plot(B_bc, A) + ax.fill_between(B_bc, 0, A, alpha=0.2) + ax.set_xlabel("B") + ax.set_ylabel("A") + return ax + +fig, ax = plt.subplots() +plot_budget_constraint(ax, 20, 2) +``` + +While the consumer can afford any of the bundles in that area, most will not be optimal. + +#### Optimal Choice + +Putting the budget constraints and the utility functions together lets us visualize the optimal +decision of a consumer. Choose the bundle with the highest possible indifference curve within its +budget set. + +```{code-cell} python +fig, ax = plt.subplots() +plot_indifference_curves(ax) +plot_budget_constraint(ax) +``` + +We have several ways to find the particular point $A, B$ of maximum utility, such as +finding the point where the indifference curve and the budget constraint have the same slope, but a +simple approach is to just solve the direct maximization problem. + +$$ +\begin{aligned} +\max_{A, B} & B^{\alpha}A^{1-\alpha}\\ +\text{s.t. } & p_A A + B \leq W +\end{aligned} +$$ + +Solving this problem directly requires solving a multi-dimensional constrained optimization problem, +where scipy +has several options. + +For this particular problem, we notice two things: (1) The utility function is increasing in both +$A$ and $B$, and (2) there are only 2 goods. 
+ +This allows us 1) to assume that the budget constraint holds at equality, $p_a A + B = W$, 2) to +form a new function $A(B) = (W - B) / p_a$ by rearranging the budget constraint at equality, and +3) to substitute that function directly to form: + +$$ +\max_{B} B^{\alpha}A(B)^{1-\alpha} +$$ + +Compared to before, this problem has been turned into an unconstrained univariate optimization +problem. + +To implement this in code, notice that the $A(B)$ function is what we defined before +as `A_bc`. + +We will solve this by using the function `scipy.optimize.minimize_scalar`, which takes a function +`f(x)` and returns the value of `x` that minimizes `f`. + +```{code-cell} python +from scipy.optimize import minimize_scalar + +def objective(B, W=20, pa=2): + """ + Return value of -U for a given B, when we consume as much A as possible + + Note that we return -U because scipy wants to minimize functions, + and the value of B that minimizes -U will maximize U + """ + A = A_bc(B, W, pa) + return -U(A, B) + +result = minimize_scalar(objective) +optimal_B = result.x +optimal_A = A_bc(optimal_B, 20, 2) +optimal_U = U(optimal_A, optimal_B) + +print("The optimal U is ", optimal_U) +print("and was found at (A,B) =", (optimal_A, optimal_B)) +``` + +This allows us to do experiments, such as examining how consumption patterns change as prices or +wealth levels change. + +```{code-cell} python +# Create various prices +n_pa = 50 +prices_A = np.linspace(0.5, 5.0, n_pa) +W = 20 + +# Create lists to store the results of the optimal A and B calculation +optimal_As = [] +optimal_Bs = [] +for pa in prices_A: + result = minimize_scalar(objective, args=(W, pa)) + opt_B_val = result.x + + optimal_Bs.append(opt_B_val) + optimal_As.append(A_bc(opt_B_val, W, pa)) + +fig, ax = plt.subplots() + +ax.plot(prices_A, optimal_As, label="Purchased Apples") +ax.plot(prices_A, optimal_Bs, label="Purchased Bananas") +ax.set_xlabel("Price of Apples") +ax.legend() +``` + +````{admonition} Exercise +:name: dir3-5-1 + +See exercise 1 in the {ref}`exercise list `. +```` + +#### Satiation Point + +The above example is a particular utility function where consumers prefer to "eat" as much as +possible of every good available, but that may not be the case for all preferences. + +When an optimum exists for the unconstrained problem (e.g. with an infinite budget), it is called a +bliss point, or satiation. + +Instead of bananas and apples, consider a utility function for potato chips (`P`) and chocolate +bars (`C`). + +$$ +U(P, C) = -(P - 20)^2 - 2 * (C - 1)^2 +$$ + +To numerically calculate the maximum (which you can probably see through inspection), one must directly solve the constrained maximization problem. + + +````{admonition} Exercise +:name: dir3-5-2 + +See exercise 2 in the {ref}`exercise list `. +```` + +(ex3-5)= +## Exercises + +### Exercise 1 + +Try solving the constrained maximization problem by hand via the Lagrangian method. + +Is it surprising that the demand for bananas is unaffected by the change in apple prices? + +Why might this be? + +({ref}`back to text `) + +### Exercise 2 + +Using a similar approach to that of the apples/bananas example above, solve for the optimal +basket of potato chips and chocolate bars when `W = 10`, `p_P = 1`, and `p_C = 2`. + +```{code-cell} python +W = 10 +p_P = 1 +p_C = 2 + +# Your code here +``` + +What is the optimal basket if we expand the budget constraint to have `W = 50`? 
+ +```{code-cell} python +# Your code here +``` + +What is the optimal basket if we expand the budget constraint to have `W = 150`? + +```{code-cell} python +# Your code here +``` + +```{hint} +You can no longer assume that the `A_bc` function is always binding, as we did before, and will need to check results more carefully. + +While not required, you can take this opportunity to play around with other scipy functions such as Scipy optimize . +``` + +({ref}`back to text `) diff --git a/lectures/scientific/plotting.md b/lectures/scientific/plotting.md index f20c0d57..7816ed0b 100644 --- a/lectures/scientific/plotting.md +++ b/lectures/scientific/plotting.md @@ -1,206 +1,206 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# Plotting - -**Prerequisites** - -- {doc}`Introduction to Numpy ` - -**Outcomes** - -- Understand components of matplotlib plots -- Make basic plots - - -```{literalinclude} ../_static/colab_light.raw -``` - -## Visualization - -One of the most important outputs of your analysis will be the visualizations that you choose to -communicate what you've discovered. - -Here are what some people -- whom we think have earned the right to an opinion on this -material -- have said with respect to data visualizations. - -> I spend hours thinking about how to get the story across in my visualizations. I don't mind taking that long because it's that five minutes of presenting it or someone getting it that can make or break a deal -- Goldman Sachs executive - - - - - -We won't have time to cover "how to make a compelling data visualization" in this lecture. - -Instead, we will focus on the basics of creating visualizations in Python. - -This will be a fast introduction, but this material appears in almost every -lecture going forward, which will help the concepts sink in. - -In almost any profession that you pursue, much of what you do involves communicating ideas to others. - -Data visualization can help you communicate these ideas effectively, and we encourage you to learn -more about what makes a useful visualization. - -We include some references that we have found useful below. - -* [The Functional Art: An introduction to information graphics and visualization](https://www.amazon.com/The-Functional-Art-introduction-visualization/dp/0321834739/) by Alberto Cairo -* [The Visual Display of Quantitative Information](https://www.amazon.com/Visual-Display-Quantitative-Information/dp/1930824130) by Edward Tufte -* [The Wall Street Journal Guide to Information Graphics: The Dos and Don'ts of Presenting Data, Facts, and Figures](https://www.amazon.com/Street-Journal-Guide-Information-Graphics/dp/0393347281) by Dona M Wong -* [Introduction to Data Visualization](http://paldhous.github.io/ucb/2016/dataviz/index.html) - -## `matplotlib` - -The most widely used plotting package in Python is matplotlib. - -The standard import alias is - -```{code-cell} python -import matplotlib.pyplot as plt -import numpy as np -``` - -Note above that we are using `matplotlib.pyplot` rather than just `matplotlib`. - -`pyplot` is a sub-module found in some large packages to further organize functions and types. We are able to give the `plt` alias to this sub-module. - -Additionally, when we are working in the notebook, we need tell matplotlib to display our images -inside of the notebook itself instead of creating new windows with the image. 
- -This is done by - -```{code-cell} python -%matplotlib inline -``` - -The commands with `%` before them are called [Magics](https://ipython.readthedocs.io/en/stable/interactive/magics.html). - -### First Plot - -Let's create our first plot! - -After creating it, we will walk through the steps one-by-one to understand what they do. - -```{code-cell} python -# Step 1 -fig, ax = plt.subplots() - -# Step 2 -x = np.linspace(0, 2*np.pi, 100) -y = np.sin(x) - -# Step 3 -ax.plot(x, y) -``` - -1. Create a figure and axis object which stores the information from our graph. -1. Generate data that we will plot. -1. Use the `x` and `y` data, and make a line plot on our axis, `ax`, by calling the `plot` method. - -### Difference between Figure and Axis - -We've found that the easiest way for us to distinguish between the figure and axis objects is to -think about them as a framed painting. - -The axis is the canvas; it is where we "draw" our plots. - -The figure is the entire framed painting (which inclues the axis itself!). - -We can also see this by setting certain elements of the figure to different colors. - -```{code-cell} python -fig, ax = plt.subplots() - -fig.set_facecolor("red") -ax.set_facecolor("blue") -``` - -This difference also means that you can place more than one axis on a figure. - -```{code-cell} python -# We specified the shape of the axes -- It means we will have two rows and three columns -# of axes on our figure -fig, axes = plt.subplots(2, 3) - -fig.set_facecolor("gray") - -# Can choose hex colors -colors = ["#065535", "#89ecda", "#ffd1dc", "#ff0000", "#6897bb", "#9400d3"] - -# axes is a numpy array and we want to iterate over a flat version of it -for (ax, c) in zip(axes.flat, colors): - ax.set_facecolor(c) - -fig.tight_layout() -``` - -### Functionality - -The matplotlib library is versatile and very flexible. - -You can see various examples of what it can do on the -[matplotlib example gallery](https://matplotlib.org/gallery.html). - -We work though a few examples to quickly introduce some possibilities. 
- -**Bar** - -```{code-cell} python -countries = ["CAN", "MEX", "USA"] -populations = [36.7, 129.2, 325.700] -land_area = [3.850, 0.761, 3.790] - -fig, ax = plt.subplots(2) - -ax[0].bar(countries, populations, align="center") -ax[0].set_title("Populations (in millions)") - -ax[1].bar(countries, land_area, align="center") -ax[1].set_title("Land area (in millions miles squared)") - -fig.tight_layout() -``` - -**Scatter and annotation** - -```{code-cell} python -N = 50 - -np.random.seed(42) - -x = np.random.rand(N) -y = np.random.rand(N) -colors = np.random.rand(N) -area = np.pi * (15 * np.random.rand(N))**2 # 0 to 15 point radii - -fig, ax = plt.subplots() - -ax.scatter(x, y, s=area, c=colors, alpha=0.5) - -ax.annotate( - "First point", xy=(x[0], y[0]), xycoords="data", - xytext=(25, -25), textcoords="offset points", - arrowprops=dict(arrowstyle="->", connectionstyle="arc3,rad=0.6") -) -``` - -**Fill between** - -```{code-cell} python -x = np.linspace(0, 1, 500) -y = np.sin(4 * np.pi * x) * np.exp(-5 * x) - -fig, ax = plt.subplots() - -ax.grid(True) -ax.fill(x, y) -``` - +--- +jupytext: + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# Plotting + +**Prerequisites** + +- {doc}`Introduction to Numpy ` + +**Outcomes** + +- Understand components of matplotlib plots +- Make basic plots + + +```{literalinclude} ../_static/colab_light.raw +``` + +## Visualization + +One of the most important outputs of your analysis will be the visualizations that you choose to +communicate what you've discovered. + +Here are what some people -- whom we think have earned the right to an opinion on this +material -- have said with respect to data visualizations. + +> I spend hours thinking about how to get the story across in my visualizations. I don't mind taking that long because it's that five minutes of presenting it or someone getting it that can make or break a deal -- Goldman Sachs executive + + + + + +We won't have time to cover "how to make a compelling data visualization" in this lecture. + +Instead, we will focus on the basics of creating visualizations in Python. + +This will be a fast introduction, but this material appears in almost every +lecture going forward, which will help the concepts sink in. + +In almost any profession that you pursue, much of what you do involves communicating ideas to others. + +Data visualization can help you communicate these ideas effectively, and we encourage you to learn +more about what makes a useful visualization. + +We include some references that we have found useful below. + +* [The Functional Art: An introduction to information graphics and visualization](https://www.amazon.com/The-Functional-Art-introduction-visualization/dp/0321834739/) by Alberto Cairo +* [The Visual Display of Quantitative Information](https://www.amazon.com/Visual-Display-Quantitative-Information/dp/1930824130) by Edward Tufte +* [The Wall Street Journal Guide to Information Graphics: The Dos and Don'ts of Presenting Data, Facts, and Figures](https://www.amazon.com/Street-Journal-Guide-Information-Graphics/dp/0393347281) by Dona M Wong +* [Introduction to Data Visualization](http://paldhous.github.io/ucb/2016/dataviz/index.html) + +## `matplotlib` + +The most widely used plotting package in Python is matplotlib. 
+
+The standard import alias is
+
+```{code-cell} python
+import matplotlib.pyplot as plt
+import numpy as np
+```
+
+Note above that we are using `matplotlib.pyplot` rather than just `matplotlib`.
+
+`pyplot` is a sub-module found in some large packages to further organize functions and types. We are able to give the `plt` alias to this sub-module.
+
+Additionally, when we are working in the notebook, we need to tell matplotlib to display our images
+inside of the notebook itself instead of creating new windows with the image.
+
+This is done by
+
+```{code-cell} python
+%matplotlib inline
+```
+
+The commands with `%` before them are called [Magics](https://ipython.readthedocs.io/en/stable/interactive/magics.html).
+
+### First Plot
+
+Let's create our first plot!
+
+After creating it, we will walk through the steps one-by-one to understand what they do.
+
+```{code-cell} python
+# Step 1
+fig, ax = plt.subplots()
+
+# Step 2
+x = np.linspace(0, 2*np.pi, 100)
+y = np.sin(x)
+
+# Step 3
+ax.plot(x, y)
+```
+
+1. Create a figure and axis object which stores the information from our graph.
+1. Generate data that we will plot.
+1. Use the `x` and `y` data, and make a line plot on our axis, `ax`, by calling the `plot` method.
+
+### Difference between Figure and Axis
+
+We've found that the easiest way for us to distinguish between the figure and axis objects is to
+think about them as a framed painting.
+
+The axis is the canvas; it is where we "draw" our plots.
+
+The figure is the entire framed painting (which includes the axis itself!).
+
+We can also see this by setting certain elements of the figure to different colors.
+
+```{code-cell} python
+fig, ax = plt.subplots()
+
+fig.set_facecolor("red")
+ax.set_facecolor("blue")
+```
+
+This difference also means that you can place more than one axis on a figure.
+
+```{code-cell} python
+# We specified the shape of the axes -- It means we will have two rows and three columns
+# of axes on our figure
+fig, axes = plt.subplots(2, 3)
+
+fig.set_facecolor("gray")
+
+# Can choose hex colors
+colors = ["#065535", "#89ecda", "#ffd1dc", "#ff0000", "#6897bb", "#9400d3"]
+
+# axes is a numpy array and we want to iterate over a flat version of it
+for (ax, c) in zip(axes.flat, colors):
+    ax.set_facecolor(c)
+
+fig.tight_layout()
+```
+
+### Functionality
+
+The matplotlib library is versatile and very flexible.
+
+You can see various examples of what it can do on the
+[matplotlib example gallery](https://matplotlib.org/gallery.html).
+
+We work through a few examples to quickly introduce some possibilities.
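+
+As a warm-up before those, here is a quick sketch of the labeling methods -- `set_xlabel`, `set_ylabel`, `set_title`, and `legend` -- which reappear throughout the rest of these lectures.
+
+**Line with labels and a legend**
+
+```{code-cell} python
+x = np.linspace(0, 2*np.pi, 100)
+
+fig, ax = plt.subplots()
+ax.plot(x, np.sin(x), label="sin(x)")
+ax.plot(x, np.cos(x), label="cos(x)")
+ax.set_xlabel("x")
+ax.set_ylabel("value")
+ax.set_title("A labeled line plot")
+ax.legend()
+```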
+ +**Bar** + +```{code-cell} python +countries = ["CAN", "MEX", "USA"] +populations = [36.7, 129.2, 325.700] +land_area = [3.850, 0.761, 3.790] + +fig, ax = plt.subplots(2) + +ax[0].bar(countries, populations, align="center") +ax[0].set_title("Populations (in millions)") + +ax[1].bar(countries, land_area, align="center") +ax[1].set_title("Land area (in millions miles squared)") + +fig.tight_layout() +``` + +**Scatter and annotation** + +```{code-cell} python +N = 50 + +np.random.seed(42) + +x = np.random.rand(N) +y = np.random.rand(N) +colors = np.random.rand(N) +area = np.pi * (15 * np.random.rand(N))**2 # 0 to 15 point radii + +fig, ax = plt.subplots() + +ax.scatter(x, y, s=area, c=colors, alpha=0.5) + +ax.annotate( + "First point", xy=(x[0], y[0]), xycoords="data", + xytext=(25, -25), textcoords="offset points", + arrowprops=dict(arrowstyle="->", connectionstyle="arc3,rad=0.6") +) +``` + +**Fill between** + +```{code-cell} python +x = np.linspace(0, 1, 500) +y = np.sin(4 * np.pi * x) * np.exp(-5 * x) + +fig, ax = plt.subplots() + +ax.grid(True) +ax.fill(x, y) +``` + diff --git a/lectures/scientific/randomness.md b/lectures/scientific/randomness.md index 419375da..2be87018 100644 --- a/lectures/scientific/randomness.md +++ b/lectures/scientific/randomness.md @@ -1,641 +1,641 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# Randomness - -**Prerequisites** - -- {doc}`Introduction to Numpy ` -- {doc}`Applied Linear Algebra ` - -**Outcomes** - -- Recall basic probability -- Draw random numbers from numpy -- Understand why simulation is useful -- Understand the basics of Markov chains and using the `quantecon` library to study them -- Simulate discrete and continuous random variables and processes - - -```{literalinclude} ../_static/colab_light.raw -``` - -## Randomness - -We will use the `numpy.random` package to simulate randomness in Python. - -This lecture will present various probability distributions and then use -numpy.random to numerically verify some of the facts associated with them. - -We import `numpy` as usual - -```{code-cell} python -import numpy as np -import matplotlib.pyplot as plt -%matplotlib inline -``` - -### Probability - -Before we learn how to use Python to generate randomness, we should make sure -that we all agree on some basic concepts of probability. - -To think about the probability of some event occurring, we must understand what possible -events could occur -- mathematicians refer to this as the *event space*. - -Some examples are - -* For a coin flip, the coin could either come up heads, tails, or land on its side. -* The inches of rain falling in a certain location on a given day could be any real - number between 0 and $\infty$. -* The change in an S&P500 stock price could be any real number between - $-$ opening price and $\infty$. -* An individual's employment status tomorrow could either be employed or unemployed. -* And the list goes on... - -Notice that in some of these cases, the event space can be counted (coin flip and employment status) -while in others, the event space cannot be counted (rain and stock prices). - -We refer to random variables with countable event spaces as *discrete random variables* and -random variables with uncountable event spaces as *continuous random variables*. - -We then call certain numbers 'probabilities' and associate them with events from the event space. - -The following is true about probabilities. - -1. 
The probability of any event must be greater than or equal to 0. -1. The probability of all events from the event space must sum (or integrate) to 1. -1. If two events cannot occur at same time, then the probability that at least one of them occurs is - the sum of the probabilities that each event occurs (known as independence). - -We won't rely on these for much of what we learn in this class, but occasionally, these facts will -help us reason through what is happening. - -### Simulating Randomness in Python - -One of the most basic random numbers is a variable that has equal probability of being any value -between 0 and 1. - -You may have previously learned about this probability distribution as the Uniform(0, 1). - -Let's dive into generating some random numbers. - -Run the code below multiple times and see what numbers you get. - -```{code-cell} python -np.random.rand() -``` - -We can also generate arrays of random numbers. - -```{code-cell} python -np.random.rand(25) -``` - -```{code-cell} python -np.random.rand(5, 5) -``` - -```{code-cell} python -np.random.rand(2, 3, 4) -``` - -### Why Do We Need Randomness? - -As economists and data scientists, we study complex systems. - -These systems have inherent randomness, but they do not readily reveal their underlying distribution -to us. - -In cases where we face this difficulty, we turn to a set of tools known as Monte Carlo -methods. - -These methods effectively boil down to repeatedly simulating some event (or events) and looking at -the outcome distribution. - -This tool is used to inform decisions in search and rescue missions, election predictions, sports, -and even by the Federal Reserve. - -The reasons that Monte Carlo methods work is a mathematical theorem known as the *Law of Large -Numbers*. - -The Law of Large Numbers basically says that under relatively general conditions, the distribution of simulated outcomes will mimic the true distribution as the number of simulated events goes to infinity. - -We already know how the uniform distribution looks, so let's demonstrate the Law of Large Numbers by approximating the uniform distribution. - -```{code-cell} python -# Draw various numbers of uniform[0, 1] random variables -draws_10 = np.random.rand(10) -draws_200 = np.random.rand(200) -draws_10000 = np.random.rand(10_000) - -# Plot their histograms -fig, ax = plt.subplots(3) - -ax[0].set_title("Histogram with 10 draws") -ax[0].hist(draws_10) - -ax[1].set_title("Histogram with 200 draws") -ax[1].hist(draws_200) - -ax[2].set_title("Histogram with 10,000 draws") -ax[2].hist(draws_10000) - -fig.tight_layout() -``` - - -````{admonition} Exercise -:name: dir3-4-1 - -See exercise 1 in the {ref}`exercise list `. -```` - - -### Discrete Distributions - -Sometimes we will encounter variables that can only take one of a -few possible values. - -We refer to this type of random variable as a discrete distribution. - -For example, consider a small business loan company. - -Imagine that the company's loan requires a repayment of $\$25,000$ and must be repaid 1 year -after the loan was made. - -The company discounts the future at 5%. - -Additionally, the loans made are repaid in full with 75% probability, while -$\$12,500$ of loans is repaid with probability 20%, and no repayment with 5% -probability. - -How much would the small business loan company be willing to loan if they'd like to --- on average -- break even? - -In this case, we can compute this by hand: - -The amount repaid, on average, is: $0.75(25,000) + 0.2(12,500) + 0.05(0) = 21,250$. 
- -Since we'll receive that amount in one year, we have to discount it: -$\frac{1}{1+0.05} 21,250 \approx 20238$. - -We can now verify by simulating the outcomes of many loans. - -```{code-cell} python -# You'll see why we call it `_slow` soon :) -def simulate_loan_repayments_slow(N, r=0.05, repayment_full=25_000.0, - repayment_part=12_500.0): - repayment_sims = np.zeros(N) - for i in range(N): - x = np.random.rand() # Draw a random number - - # Full repayment 75% of time - if x < 0.75: - repaid = repayment_full - elif x < 0.95: - repaid = repayment_part - else: - repaid = 0.0 - - repayment_sims[i] = (1 / (1 + r)) * repaid - - return repayment_sims - -print(np.mean(simulate_loan_repayments_slow(25_000))) -``` - -#### Aside: Vectorized Computations - -The code above illustrates the concepts we were discussing but is much slower than -necessary. - -Below is a version of our function that uses numpy arrays to perform computations -instead of only storing the values. - -```{code-cell} python -def simulate_loan_repayments(N, r=0.05, repayment_full=25_000.0, - repayment_part=12_500.0): - """ - Simulate present value of N loans given values for discount rate and - repayment values - """ - random_numbers = np.random.rand(N) - - # start as 0 -- no repayment - repayment_sims = np.zeros(N) - - # adjust for full and partial repayment - partial = random_numbers <= 0.20 - repayment_sims[partial] = repayment_part - - full = ~partial & (random_numbers <= 0.95) - repayment_sims[full] = repayment_full - - repayment_sims = (1 / (1 + r)) * repayment_sims - - return repayment_sims - -np.mean(simulate_loan_repayments(25_000)) -``` - -We'll quickly demonstrate the time difference in running both function versions. - -```{code-cell} python -%timeit simulate_loan_repayments_slow(250_000) -``` - -```{code-cell} python -%timeit simulate_loan_repayments(250_000) -``` - -The timings for my computer were 167 ms for `simulate_loan_repayments_slow` and 5.05 ms for -`simulate_loan_repayments`. - -This function is simple enough that both times are acceptable, but the 33x time difference could -matter in a more complicated operation. - -This illustrates a concept called *vectorization*, which is when computations -operate on an entire array at a time. - -In general, numpy code that is *vectorized* will perform better than numpy code that operates on one -element at a time. - -For more information see the -[QuantEcon lecture on performance Python](https://python-programming.quantecon.org/numba.html) code. - -#### Profitability Threshold - -Rather than looking for the break even point, we might be interested in the largest loan size that -ensures we still have a 95% probability of profitability in a year we make 250 loans. - -This is something that could be computed by hand, but it is much easier to answer through -simulation! - -If we simulate 250 loans many times and keep track of what the outcomes look like, then we can look -at the the 5th percentile of total repayment to find the loan size needed for 95% probability of -being profitable. 
- -```{code-cell} python -def simulate_year_of_loans(N=250, K=1000): - - # Create array where we store the values - avg_repayments = np.zeros(K) - for year in range(K): - - repaid_year = 0.0 - n_loans = simulate_loan_repayments(N) - avg_repayments[year] = n_loans.mean() - - return avg_repayments - -loan_repayment_outcomes = simulate_year_of_loans(N=250) - -# Think about why we use the 5th percentile of outcomes to -# compute when we are profitable 95% of time -lro_5 = np.percentile(loan_repayment_outcomes, 5) - -print("The largest loan size such that we were profitable 95% of time is") -print(lro_5) -``` - -Now let's consider what we could learn if our loan company had even more detailed information about -how the life of their loans progressed. - -#### Loan States - -Loans can have 3 potential statuses (or states): - -1. Repaying: Payments are being made on loan. -1. Delinquency: No payments are currently being made, but they might be made in the future. -1. Default: No payments are currently being made and no more payments will be made in future. - -The small business loans company knows the following: - -* If a loan is currently in repayment, then it has an 85% probability of continuing being repaid, a - 10% probability of going into delinquency, and a 5% probability of going into default. -* If a loan is currently in delinquency, then it has a 25% probability of returning to repayment, a - 60% probability of staying delinquent, and a 15% probability of going into default. -* If a loan is currently in default, then it remains in default with 100% probability. - -For simplicity, let's imagine that 12 payments are made during the life of a loan, even though -this means people who experience delinquency won't be required to repay their remaining balance. - -Let's write the code required to perform this dynamic simulation. - -```{code-cell} python -def simulate_loan_lifetime(monthly_payment): - - # Create arrays to store outputs - payments = np.zeros(12) - # Note: dtype 'U12' means a string with no more than 12 characters - statuses = np.array(4*["repaying", "delinquency", "default"], dtype="U12") - - # Everyone is repaying during their first month - payments[0] = monthly_payment - statuses[0] = "repaying" - - for month in range(1, 12): - rn = np.random.rand() - - if (statuses[month-1] == "repaying"): - if rn < 0.85: - payments[month] = monthly_payment - statuses[month] = "repaying" - elif rn < 0.95: - payments[month] = 0.0 - statuses[month] = "delinquency" - else: - payments[month] = 0.0 - statuses[month] = "default" - elif (statuses[month-1] == "delinquency"): - if rn < 0.25: - payments[month] = monthly_payment - statuses[month] = "repaying" - elif rn < 0.85: - payments[month] = 0.0 - statuses[month] = "delinquency" - else: - payments[month] = 0.0 - statuses[month] = "default" - else: # Default -- Stays in default after it gets there - payments[month] = 0.0 - statuses[month] = "default" - - return payments, statuses -``` - -We can use this model of the world to answer even more questions than the last model! - -For example, we can think about things like - -* For the defaulted loans, how many payments did they make before going into default? -* For those who partially repaid, how much was repaid before the 12 months was over? - -Unbeknownst to you, we have just introduced a well-known mathematical concept known as a Markov -chain. 
- -A Markov chain is a random process (Note: Random process is a sequence of random variables -observed over time) where the probability of something happening tomorrow only depends on what we -can observe today. - -In our small business loan example, this just means that the small business loan's repayment status -tomorrow only depended on what its repayment status was today. - -Markov chains often show up in economics and statistics, so we decided a simple introduction would -be helpful, but we leave out many details for the interested reader to find. - -A Markov chain is defined by three objects: - -1. A description of the possible states and their associated value. -1. A complete description of the probability of moving from one state to all other states. -1. An initial distribution over the states (often a vector of all zeros except for a single 1 for - some particular state). - -For the example above, we'll define each of these three things in the Python code below. - -```{code-cell} python -# 1. State description -state_values = ["repaying", "delinquency", "default"] - -# 2. Transition probabilities: encoded in a matrix (2d-array) where element [i, j] -# is the probability of moving from state i to state j -P = np.array([[0.85, 0.1, 0.05], [0.25, 0.6, 0.15], [0, 0, 1]]) - -# 3. Initial distribution: assume loans start in repayment -x0 = np.array([1, 0, 0]) -``` - -Now that we have these objects defined, we can use the a `MarkovChain` class from the -[quantecon python library](https://github.com/QuantEcon/QuantEcon.py/) to analyze this model. - -```{code-cell} python -import quantecon as qe - -mc = qe.markov.MarkovChain(P, state_values) -``` - -We can use the `mc` object to do common Markov chain operations. - -The `simulate` method will simulate the Markov chain for a specified number of steps: - -```{code-cell} python -mc.simulate(12, init="repaying") -``` - -Suppose we were to simulate the Markov chain for an infinite number of steps. - -Given the random nature of transitions, we might end up taking different paths at any given moment. - -We can summarize all possible paths over time by keeping track of a distribution. - -Below, we will print out the distribution for the first 10 time steps, -starting from a distribution where the debtor is repaying in the first step. - -```{code-cell} python -x = x0 -for t in range(10): - print(f"At time {t} the distribution is {x}") - x = mc.P.T @ x -``` - -````{admonition} Exercise -:name: dir3-4-2 - -See exercise 2 in the {ref}`exercise list `. -```` - -````{admonition} Exercise -:name: dir3-4-3 - -See exercise 3 in the {ref}`exercise list `. -```` - -### Continuous Distributions - -Recall that a continuous distribution is one where the value can take on an uncountable number of values. - -It differs from a discrete distribution in that the events are not -countable. - -We can use simulation to learn things about continuous distributions as we did with discrete -distributions. - -Let's use simulation to study what is arguably the most commonly encountered -distributions -- the normal distribution. - -The Normal (sometimes referred to as the Gaussian distribution) is bell-shaped and completely -described by the mean and variance of that distribution. - -The mean is often referred to as $\mu$ and the variance as $\sigma^2$. - -Let's take a look at the normal distribution. 
- -```{code-cell} python -# scipy is an extension of numpy, and the stats -# subpackage has tools for working with various probability distributions -import scipy.stats as st - -x = np.linspace(-5, 5, 100) - -# NOTE: first argument to st.norm is mean, second is standard deviation sigma (not sigma^2) -pdf_x = st.norm(0.0, 1.0).pdf(x) - -fig, ax = plt.subplots() - -ax.set_title(r"Normal Distribution ($\mu = 0, \sigma = 1$)") -ax.plot(x, pdf_x) -``` - -Another common continuous distribution used in economics is the gamma distribution. - -A gamma distribution is defined for all positive numbers and described by both a shape -parameter $k$ and a scale parameter $\theta$. - -Let's see what the distribution looks like for various choices of $k$ and $\theta$. - -```{code-cell} python -def plot_gamma(k, theta, x, ax=None): - if ax is None: - _, ax = plt.subplots() - - # scipy refers to the rate parameter beta as a scale parameter - pdf_x = st.gamma(k, scale=theta).pdf(x) - ax.plot(x, pdf_x, label=f"k = {k} theta = {theta}") - - return ax - -fig, ax = plt.subplots(figsize=(10, 6)) -x = np.linspace(0.1, 20, 130) -plot_gamma(2.0, 1.0, x, ax) -plot_gamma(3.0, 1.0, x, ax) -plot_gamma(3.0, 2.0, x, ax) -plot_gamma(3.0, 0.5, x, ax) -ax.set_ylim((0, 0.6)) -ax.set_xlim((0, 20)) -ax.legend(); -``` - -````{admonition} Exercise -:name: dir3-4-4 - -See exercise 4 in the {ref}`exercise list `. -```` - - -(ex3-4)= -## Exercises - -### Exercise 1 - -Wikipedia and other credible statistics sources tell us that the mean and -variance of the Uniform(0, 1) distribution are (1/2, 1/12) respectively. - -How could we check whether the numpy random numbers approximate these -values? - -({ref}`back to text `) - -### Exercise 2 - -In this exercise, we explore the long-run, or stationary, distribution of the Markov chain. - -The stationary distribution of a Markov chain is the probability distribution that would -result after an infinite number of steps *for any initial distribution*. - -Mathematically, a stationary distribution $x$ is a distribution where $x = P'x$. - -In the code cell below, use the `stationary_distributions` property of `mc` to -determine the stationary distribution of our Markov chain. - -After doing your computation, think about the answer... think about why our transition -probabilities must lead to this outcome. - - -```{code-cell} python -# your code here -``` - -({ref}`back to text `) - -### Exercise 3 - -Let's revisit the unemployment example from the {doc}`linear algebra lecture `. - -We'll repeat necessary details here. - -Consider an economy where in any given year, $\alpha = 5\%$ of workers lose their jobs, and -$\phi = 10\%$ of unemployed workers find jobs. - -Initially, 90% of the 1,000,000 workers are employed. - -Also suppose that the average employed worker earns 10 dollars, while an unemployed worker -earns 1 dollar per period. - -You now have four tasks: - -1. Represent this problem as a Markov chain by defining the three components defined above. -1. Construct an instance of the quantecon MarkovChain by using the objects defined in part 1. -1. Simulate the Markov chain 30 times for 50 time periods, and plot each chain over time (see - helper code below). -1. Determine the average long run payment for a worker in this setting - -```{hint} -Think about the stationary distribution. 
-``` - -```{code-cell} python -# define components here - -# construct Markov chain - -# simulate (see docstring for how to do many repetitions of -# the simulation in one function call) -# uncomment the lines below and fill in the blanks -# sim = XXXXX.simulate(XXXX) -# fig, ax = plt.subplots(figsize=(10, 8)) -# ax.plot(range(50), sim.T, alpha=0.4) - -# Long-run average payment -``` - -({ref}`back to text `) - - -### Exercise 4 - -Assume you have been given the opportunity to choose between one of three financial assets: - -You will be given the asset for free, allowed to hold it indefinitely, and keeping all payoffs. - -Also assume the assets' payoffs are distributed as follows: - -1. Normal with $\mu = 10, \sigma = 5$ -1. Gamma with $k = 5.3, \theta = 2$ -1. Gamma with $k = 5, \theta = 2$ - -Use `scipy.stats` to answer the following questions: - -- Which asset has the highest average returns? -- Which asset has the highest median returns? -- Which asset has the lowest coefficient of variation (standard deviation divided by mean)? -- Which asset would you choose? Why? - -```{hint} -There is not a single right answer here. Be creative -and express your preferences. -``` - -```{code-cell} python -# your code here -``` - -({ref}`back to text `) +--- +jupytext: + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# Randomness + +**Prerequisites** + +- {doc}`Introduction to Numpy ` +- {doc}`Applied Linear Algebra ` + +**Outcomes** + +- Recall basic probability +- Draw random numbers from numpy +- Understand why simulation is useful +- Understand the basics of Markov chains and using the `quantecon` library to study them +- Simulate discrete and continuous random variables and processes + + +```{literalinclude} ../_static/colab_light.raw +``` + +## Randomness + +We will use the `numpy.random` package to simulate randomness in Python. + +This lecture will present various probability distributions and then use +numpy.random to numerically verify some of the facts associated with them. + +We import `numpy` as usual + +```{code-cell} python +import numpy as np +import matplotlib.pyplot as plt +%matplotlib inline +``` + +### Probability + +Before we learn how to use Python to generate randomness, we should make sure +that we all agree on some basic concepts of probability. + +To think about the probability of some event occurring, we must understand what possible +events could occur -- mathematicians refer to this as the *event space*. + +Some examples are + +* For a coin flip, the coin could either come up heads, tails, or land on its side. +* The inches of rain falling in a certain location on a given day could be any real + number between 0 and $\infty$. +* The change in an S&P500 stock price could be any real number between + $-$ opening price and $\infty$. +* An individual's employment status tomorrow could either be employed or unemployed. +* And the list goes on... + +Notice that in some of these cases, the event space can be counted (coin flip and employment status) +while in others, the event space cannot be counted (rain and stock prices). + +We refer to random variables with countable event spaces as *discrete random variables* and +random variables with uncountable event spaces as *continuous random variables*. + +We then call certain numbers 'probabilities' and associate them with events from the event space. + +The following is true about probabilities. + +1. 
The probability of any event must be greater than or equal to 0.
+1. The probability of all events from the event space must sum (or integrate) to 1.
+1. If two events cannot occur at the same time, then the probability that at least one of them occurs is
+   the sum of the probabilities that each event occurs (such events are called *mutually exclusive*).
+
+We won't rely on these for much of what we learn in this class, but occasionally, these facts will
+help us reason through what is happening.
+
+### Simulating Randomness in Python
+
+One of the most basic random numbers is a variable that has equal probability of being any value
+between 0 and 1.
+
+You may have previously learned about this probability distribution as the Uniform(0, 1).
+
+Let's dive into generating some random numbers.
+
+Run the code below multiple times and see what numbers you get.
+
+```{code-cell} python
+np.random.rand()
+```
+
+We can also generate arrays of random numbers.
+
+```{code-cell} python
+np.random.rand(25)
+```
+
+```{code-cell} python
+np.random.rand(5, 5)
+```
+
+```{code-cell} python
+np.random.rand(2, 3, 4)
+```
+
+### Why Do We Need Randomness?
+
+As economists and data scientists, we study complex systems.
+
+These systems have inherent randomness, but they do not readily reveal their underlying distribution
+to us.
+
+In cases where we face this difficulty, we turn to a set of tools known as Monte Carlo
+methods.
+
+These methods effectively boil down to repeatedly simulating some event (or events) and looking at
+the outcome distribution.
+
+This tool is used to inform decisions in search and rescue missions, election predictions, sports,
+and even by the Federal Reserve.
+
+The reason that Monte Carlo methods work is a mathematical theorem known as the *Law of Large
+Numbers*.
+
+The Law of Large Numbers basically says that under relatively general conditions, the distribution of simulated outcomes will mimic the true distribution as the number of simulated events goes to infinity.
+
+We already know how the uniform distribution looks, so let's demonstrate the Law of Large Numbers by approximating the uniform distribution.
+
+```{code-cell} python
+# Draw various numbers of uniform[0, 1] random variables
+draws_10 = np.random.rand(10)
+draws_200 = np.random.rand(200)
+draws_10000 = np.random.rand(10_000)
+
+# Plot their histograms
+fig, ax = plt.subplots(3)
+
+ax[0].set_title("Histogram with 10 draws")
+ax[0].hist(draws_10)
+
+ax[1].set_title("Histogram with 200 draws")
+ax[1].hist(draws_200)
+
+ax[2].set_title("Histogram with 10,000 draws")
+ax[2].hist(draws_10000)
+
+fig.tight_layout()
+```
+
+
+````{admonition} Exercise
+:name: dir3-4-1
+
+See exercise 1 in the {ref}`exercise list `.
+````
+
+
+### Discrete Distributions
+
+Sometimes we will encounter variables that can only take one of a
+few possible values.
+
+We refer to this type of random variable as a discrete distribution.
+
+For example, consider a small business loan company.
+
+Imagine that the company's loan requires a repayment of $\$25,000$ and must be repaid 1 year
+after the loan was made.
+
+The company discounts the future at 5%.
+
+Additionally, the loans made are repaid in full with 75% probability, while
+only $\$12,500$ is repaid with probability 20%, and nothing is repaid with 5%
+probability.
+
+How much would the small business loan company be willing to loan if they'd like to
+-- on average -- break even?
+
+In this case, we can compute this by hand:
+
+The amount repaid, on average, is: $0.75(25,000) + 0.2(12,500) + 0.05(0) = 21,250$.
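
As a quick sanity check, the same expected value can be computed as a probability-weighted sum of the outcomes (a minimal sketch using only the numbers stated above):

```{code-cell} python
# probabilities of full, partial, and no repayment
probs = np.array([0.75, 0.20, 0.05])
repayments = np.array([25_000.0, 12_500.0, 0.0])

# expected (average) repayment: probability-weighted sum of the outcomes
probs @ repayments
```
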
+ +Since we'll receive that amount in one year, we have to discount it: +$\frac{1}{1+0.05} 21,250 \approx 20238$. + +We can now verify by simulating the outcomes of many loans. + +```{code-cell} python +# You'll see why we call it `_slow` soon :) +def simulate_loan_repayments_slow(N, r=0.05, repayment_full=25_000.0, + repayment_part=12_500.0): + repayment_sims = np.zeros(N) + for i in range(N): + x = np.random.rand() # Draw a random number + + # Full repayment 75% of time + if x < 0.75: + repaid = repayment_full + elif x < 0.95: + repaid = repayment_part + else: + repaid = 0.0 + + repayment_sims[i] = (1 / (1 + r)) * repaid + + return repayment_sims + +print(np.mean(simulate_loan_repayments_slow(25_000))) +``` + +#### Aside: Vectorized Computations + +The code above illustrates the concepts we were discussing but is much slower than +necessary. + +Below is a version of our function that uses numpy arrays to perform computations +instead of only storing the values. + +```{code-cell} python +def simulate_loan_repayments(N, r=0.05, repayment_full=25_000.0, + repayment_part=12_500.0): + """ + Simulate present value of N loans given values for discount rate and + repayment values + """ + random_numbers = np.random.rand(N) + + # start as 0 -- no repayment + repayment_sims = np.zeros(N) + + # adjust for full and partial repayment + partial = random_numbers <= 0.20 + repayment_sims[partial] = repayment_part + + full = ~partial & (random_numbers <= 0.95) + repayment_sims[full] = repayment_full + + repayment_sims = (1 / (1 + r)) * repayment_sims + + return repayment_sims + +np.mean(simulate_loan_repayments(25_000)) +``` + +We'll quickly demonstrate the time difference in running both function versions. + +```{code-cell} python +%timeit simulate_loan_repayments_slow(250_000) +``` + +```{code-cell} python +%timeit simulate_loan_repayments(250_000) +``` + +The timings for my computer were 167 ms for `simulate_loan_repayments_slow` and 5.05 ms for +`simulate_loan_repayments`. + +This function is simple enough that both times are acceptable, but the 33x time difference could +matter in a more complicated operation. + +This illustrates a concept called *vectorization*, which is when computations +operate on an entire array at a time. + +In general, numpy code that is *vectorized* will perform better than numpy code that operates on one +element at a time. + +For more information see the +[QuantEcon lecture on performance Python](https://python-programming.quantecon.org/numba.html) code. + +#### Profitability Threshold + +Rather than looking for the break even point, we might be interested in the largest loan size that +ensures we still have a 95% probability of profitability in a year we make 250 loans. + +This is something that could be computed by hand, but it is much easier to answer through +simulation! + +If we simulate 250 loans many times and keep track of what the outcomes look like, then we can look +at the the 5th percentile of total repayment to find the loan size needed for 95% probability of +being profitable. 
+ +```{code-cell} python +def simulate_year_of_loans(N=250, K=1000): + + # Create array where we store the values + avg_repayments = np.zeros(K) + for year in range(K): + + repaid_year = 0.0 + n_loans = simulate_loan_repayments(N) + avg_repayments[year] = n_loans.mean() + + return avg_repayments + +loan_repayment_outcomes = simulate_year_of_loans(N=250) + +# Think about why we use the 5th percentile of outcomes to +# compute when we are profitable 95% of time +lro_5 = np.percentile(loan_repayment_outcomes, 5) + +print("The largest loan size such that we were profitable 95% of time is") +print(lro_5) +``` + +Now let's consider what we could learn if our loan company had even more detailed information about +how the life of their loans progressed. + +#### Loan States + +Loans can have 3 potential statuses (or states): + +1. Repaying: Payments are being made on loan. +1. Delinquency: No payments are currently being made, but they might be made in the future. +1. Default: No payments are currently being made and no more payments will be made in future. + +The small business loans company knows the following: + +* If a loan is currently in repayment, then it has an 85% probability of continuing being repaid, a + 10% probability of going into delinquency, and a 5% probability of going into default. +* If a loan is currently in delinquency, then it has a 25% probability of returning to repayment, a + 60% probability of staying delinquent, and a 15% probability of going into default. +* If a loan is currently in default, then it remains in default with 100% probability. + +For simplicity, let's imagine that 12 payments are made during the life of a loan, even though +this means people who experience delinquency won't be required to repay their remaining balance. + +Let's write the code required to perform this dynamic simulation. + +```{code-cell} python +def simulate_loan_lifetime(monthly_payment): + + # Create arrays to store outputs + payments = np.zeros(12) + # Note: dtype 'U12' means a string with no more than 12 characters + statuses = np.array(4*["repaying", "delinquency", "default"], dtype="U12") + + # Everyone is repaying during their first month + payments[0] = monthly_payment + statuses[0] = "repaying" + + for month in range(1, 12): + rn = np.random.rand() + + if (statuses[month-1] == "repaying"): + if rn < 0.85: + payments[month] = monthly_payment + statuses[month] = "repaying" + elif rn < 0.95: + payments[month] = 0.0 + statuses[month] = "delinquency" + else: + payments[month] = 0.0 + statuses[month] = "default" + elif (statuses[month-1] == "delinquency"): + if rn < 0.25: + payments[month] = monthly_payment + statuses[month] = "repaying" + elif rn < 0.85: + payments[month] = 0.0 + statuses[month] = "delinquency" + else: + payments[month] = 0.0 + statuses[month] = "default" + else: # Default -- Stays in default after it gets there + payments[month] = 0.0 + statuses[month] = "default" + + return payments, statuses +``` + +We can use this model of the world to answer even more questions than the last model! + +For example, we can think about things like + +* For the defaulted loans, how many payments did they make before going into default? +* For those who partially repaid, how much was repaid before the 12 months was over? + +Unbeknownst to you, we have just introduced a well-known mathematical concept known as a Markov +chain. 
+ +A Markov chain is a random process (Note: Random process is a sequence of random variables +observed over time) where the probability of something happening tomorrow only depends on what we +can observe today. + +In our small business loan example, this just means that the small business loan's repayment status +tomorrow only depended on what its repayment status was today. + +Markov chains often show up in economics and statistics, so we decided a simple introduction would +be helpful, but we leave out many details for the interested reader to find. + +A Markov chain is defined by three objects: + +1. A description of the possible states and their associated value. +1. A complete description of the probability of moving from one state to all other states. +1. An initial distribution over the states (often a vector of all zeros except for a single 1 for + some particular state). + +For the example above, we'll define each of these three things in the Python code below. + +```{code-cell} python +# 1. State description +state_values = ["repaying", "delinquency", "default"] + +# 2. Transition probabilities: encoded in a matrix (2d-array) where element [i, j] +# is the probability of moving from state i to state j +P = np.array([[0.85, 0.1, 0.05], [0.25, 0.6, 0.15], [0, 0, 1]]) + +# 3. Initial distribution: assume loans start in repayment +x0 = np.array([1, 0, 0]) +``` + +Now that we have these objects defined, we can use the a `MarkovChain` class from the +[quantecon python library](https://github.com/QuantEcon/QuantEcon.py/) to analyze this model. + +```{code-cell} python +import quantecon as qe + +mc = qe.markov.MarkovChain(P, state_values) +``` + +We can use the `mc` object to do common Markov chain operations. + +The `simulate` method will simulate the Markov chain for a specified number of steps: + +```{code-cell} python +mc.simulate(12, init="repaying") +``` + +Suppose we were to simulate the Markov chain for an infinite number of steps. + +Given the random nature of transitions, we might end up taking different paths at any given moment. + +We can summarize all possible paths over time by keeping track of a distribution. + +Below, we will print out the distribution for the first 10 time steps, +starting from a distribution where the debtor is repaying in the first step. + +```{code-cell} python +x = x0 +for t in range(10): + print(f"At time {t} the distribution is {x}") + x = mc.P.T @ x +``` + +````{admonition} Exercise +:name: dir3-4-2 + +See exercise 2 in the {ref}`exercise list `. +```` + +````{admonition} Exercise +:name: dir3-4-3 + +See exercise 3 in the {ref}`exercise list `. +```` + +### Continuous Distributions + +Recall that a continuous distribution is one where the value can take on an uncountable number of values. + +It differs from a discrete distribution in that the events are not +countable. + +We can use simulation to learn things about continuous distributions as we did with discrete +distributions. + +Let's use simulation to study what is arguably the most commonly encountered +distributions -- the normal distribution. + +The Normal (sometimes referred to as the Gaussian distribution) is bell-shaped and completely +described by the mean and variance of that distribution. + +The mean is often referred to as $\mu$ and the variance as $\sigma^2$. + +Let's take a look at the normal distribution. 
+ +```{code-cell} python +# scipy is an extension of numpy, and the stats +# subpackage has tools for working with various probability distributions +import scipy.stats as st + +x = np.linspace(-5, 5, 100) + +# NOTE: first argument to st.norm is mean, second is standard deviation sigma (not sigma^2) +pdf_x = st.norm(0.0, 1.0).pdf(x) + +fig, ax = plt.subplots() + +ax.set_title(r"Normal Distribution ($\mu = 0, \sigma = 1$)") +ax.plot(x, pdf_x) +``` + +Another common continuous distribution used in economics is the gamma distribution. + +A gamma distribution is defined for all positive numbers and described by both a shape +parameter $k$ and a scale parameter $\theta$. + +Let's see what the distribution looks like for various choices of $k$ and $\theta$. + +```{code-cell} python +def plot_gamma(k, theta, x, ax=None): + if ax is None: + _, ax = plt.subplots() + + # scipy refers to the rate parameter beta as a scale parameter + pdf_x = st.gamma(k, scale=theta).pdf(x) + ax.plot(x, pdf_x, label=f"k = {k} theta = {theta}") + + return ax + +fig, ax = plt.subplots(figsize=(10, 6)) +x = np.linspace(0.1, 20, 130) +plot_gamma(2.0, 1.0, x, ax) +plot_gamma(3.0, 1.0, x, ax) +plot_gamma(3.0, 2.0, x, ax) +plot_gamma(3.0, 0.5, x, ax) +ax.set_ylim((0, 0.6)) +ax.set_xlim((0, 20)) +ax.legend(); +``` + +````{admonition} Exercise +:name: dir3-4-4 + +See exercise 4 in the {ref}`exercise list `. +```` + + +(ex3-4)= +## Exercises + +### Exercise 1 + +Wikipedia and other credible statistics sources tell us that the mean and +variance of the Uniform(0, 1) distribution are (1/2, 1/12) respectively. + +How could we check whether the numpy random numbers approximate these +values? + +({ref}`back to text `) + +### Exercise 2 + +In this exercise, we explore the long-run, or stationary, distribution of the Markov chain. + +The stationary distribution of a Markov chain is the probability distribution that would +result after an infinite number of steps *for any initial distribution*. + +Mathematically, a stationary distribution $x$ is a distribution where $x = P'x$. + +In the code cell below, use the `stationary_distributions` property of `mc` to +determine the stationary distribution of our Markov chain. + +After doing your computation, think about the answer... think about why our transition +probabilities must lead to this outcome. + + +```{code-cell} python +# your code here +``` + +({ref}`back to text `) + +### Exercise 3 + +Let's revisit the unemployment example from the {doc}`linear algebra lecture `. + +We'll repeat necessary details here. + +Consider an economy where in any given year, $\alpha = 5\%$ of workers lose their jobs, and +$\phi = 10\%$ of unemployed workers find jobs. + +Initially, 90% of the 1,000,000 workers are employed. + +Also suppose that the average employed worker earns 10 dollars, while an unemployed worker +earns 1 dollar per period. + +You now have four tasks: + +1. Represent this problem as a Markov chain by defining the three components defined above. +1. Construct an instance of the quantecon MarkovChain by using the objects defined in part 1. +1. Simulate the Markov chain 30 times for 50 time periods, and plot each chain over time (see + helper code below). +1. Determine the average long run payment for a worker in this setting + +```{hint} +Think about the stationary distribution. 
+``` + +```{code-cell} python +# define components here + +# construct Markov chain + +# simulate (see docstring for how to do many repetitions of +# the simulation in one function call) +# uncomment the lines below and fill in the blanks +# sim = XXXXX.simulate(XXXX) +# fig, ax = plt.subplots(figsize=(10, 8)) +# ax.plot(range(50), sim.T, alpha=0.4) + +# Long-run average payment +``` + +({ref}`back to text `) + + +### Exercise 4 + +Assume you have been given the opportunity to choose between one of three financial assets: + +You will be given the asset for free, allowed to hold it indefinitely, and keeping all payoffs. + +Also assume the assets' payoffs are distributed as follows: + +1. Normal with $\mu = 10, \sigma = 5$ +1. Gamma with $k = 5.3, \theta = 2$ +1. Gamma with $k = 5, \theta = 2$ + +Use `scipy.stats` to answer the following questions: + +- Which asset has the highest average returns? +- Which asset has the highest median returns? +- Which asset has the lowest coefficient of variation (standard deviation divided by mean)? +- Which asset would you choose? Why? + +```{hint} +There is not a single right answer here. Be creative +and express your preferences. +``` + +```{code-cell} python +# your code here +``` + +({ref}`back to text `) From f5a1415f783f821c4b204d3b8f8b19571a20e0a4 Mon Sep 17 00:00:00 2001 From: Phil Solimine <15682144+doctor-phil@users.noreply.github.com> Date: Tue, 11 Oct 2022 09:35:00 -0800 Subject: [PATCH 3/3] Revert "Update LA lecture" This reverts commit ac5c12d958f2c6c06c0cfa051a3f4c30f06f0921. --- lectures/scientific/applied_linalg.md | 1580 ++++++++++++------------- lectures/scientific/index.md | 90 +- lectures/scientific/numpy_arrays.md | 1102 ++++++++--------- lectures/scientific/optimization.md | 928 +++++++-------- lectures/scientific/plotting.md | 412 +++---- lectures/scientific/randomness.md | 1282 ++++++++++---------- 6 files changed, 2697 insertions(+), 2697 deletions(-) diff --git a/lectures/scientific/applied_linalg.md b/lectures/scientific/applied_linalg.md index a87e2d75..e039b4c3 100644 --- a/lectures/scientific/applied_linalg.md +++ b/lectures/scientific/applied_linalg.md @@ -1,790 +1,790 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# {index}`Applied Linear Algebra ` - -**Prerequisites** - -- {doc}`Introduction to Numpy ` - -**Outcomes** - -- Refresh some important linear algebra concepts -- Apply concepts to understanding unemployment and pricing portfolios -- Use `numpy` to do linear algebra operations - - -```{literalinclude} ../_static/colab_light.raw -``` - -```{code-cell} python -# import numpy to prepare for code below -import numpy as np -import matplotlib.pyplot as plt - -%matplotlib inline -``` - -## Vectors and Matrices - -### Vectors - -A (N-element) vector is $N$ numbers stored together. - -We typically write a vector as $x = \begin{bmatrix} x_1 \\ x_2 \\ \dots \\ x_N \end{bmatrix}$. - -In numpy terms, a vector is a 1-dimensional array. - -We often think of 2-element vectors as directional lines in the XY axes. - -This image, from the [QuantEcon Python lecture](https://python.quantecon.org/linear_algebra.html) -is an example of what this might look like for the vectors `(-4, 3.5)`, `(-3, 3)`, and `(2, 4)`. 
- -```{figure} ../_static/vector.png -:alt: vector.png -``` - -In a previous lecture, we saw some types of operations that can be done on -vectors, such as - -```{code-cell} python -x = np.array([1, 2, 3]) -y = np.array([4, 5, 6]) -``` - -**Element-wise operations**: Let $z = x ? y$ for some operation $?$, one of -the standard *binary* operations ($+, -, \times, \div$). Then we can write -$z = \begin{bmatrix} x_1 ? y_1 & x_2 ? y_2 \end{bmatrix}$. Element-wise operations require -that $x$ and $y$ have the same size. - -```{code-cell} python -print("Element-wise Addition", x + y) -print("Element-wise Subtraction", x - y) -print("Element-wise Multiplication", x * y) -print("Element-wise Division", x / y) -``` - -**Scalar operations**: Let $w = a ? x$ for some operation $?$, one of the -standard *binary* operations ($+, -, \times, \div$). Then we can write -$w = \begin{bmatrix} a ? x_1 & a ? x_2 \end{bmatrix}$. - -```{code-cell} python -print("Scalar Addition", 3 + x) -print("Scalar Subtraction", 3 - x) -print("Scalar Multiplication", 3 * x) -print("Scalar Division", 3 / x) -``` - -Another operation very frequently used in data science is the **dot product**. - -The dot between $x$ and $y$ is written $x \cdot y$ and is -equal to $\sum_{i=1}^N x_i y_i$. - -```{code-cell} python -print("Dot product", np.dot(x, y)) -``` - -We can also use `@` to denote dot products (and matrix multiplication which we'll see soon!). - -```{code-cell} python -print("Dot product with @", x @ y) -``` - -````{admonition} Exercise -:name: dir3-3-1 - -See exercise 1 in the {ref}`exercise list `. -```` - -```{code-cell} python ---- -tags: [hide-output] ---- -nA = 100 -nB = 50 -nassets = np.array([nA, nB]) - -i = 0.05 -durationA = 6 -durationB = 4 - -# Do your computations here - -# Compute price - -# uncomment below to see a message! -# if condition: -# print("Alice can retire") -# else: -# print("Alice cannot retire yet") -``` - -### Matrices - -An $N \times M$ matrix can be thought of as a collection of M -N-element vectors stacked side-by-side as columns. - -We write a matrix as - -$$ -\begin{bmatrix} x_{11} & x_{12} & \dots & x_{1M} \\ - x_{21} & \dots & \dots & x_{2M} \\ - \vdots & \vdots & \vdots & \vdots \\ - x_{N1} & x_{N2} & \dots & x_{NM} -\end{bmatrix} -$$ - -In numpy terms, a matrix is a 2-dimensional array. - -We can create a matrix by passing a list of lists to the `np.array` function. - -```{code-cell} python -x = np.array([[1, 2, 3], [4, 5, 6]]) -y = np.ones((2, 3)) -z = np.array([[1, 2], [3, 4], [5, 6]]) -``` - -We can perform element-wise and scalar operations as we did with vectors. In fact, we can do -these two operations on arrays of any dimension. - -```{code-cell} python -print("Element-wise Addition\n", x + y) -print("Element-wise Subtraction\n", x - y) -print("Element-wise Multiplication\n", x * y) -print("Element-wise Division\n", x / y) - -print("Scalar Addition\n", 3 + x) -print("Scalar Subtraction\n", 3 - x) -print("Scalar Multiplication\n", 3 * x) -print("Scalar Division\n", 3 / x) -``` - -Similar to how we combine vectors with a dot product, matrices can do what we'll call *matrix -multiplication*. - -Matrix multiplication is effectively a generalization of dot products. - -**Matrix multiplication**: Let $v = x \cdot y$ then we can write -$v_{ij} = \sum_{k=1}^N x_{ik} y_{kj}$ where $x_{ij}$ is notation that denotes the -element found in the ith row and jth column of the matrix $x$. 
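
To make the summation in that definition concrete, here is a minimal sketch that computes a matrix product with explicit loops and checks the result against numpy's built-in multiplication (the small matrices are just illustrative):

```{code-cell} python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])   # N x M = 3 x 2
y = np.array([[7, 8, 9], [10, 11, 12]])  # M x K = 2 x 3

N, M = x.shape
K = y.shape[1]

v = np.zeros((N, K))
for i in range(N):
    for j in range(K):
        # v[i, j] is the dot product of row i of x with column j of y
        for k in range(M):
            v[i, j] += x[i, k] * y[k, j]

print(v)
print(np.allclose(v, x @ y))  # the loop version agrees with numpy
```
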
- -The image below from [Wikipedia](https://commons.wikimedia.org/wiki/File:Matrix_multiplication_diagram.svg), -by Bilou, shows how matrix multiplication simplifies to a series of dot products: - -```{figure} ../_static/mat_mult_wiki_bilou.png -:alt: matmult.png -``` - -After looking at the math and image above, you might have realized that matrix -multiplication requires very specific matrix shapes! - -For two matrices $x, y$ to be multiplied, $x$ -must have the same number of columns as $y$ has rows. - -Formally, we require that for some integer numbers, $M, N,$ and $K$ -that if $x$ is $N \times M$ then $y$ must be $M \times -K$. - -If we think of a vector as a $1 \times M$ or $M \times 1$ matrix, we can even do -matrix multiplication between a matrix and a vector! - -Let's see some examples of this. - -```{code-cell} python -x1 = np.reshape(np.arange(6), (3, 2)) -x2 = np.array([[1, 2], [3, 4], [5, 6], [7, 8]]) -x3 = np.array([[2, 5, 2], [1, 2, 1]]) -x4 = np.ones((2, 3)) - -y1 = np.array([1, 2, 3]) -y2 = np.array([0.5, 0.5]) -``` - -Numpy allows us to do matrix multiplication in three ways. - -```{code-cell} python -print("Using the matmul function for two matrices") -print(np.matmul(x1, x4)) -print("Using the dot function for two matrices") -print(np.dot(x1, x4)) -print("Using @ for two matrices") -print(x1 @ x4) -``` - -```{code-cell} python -print("Using the matmul function for vec and mat") -print(np.matmul(y1, x1)) -print("Using the dot function for vec and mat") -print(np.dot(y1, x1)) -print("Using @ for vec and mat") -print(y1 @ x1) -``` - -Despite our options, we stick to using `@` because -it is simplest to read and write. - - -````{admonition} Exercise -:name: dir3-3-2 - -See exercise 2 in the {ref}`exercise list `. -```` - - -### Other Linear Algebra Concepts - -#### Transpose - -A matrix transpose is an operation that flips all elements of a matrix along the diagonal. - -More formally, the $(i, j)$ element of $x$ becomes the $(j, i)$ element of -$x^T$. - -In particular, let $x$ be given by - -$$ -x = \begin{bmatrix} 1 & 2 & 3 \\ - 4 & 5 & 6 \\ - 7 & 8 & 9 \\ - \end{bmatrix} -$$ - -then $x$ transpose, written as $x'$, is given by - -$$ -x = \begin{bmatrix} 1 & 4 & 7 \\ - 2 & 5 & 8 \\ - 3 & 6 & 9 \\ - \end{bmatrix} -$$ - -In Python, we do this by - -```{code-cell} python -x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) - -print("x transpose is") -print(x.transpose()) -``` - -#### Identity Matrix - -In linear algebra, one particular matrix acts very similarly to how 1 behaves for scalar numbers. - -This matrix is known as the *identity matrix* and is given by - -$$ -I = \begin{bmatrix} 1 & 0 & 0 & \dots & 0 \\ - 0 & 1 & 0 & \dots & 0 \\ - \vdots & \vdots & \ddots & \vdots & \vdots \\ - 0 & 0 & 0 & \dots & 1 - \end{bmatrix} -$$ - -As seen above, it has 1s on the diagonal and 0s everywhere else. - -When we multiply any matrix or vector by the identity matrix, we get the original matrix or vector -back! - -Let's see some examples. - -```{code-cell} python -I = np.eye(3) -x = np.reshape(np.arange(9), (3, 3)) -y = np.array([1, 2, 3]) - -print("I @ x", "\n", I @ x) -print("x @ I", "\n", x @ I) -print("I @ y", "\n", I @ y) -print("y @ I", "\n", y @ I) -``` - -#### Inverse - -If you recall, you learned in your primary education about solving equations for certain variables. - -For example, you might have been given the equation - -$$ -3x + 7 = 16 -$$ - -and then asked to solve for $x$. - -You probably did this by subtracting 7 and then dividing by 3. 
- -Now let's write an equation that contains matrices and vectors. - -$$ -\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix} -$$ - -How would we solve for $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$? - -Unfortunately, there is no "matrix divide" operation that does the opposite of matrix multiplication. - -Instead, we first have to do what's known as finding the inverse. We must multiply both sides by this inverse to solve. - -Consider some matrix $A$. - -The inverse of $A$, given by $A^{-1}$, is a matrix such that $A A^{-1} = I$ -where $I$ is our identity matrix. - -Notice in our equation above, if we can find the inverse of -$\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}$ then we can multiply both sides by the inverse -to get - -$$ -\begin{align*} -\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 3 \\ 4 \end{bmatrix} \\ -I \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 4 \end{bmatrix} \\ - \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 4 \end{bmatrix} -\end{align*} -$$ - -Computing the inverse requires that a matrix be square and satisfy some other conditions -(non-singularity) that are beyond the scope of this lecture. - -We also skip the exact details of how this inverse is computed, but, if you are interested, -you can visit the -[QuantEcon Linear Algebra lecture](https://python.quantecon.org/linear_algebra.html) -for more details. - -We demonstrate how to compute the inverse with numpy below. - -```{code-cell} python -# This is a square (N x N) non-singular matrix -A = np.array([[1, 2, 0], [3, 1, 0], [0, 1, 2]]) - -print("This is A inverse") - -print(np.linalg.inv(A)) - -print("Check that A @ A inverse is I") -print(np.linalg.inv(A) @ A) -``` - -## Portfolios - -In {doc}`control flow <../python_fundamentals/control_flow>`, we learned to value a stream of payoffs from a single -asset. - -In this section, we generalize this to value a portfolio of multiple assets, or an asset -that has easily separable components. - -Vectors and inner products give us a convenient way to organize and calculate these payoffs. - -### Static Payoffs - -As an example, consider a portfolio with 4 units of asset A, 2.5 units of asset B, and 8 units of -asset C. - -At a particular point in time, the assets pay $3$/unit of asset A, $5$/unit of B, and -$1.10$/unit of C. - -First, calculate the value of this portfolio directly with a sum. - -```{code-cell} python -4.0 * 3.0 + 2.5 * 5.0 + 8 * 1.1 -``` - -We can make this more convenient and general by using arrays for accounting, and then sum then in a -loop. - -```{code-cell} python -import numpy as np -x = np.array([4.0, 2.5, 8.0]) # portfolio units -y = np.array([3.0, 5.0, 1.1]) # payoffs -n = len(x) -p = 0.0 -for i in range(n): # i.e. 0, 1, 2 - p = p + x[i] * y[i] - -p -``` - -The above would have worked with `x` and `y` as `list` rather than `np.array`. - -Note that the general pattern above is the sum. - -$$ -p = \sum_{i=0}^{n-1} x_i y_i = x \cdot y -$$ - -This is an inner product as implemented by the `np.dot` function - -```{code-cell} python -np.dot(x, y) -``` - -This approach allows us to simultaneously price different portfolios by stacking them in a matrix and using the dot product. 
- -```{code-cell} python -y = np.array([3.0, 5.0, 1.1]) # payoffs -x1 = np.array([4.0, 2.5, 8.0]) # portfolio 1 -x2 = np.array([2.0, 1.5, 0.0]) # portfolio 2 -X = np.array((x1, x2)) - -# calculate with inner products -p1 = np.dot(X[0,:], y) -p2 = np.dot(X[1,:], y) -print("Calculating separately") -print([p1, p2]) - -# or with a matrix multiplication -print("Calculating with matrices") -P = X @ y -print(P) -``` - -### NPV of a Portfolio - -If a set of assets has payoffs over time, we can calculate the NPV of that portfolio in a similar way to the calculation in -{ref}`npv `. - -First, consider an example with an asset with claims to multiple streams of payoffs which are easily -separated. - -You are considering purchasing an oilfield with 2 oil wells, named `A` and `B` where - -- Both oilfields have a finite lifetime of 20 years. -- In oilfield `A`, you can extract 5 units in the first year, and production in each subsequent year - decreases by $20\%$ of the previous year so that - $x^A_0 = 5, x^A_1 = 0.8 \times 5, x^A_2 = 0.8^2 \times 5, \ldots$ -- In oilfield `B`, you can extract 2 units in the first year, but production only drops by - $10\%$ each year (i.e. $x^B_0 = 2, x^B_1 = 0.9 \times 2, x^B_2 = 0.9^2 \times 2, \ldots$ -- Future cash flows are discounted at a rate of $r = 0.05$ each year. -- The price for oil in both wells are normalized as $p_A = p_B = 1$. - -These traits can be separated so that the price you would be willing to pay is the sum of the two, where -we define $\gamma_A = 0.8, \gamma_B = 0.9$. - -$$ -\begin{aligned} -V_A &= \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t p_A y^A_t = \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t (p_A \, x_{A0}\, \gamma_A^t)\\ -V_B &= \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t p_B y^B_t = \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t (p_B \, x_{B0}\, \gamma_B^t)\\ -V &= V_A + V_B -\end{aligned} -$$ - -Let's compute the value of each of these assets using the dot product. - -The first question to ask yourself is: "For which two vectors should I compute the dot product?" - -It turns out that this depends on which two vectors you'd like to create. - -One reasonable choice is presented in the code below. - -```{code-cell} python -# Depreciation of production rates -gamma_A = 0.80 -gamma_B = 0.90 - -# Interest rate discounting -r = 0.05 -discount = np.array([(1 / (1+r))**t for t in range(20)]) - -# Let's first create arrays that have the production of each oilfield -oil_A = 5 * np.array([gamma_A**t for t in range(20)]) -oil_B = 2 * np.array([gamma_B**t for t in range(20)]) -oilfields = np.array([oil_A, oil_B]) - -# Use matrix multiplication to get discounted sum of oilfield values and then sum -# the two values -Vs = oilfields @ discount - -print(f"The npv of oilfields is {Vs.sum()}") -``` - -Now consider the approximation where instead of the oilfields having a finite lifetime of 20 years, -we let them produce forever, i.e. $T = \infty$. - -With a little algebra, - -$$ -V_A = p_A \sum_{t=0}^{\infty}\left(\frac{1}{1 + r}\right)^t (x_{A0} \gamma_A^t) = x_{A0}\sum_{t=0}^{\infty}\left(\frac{\gamma_A}{1 + r}\right)^t -$$ - -And, using the infinite sum formula from {doc}`Control Flow <../python_fundamentals/control_flow>` (i.e. $\sum_{t=0}^{\infty}\beta^t = (1 - \beta)^{-1}$) - -$$ -= \frac{p_A x_{A0}}{1 - \left(\gamma_A\frac{1}{1 + r} \right)} -$$ - -The $V_B$ is defined symmetrically. - -How different is this infinite horizon approximation from the $T = 20$ version, and why? 
- -Now, let's compute the $T = \infty$ version of the net present value and make a graph to help -us see how many periods are needed to approach the infinite horizon value. - -```{code-cell} python -# Depreciation of production rates -gamma_A = 0.80 -gamma_B = 0.90 - -# Interest rate discounting -r = 0.05 - - -def infhor_NPV_oilfield(starting_output, gamma, r): - beta = gamma / (1 + r) - return starting_output / (1 - beta) - - -def compute_NPV_oilfield(starting_output, gamma, r, T): - outputs = starting_output * np.array([gamma**t for t in range(T)]) - discount = np.array([(1 / (1+r))**t for t in range(T)]) - - npv = np.dot(outputs, discount) - - return npv - -Ts = np.arange(2, 75) - -NPVs_A = np.array([compute_NPV_oilfield(5, gamma_A, r, t) for t in Ts]) -NPVs_B = np.array([compute_NPV_oilfield(2, gamma_B, r, t) for t in Ts]) - -NPVs_T = NPVs_A + NPVs_B -NPV_oo = infhor_NPV_oilfield(5, gamma_A, r) + infhor_NPV_oilfield(2, gamma_B, r) - -fig, ax = plt.subplots() - -ax.set_title("NPV with Varying T") -ax.set_ylabel("NPV") - -ax.plot(Ts, NPVs_A + NPVs_B) -ax.hlines(NPV_oo, Ts[0], Ts[-1], color="k", linestyle="--") # Plot infinite horizon value - -ax.spines["right"].set_visible(False) -ax.spines["top"].set_visible(False) -``` - -It is also worth noting that the computation of the infinite horizon net present value can be -simplified even further by using matrix multiplication. That is, the formula given above is -equivalent to - -$$ -V = \begin{bmatrix}p_A & p_B \end{bmatrix} \cdot \sum_{t=0}^{\infty} \left(\left(\frac{1}{1 + r}\right)^t \begin{bmatrix} \gamma_A & 0 \\ 0 & \gamma_B \end{bmatrix}^t \cdot x_0\right) -$$ - -and where $x_0 = \begin{bmatrix} x_{A0} \\ x_{B0} \end{bmatrix}$. - -We recognize that this equation is of the form - -$$ -V = G \sum_{t=0}^{\infty} \left(\frac{1}{1 + r}\right)^t A^t x_0 -$$ - -Without proof, and given important assumptions on $\frac{1}{1 + r}$ and $A$, this -equation reduces to - -```{math} -:label: eq_deterministic_asset_pricing - -V = G \left(I - \frac{1}{1+r} A\right)^{-1} x_0 -``` - -Using the matrix inverse, where `I` is the identity matrix. - -```{code-cell} python -p_A = 1.0 -p_B = 1.0 -G = np.array([p_A, p_B]) - -r = 0.05 -beta = 1 / (1 + r) - -gamma_A = 0.80 -gamma_B = 0.90 -A = np.array([[gamma_A, 0], [0, gamma_B]]) - -x_0 = np.array([5, 2]) - -# Compute with matrix formula -NPV_mf = G @ np.linalg.inv(np.eye(2) - beta*A) @ x_0 - -print(NPV_mf) -``` - -Note: While our matrix above was very simple, this approach works for much more -complicated `A` matrices as long as we can write $x_t$ using $A$ and $x_0$ as -$x_t = A^t x_0$ (For an advanced description of this topic, adding randomness, read about -linear state-space models with Python ). - -### Unemployment Dynamics - -Consider an economy where in any given year, $\alpha = 5\%$ of workers lose their jobs and -$\phi = 10\%$ of unemployed workers find jobs. - -Define the vector $x_0 = \begin{bmatrix} 900,000 & 100,000 \end{bmatrix}$ as the number of -employed and unemployed workers (respectively) at time $0$ in the economy. - -Our goal is to determine the dynamics of unemployment in this economy. - -First, let's define the matrix. - -$$ -A = \begin{bmatrix} 1 - \alpha & \alpha \\ \phi & 1 - \phi \end{bmatrix} -$$ - -Note that with this definition, we can describe the evolution of employment and unemployment -from $x_0$ to $x_1$ using linear algebra. 
- -$$ -x_1 = \begin{bmatrix} (1 - \alpha) 900,000 + \phi 100,000 \\ \alpha 900,000 + (1-\phi) 100,000\end{bmatrix} = A' x_0 -$$ - -However, since the transitions do not change over time, we can use this to describe the evolution -from any arbitrary time $t$, so that - -$$ -x_{t+1} = A' x_t -$$ - -Let's code up a python function that will let us track the evolution of unemployment over time. - -```{code-cell} python -phi = 0.1 -alpha = 0.05 - -x0 = np.array([900_000, 100_000]) - -A = np.array([[1-alpha, alpha], [phi, 1-phi]]) - -def simulate(x0, A, T=10): - """ - Simulate the dynamics of unemployment for T periods starting from x0 - and using values of A for probabilities of moving between employment - and unemployment - """ - nX = x0.shape[0] - out = np.zeros((T, nX)) - out[0, :] = x0 - - for t in range(1, T): - out[t, :] = A.T @ out[t-1, :] - - return out -``` - -Let's use this function to plot unemployment and employment levels for 10 periods. - -```{code-cell} python -def plot_simulation(x0, A, T=100): - X = simulate(x0, A, T) - fig, ax = plt.subplots() - ax.plot(X[:, 0]) - ax.plot(X[:, 1]) - ax.set_xlabel("t") - ax.legend(["Employed", "Unemployed"]) - return ax - -plot_simulation(x0, A, 50) -``` - -Notice that the levels of unemployed an employed workers seem to be heading to constant numbers. - -We refer to this phenomenon as *convergence* because the values appear to converge to a constant -number. - -Let's check that the values are permanently converging. - -```{code-cell} python -plot_simulation(x0, A, 5000) -``` - -The convergence of this system is a property determined by the matrix $A$. - -The long-run distribution of employed and unemployed workers is equal to the largest [eigenvector](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) -of $A'$, corresponding to the eigenvalue equal to 1. An eigenvalue of $A'$ is also known as a "left-eigenvector" of A. - -Let's have numpy compute the eigenvalues and eigenvectors and compare the results to our simulated results above: - -```{code-cell} python -eigvals, eigvecs = np.linalg.eig(A.T) -for i in range(len(eigvals)): - if eigvals[i] == 1: - which_eig = i - break - -print(f"We are looking for eigenvalue {which_eig}") -``` - -Now let's look at the corresponding eigenvector: - -```{code-cell} python -dist = eigvecs[:, which_eig] - -# need to divide by sum so it adds to 1 -dist /= dist.sum() - -print(f"The distribution of workers is given by {dist}") -``` - - -````{admonition} Exercise -:name: dir3-3-3 - -See exercise 3 in the {ref}`exercise list `. -```` - -(ex3-3)= -## Exercises - -### Exercise 1 - -Alice is a stock broker who owns two types of assets: A and B. She owns 100 -units of asset A and 50 units of asset B. The current interest rate is 5%. -Each of the A assets have a remaining duration of 6 years and pay -\$1500 each year, while each of the B assets have a remaining duration -of 4 years and pay \$500 each year. Alice would like to retire if she -can sell her assets for more than \$500,000. Use vector addition, scalar -multiplication, and dot products to determine whether she can retire. - -({ref}`back to text `) - -### Exercise 2 - -Which of the following operations will work and which will -create errors because of size issues? 
- -Test out your intuitions in the code cell below - -```{code-block} python -x1 @ x2 -x2 @ x1 -x2 @ x3 -x3 @ x2 -x1 @ x3 -x4 @ y1 -x4 @ y2 -y1 @ x4 -y2 @ x4 -``` - -```{code-cell} python -# testing area -``` - -({ref}`back to text `) - -### Exercise 3 - -Compare the distribution above to the final values of a long simulation. - -If you multiply the distribution by 1,000,000 (the number of workers), do you get (roughly) the same number as the simulation? - -```{code-cell} python -# your code here -``` - -({ref}`back to text `) +--- +jupytext: + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# {index}`Applied Linear Algebra ` + +**Prerequisites** + +- {doc}`Introduction to Numpy ` + +**Outcomes** + +- Refresh some important linear algebra concepts +- Apply concepts to understanding unemployment and pricing portfolios +- Use `numpy` to do linear algebra operations + + +```{literalinclude} ../_static/colab_light.raw +``` + +```{code-cell} python +# import numpy to prepare for code below +import numpy as np +import matplotlib.pyplot as plt + +%matplotlib inline +``` + +## Vectors and Matrices + +### Vectors + +A (N-element) vector is $N$ numbers stored together. + +We typically write a vector as $x = \begin{bmatrix} x_1 \\ x_2 \\ \dots \\ x_N \end{bmatrix}$. + +In numpy terms, a vector is a 1-dimensional array. + +We often think of 2-element vectors as directional lines in the XY axes. + +This image, from the [QuantEcon Python lecture](https://python.quantecon.org/linear_algebra.html) +is an example of what this might look like for the vectors `(-4, 3.5)`, `(-3, 3)`, and `(2, 4)`. + +```{figure} ../_static/vector.png +:alt: vector.png +``` + +In a previous lecture, we saw some types of operations that can be done on +vectors, such as + +```{code-cell} python +x = np.array([1, 2, 3]) +y = np.array([4, 5, 6]) +``` + +**Element-wise operations**: Let $z = x ? y$ for some operation $?$, one of +the standard *binary* operations ($+, -, \times, \div$). Then we can write +$z = \begin{bmatrix} x_1 ? y_1 & x_2 ? y_2 \end{bmatrix}$. Element-wise operations require +that $x$ and $y$ have the same size. + +```{code-cell} python +print("Element-wise Addition", x + y) +print("Element-wise Subtraction", x - y) +print("Element-wise Multiplication", x * y) +print("Element-wise Division", x / y) +``` + +**Scalar operations**: Let $w = a ? x$ for some operation $?$, one of the +standard *binary* operations ($+, -, \times, \div$). Then we can write +$w = \begin{bmatrix} a ? x_1 & a ? x_2 \end{bmatrix}$. + +```{code-cell} python +print("Scalar Addition", 3 + x) +print("Scalar Subtraction", 3 - x) +print("Scalar Multiplication", 3 * x) +print("Scalar Division", 3 / x) +``` + +Another operation very frequently used in data science is the **dot product**. + +The dot between $x$ and $y$ is written $x \cdot y$ and is +equal to $\sum_{i=1}^N x_i y_i$. + +```{code-cell} python +print("Dot product", np.dot(x, y)) +``` + +We can also use `@` to denote dot products (and matrix multiplication which we'll see soon!). + +```{code-cell} python +print("Dot product with @", x @ y) +``` + +````{admonition} Exercise +:name: dir3-3-1 + +See exercise 1 in the {ref}`exercise list `. +```` + +```{code-cell} python +--- +tags: [hide-output] +--- +nA = 100 +nB = 50 +nassets = np.array([nA, nB]) + +i = 0.05 +durationA = 6 +durationB = 4 + +# Do your computations here + +# Compute price + +# uncomment below to see a message! 
+# if condition: +# print("Alice can retire") +# else: +# print("Alice cannot retire yet") +``` + +### Matrices + +An $N \times M$ matrix can be thought of as a collection of M +N-element vectors stacked side-by-side as columns. + +We write a matrix as + +$$ +\begin{bmatrix} x_{11} & x_{12} & \dots & x_{1M} \\ + x_{21} & \dots & \dots & x_{2M} \\ + \vdots & \vdots & \vdots & \vdots \\ + x_{N1} & x_{N2} & \dots & x_{NM} +\end{bmatrix} +$$ + +In numpy terms, a matrix is a 2-dimensional array. + +We can create a matrix by passing a list of lists to the `np.array` function. + +```{code-cell} python +x = np.array([[1, 2, 3], [4, 5, 6]]) +y = np.ones((2, 3)) +z = np.array([[1, 2], [3, 4], [5, 6]]) +``` + +We can perform element-wise and scalar operations as we did with vectors. In fact, we can do +these two operations on arrays of any dimension. + +```{code-cell} python +print("Element-wise Addition\n", x + y) +print("Element-wise Subtraction\n", x - y) +print("Element-wise Multiplication\n", x * y) +print("Element-wise Division\n", x / y) + +print("Scalar Addition\n", 3 + x) +print("Scalar Subtraction\n", 3 - x) +print("Scalar Multiplication\n", 3 * x) +print("Scalar Division\n", 3 / x) +``` + +Similar to how we combine vectors with a dot product, matrices can do what we'll call *matrix +multiplication*. + +Matrix multiplication is effectively a generalization of dot products. + +**Matrix multiplication**: Let $v = x \cdot y$ then we can write +$v_{ij} = \sum_{k=1}^N x_{ik} y_{kj}$ where $x_{ij}$ is notation that denotes the +element found in the ith row and jth column of the matrix $x$. + +The image below from [Wikipedia](https://commons.wikimedia.org/wiki/File:Matrix_multiplication_diagram.svg), +by Bilou, shows how matrix multiplication simplifies to a series of dot products: + +```{figure} ../_static/mat_mult_wiki_bilou.png +:alt: matmult.png +``` + +After looking at the math and image above, you might have realized that matrix +multiplication requires very specific matrix shapes! + +For two matrices $x, y$ to be multiplied, $x$ +must have the same number of columns as $y$ has rows. + +Formally, we require that for some integer numbers, $M, N,$ and $K$ +that if $x$ is $N \times M$ then $y$ must be $M \times +K$. + +If we think of a vector as a $1 \times M$ or $M \times 1$ matrix, we can even do +matrix multiplication between a matrix and a vector! + +Let's see some examples of this. + +```{code-cell} python +x1 = np.reshape(np.arange(6), (3, 2)) +x2 = np.array([[1, 2], [3, 4], [5, 6], [7, 8]]) +x3 = np.array([[2, 5, 2], [1, 2, 1]]) +x4 = np.ones((2, 3)) + +y1 = np.array([1, 2, 3]) +y2 = np.array([0.5, 0.5]) +``` + +Numpy allows us to do matrix multiplication in three ways. + +```{code-cell} python +print("Using the matmul function for two matrices") +print(np.matmul(x1, x4)) +print("Using the dot function for two matrices") +print(np.dot(x1, x4)) +print("Using @ for two matrices") +print(x1 @ x4) +``` + +```{code-cell} python +print("Using the matmul function for vec and mat") +print(np.matmul(y1, x1)) +print("Using the dot function for vec and mat") +print(np.dot(y1, x1)) +print("Using @ for vec and mat") +print(y1 @ x1) +``` + +Despite our options, we stick to using `@` because +it is simplest to read and write. + + +````{admonition} Exercise +:name: dir3-3-2 + +See exercise 2 in the {ref}`exercise list `. +```` + + +### Other Linear Algebra Concepts + +#### Transpose + +A matrix transpose is an operation that flips all elements of a matrix along the diagonal. 
+ +More formally, the $(i, j)$ element of $x$ becomes the $(j, i)$ element of +$x^T$. + +In particular, let $x$ be given by + +$$ +x = \begin{bmatrix} 1 & 2 & 3 \\ + 4 & 5 & 6 \\ + 7 & 8 & 9 \\ + \end{bmatrix} +$$ + +then $x$ transpose, written as $x'$, is given by + +$$ +x = \begin{bmatrix} 1 & 4 & 7 \\ + 2 & 5 & 8 \\ + 3 & 6 & 9 \\ + \end{bmatrix} +$$ + +In Python, we do this by + +```{code-cell} python +x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) + +print("x transpose is") +print(x.transpose()) +``` + +#### Identity Matrix + +In linear algebra, one particular matrix acts very similarly to how 1 behaves for scalar numbers. + +This matrix is known as the *identity matrix* and is given by + +$$ +I = \begin{bmatrix} 1 & 0 & 0 & \dots & 0 \\ + 0 & 1 & 0 & \dots & 0 \\ + \vdots & \vdots & \ddots & \vdots & \vdots \\ + 0 & 0 & 0 & \dots & 1 + \end{bmatrix} +$$ + +As seen above, it has 1s on the diagonal and 0s everywhere else. + +When we multiply any matrix or vector by the identity matrix, we get the original matrix or vector +back! + +Let's see some examples. + +```{code-cell} python +I = np.eye(3) +x = np.reshape(np.arange(9), (3, 3)) +y = np.array([1, 2, 3]) + +print("I @ x", "\n", I @ x) +print("x @ I", "\n", x @ I) +print("I @ y", "\n", I @ y) +print("y @ I", "\n", y @ I) +``` + +#### Inverse + +If you recall, you learned in your primary education about solving equations for certain variables. + +For example, you might have been given the equation + +$$ +3x + 7 = 16 +$$ + +and then asked to solve for $x$. + +You probably did this by subtracting 7 and then dividing by 3. + +Now let's write an equation that contains matrices and vectors. + +$$ +\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix} +$$ + +How would we solve for $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$? + +Unfortunately, there is no "matrix divide" operation that does the opposite of matrix multiplication. + +Instead, we first have to do what's known as finding the inverse. We must multiply both sides by this inverse to solve. + +Consider some matrix $A$. + +The inverse of $A$, given by $A^{-1}$, is a matrix such that $A A^{-1} = I$ +where $I$ is our identity matrix. + +Notice in our equation above, if we can find the inverse of +$\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}$ then we can multiply both sides by the inverse +to get + +$$ +\begin{align*} +\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 3 \\ 4 \end{bmatrix} \\ +I \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 4 \end{bmatrix} \\ + \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} &= \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 4 \end{bmatrix} +\end{align*} +$$ + +Computing the inverse requires that a matrix be square and satisfy some other conditions +(non-singularity) that are beyond the scope of this lecture. + +We also skip the exact details of how this inverse is computed, but, if you are interested, +you can visit the +[QuantEcon Linear Algebra lecture](https://python.quantecon.org/linear_algebra.html) +for more details. + +We demonstrate how to compute the inverse with numpy below. 
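Before doing so, one practical aside (not needed for the rest of this lecture): when the goal is only to solve a single system such as $Ax = b$ above, numpy's `np.linalg.solve` is usually preferred to forming the inverse explicitly, since it skips the extra work of computing $A^{-1}$ and tends to be more numerically stable. A minimal sketch, reusing the $2 \times 2$ system from above (the variable names here are our own, not part of the original example):

```{code-cell} python
# Solve the 2x2 system from above directly, without forming an inverse
# (variable names are illustrative only)
A_small = np.array([[1, 2], [3, 1]])
b = np.array([3, 4])

x_sol = np.linalg.solve(A_small, b)     # solves A_small @ x_sol = b
print("solution x =", x_sol)
print("check A_small @ x_sol =", A_small @ x_sol)   # should reproduce b
```

With that noted, here is the inverse computation: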
+ +```{code-cell} python +# This is a square (N x N) non-singular matrix +A = np.array([[1, 2, 0], [3, 1, 0], [0, 1, 2]]) + +print("This is A inverse") + +print(np.linalg.inv(A)) + +print("Check that A @ A inverse is I") +print(np.linalg.inv(A) @ A) +``` + +## Portfolios + +In {doc}`control flow <../python_fundamentals/control_flow>`, we learned to value a stream of payoffs from a single +asset. + +In this section, we generalize this to value a portfolio of multiple assets, or an asset +that has easily separable components. + +Vectors and inner products give us a convenient way to organize and calculate these payoffs. + +### Static Payoffs + +As an example, consider a portfolio with 4 units of asset A, 2.5 units of asset B, and 8 units of +asset C. + +At a particular point in time, the assets pay $3$/unit of asset A, $5$/unit of B, and +$1.10$/unit of C. + +First, calculate the value of this portfolio directly with a sum. + +```{code-cell} python +4.0 * 3.0 + 2.5 * 5.0 + 8 * 1.1 +``` + +We can make this more convenient and general by using arrays for accounting, and then sum then in a +loop. + +```{code-cell} python +import numpy as np +x = np.array([4.0, 2.5, 8.0]) # portfolio units +y = np.array([3.0, 5.0, 1.1]) # payoffs +n = len(x) +p = 0.0 +for i in range(n): # i.e. 0, 1, 2 + p = p + x[i] * y[i] + +p +``` + +The above would have worked with `x` and `y` as `list` rather than `np.array`. + +Note that the general pattern above is the sum. + +$$ +p = \sum_{i=0}^{n-1} x_i y_i = x \cdot y +$$ + +This is an inner product as implemented by the `np.dot` function + +```{code-cell} python +np.dot(x, y) +``` + +This approach allows us to simultaneously price different portfolios by stacking them in a matrix and using the dot product. + +```{code-cell} python +y = np.array([3.0, 5.0, 1.1]) # payoffs +x1 = np.array([4.0, 2.5, 8.0]) # portfolio 1 +x2 = np.array([2.0, 1.5, 0.0]) # portfolio 2 +X = np.array((x1, x2)) + +# calculate with inner products +p1 = np.dot(X[0,:], y) +p2 = np.dot(X[1,:], y) +print("Calculating separately") +print([p1, p2]) + +# or with a matrix multiplication +print("Calculating with matrices") +P = X @ y +print(P) +``` + +### NPV of a Portfolio + +If a set of assets has payoffs over time, we can calculate the NPV of that portfolio in a similar way to the calculation in +{ref}`npv `. + +First, consider an example with an asset with claims to multiple streams of payoffs which are easily +separated. + +You are considering purchasing an oilfield with 2 oil wells, named `A` and `B` where + +- Both oilfields have a finite lifetime of 20 years. +- In oilfield `A`, you can extract 5 units in the first year, and production in each subsequent year + decreases by $20\%$ of the previous year so that + $x^A_0 = 5, x^A_1 = 0.8 \times 5, x^A_2 = 0.8^2 \times 5, \ldots$ +- In oilfield `B`, you can extract 2 units in the first year, but production only drops by + $10\%$ each year (i.e. $x^B_0 = 2, x^B_1 = 0.9 \times 2, x^B_2 = 0.9^2 \times 2, \ldots$ +- Future cash flows are discounted at a rate of $r = 0.05$ each year. +- The price for oil in both wells are normalized as $p_A = p_B = 1$. + +These traits can be separated so that the price you would be willing to pay is the sum of the two, where +we define $\gamma_A = 0.8, \gamma_B = 0.9$. 
+ +$$ +\begin{aligned} +V_A &= \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t p_A y^A_t = \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t (p_A \, x_{A0}\, \gamma_A^t)\\ +V_B &= \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t p_B y^B_t = \sum_{t=0}^{T-1} \left(\frac{1}{1 + r}\right)^t (p_B \, x_{B0}\, \gamma_B^t)\\ +V &= V_A + V_B +\end{aligned} +$$ + +Let's compute the value of each of these assets using the dot product. + +The first question to ask yourself is: "For which two vectors should I compute the dot product?" + +It turns out that this depends on which two vectors you'd like to create. + +One reasonable choice is presented in the code below. + +```{code-cell} python +# Depreciation of production rates +gamma_A = 0.80 +gamma_B = 0.90 + +# Interest rate discounting +r = 0.05 +discount = np.array([(1 / (1+r))**t for t in range(20)]) + +# Let's first create arrays that have the production of each oilfield +oil_A = 5 * np.array([gamma_A**t for t in range(20)]) +oil_B = 2 * np.array([gamma_B**t for t in range(20)]) +oilfields = np.array([oil_A, oil_B]) + +# Use matrix multiplication to get discounted sum of oilfield values and then sum +# the two values +Vs = oilfields @ discount + +print(f"The npv of oilfields is {Vs.sum()}") +``` + +Now consider the approximation where instead of the oilfields having a finite lifetime of 20 years, +we let them produce forever, i.e. $T = \infty$. + +With a little algebra, + +$$ +V_A = p_A \sum_{t=0}^{\infty}\left(\frac{1}{1 + r}\right)^t (x_{A0} \gamma_A^t) = x_{A0}\sum_{t=0}^{\infty}\left(\frac{\gamma_A}{1 + r}\right)^t +$$ + +And, using the infinite sum formula from {doc}`Control Flow <../python_fundamentals/control_flow>` (i.e. $\sum_{t=0}^{\infty}\beta^t = (1 - \beta)^{-1}$) + +$$ += \frac{p_A x_{A0}}{1 - \left(\gamma_A\frac{1}{1 + r} \right)} +$$ + +The $V_B$ is defined symmetrically. + +How different is this infinite horizon approximation from the $T = 20$ version, and why? + +Now, let's compute the $T = \infty$ version of the net present value and make a graph to help +us see how many periods are needed to approach the infinite horizon value. + +```{code-cell} python +# Depreciation of production rates +gamma_A = 0.80 +gamma_B = 0.90 + +# Interest rate discounting +r = 0.05 + + +def infhor_NPV_oilfield(starting_output, gamma, r): + beta = gamma / (1 + r) + return starting_output / (1 - beta) + + +def compute_NPV_oilfield(starting_output, gamma, r, T): + outputs = starting_output * np.array([gamma**t for t in range(T)]) + discount = np.array([(1 / (1+r))**t for t in range(T)]) + + npv = np.dot(outputs, discount) + + return npv + +Ts = np.arange(2, 75) + +NPVs_A = np.array([compute_NPV_oilfield(5, gamma_A, r, t) for t in Ts]) +NPVs_B = np.array([compute_NPV_oilfield(2, gamma_B, r, t) for t in Ts]) + +NPVs_T = NPVs_A + NPVs_B +NPV_oo = infhor_NPV_oilfield(5, gamma_A, r) + infhor_NPV_oilfield(2, gamma_B, r) + +fig, ax = plt.subplots() + +ax.set_title("NPV with Varying T") +ax.set_ylabel("NPV") + +ax.plot(Ts, NPVs_A + NPVs_B) +ax.hlines(NPV_oo, Ts[0], Ts[-1], color="k", linestyle="--") # Plot infinite horizon value + +ax.spines["right"].set_visible(False) +ax.spines["top"].set_visible(False) +``` + +It is also worth noting that the computation of the infinite horizon net present value can be +simplified even further by using matrix multiplication. 
That is, the formula given above is +equivalent to + +$$ +V = \begin{bmatrix}p_A & p_B \end{bmatrix} \cdot \sum_{t=0}^{\infty} \left(\left(\frac{1}{1 + r}\right)^t \begin{bmatrix} \gamma_A & 0 \\ 0 & \gamma_B \end{bmatrix}^t \cdot x_0\right) +$$ + +and where $x_0 = \begin{bmatrix} x_{A0} \\ x_{B0} \end{bmatrix}$. + +We recognize that this equation is of the form + +$$ +V = G \sum_{t=0}^{\infty} \left(\frac{1}{1 + r}\right)^t A^t x_0 +$$ + +Without proof, and given important assumptions on $\frac{1}{1 + r}$ and $A$, this +equation reduces to + +```{math} +:label: eq_deterministic_asset_pricing + +V = G \left(I - \frac{1}{1+r} A\right)^{-1} x_0 +``` + +Using the matrix inverse, where `I` is the identity matrix. + +```{code-cell} python +p_A = 1.0 +p_B = 1.0 +G = np.array([p_A, p_B]) + +r = 0.05 +beta = 1 / (1 + r) + +gamma_A = 0.80 +gamma_B = 0.90 +A = np.array([[gamma_A, 0], [0, gamma_B]]) + +x_0 = np.array([5, 2]) + +# Compute with matrix formula +NPV_mf = G @ np.linalg.inv(np.eye(2) - beta*A) @ x_0 + +print(NPV_mf) +``` + +Note: While our matrix above was very simple, this approach works for much more +complicated `A` matrices as long as we can write $x_t$ using $A$ and $x_0$ as +$x_t = A^t x_0$ (For an advanced description of this topic, adding randomness, read about +linear state-space models with Python ). + +### Unemployment Dynamics + +Consider an economy where in any given year, $\alpha = 5\%$ of workers lose their jobs and +$\phi = 10\%$ of unemployed workers find jobs. + +Define the vector $x_0 = \begin{bmatrix} 900,000 & 100,000 \end{bmatrix}$ as the number of +employed and unemployed workers (respectively) at time $0$ in the economy. + +Our goal is to determine the dynamics of unemployment in this economy. + +First, let's define the matrix. + +$$ +A = \begin{bmatrix} 1 - \alpha & \alpha \\ \phi & 1 - \phi \end{bmatrix} +$$ + +Note that with this definition, we can describe the evolution of employment and unemployment +from $x_0$ to $x_1$ using linear algebra. + +$$ +x_1 = \begin{bmatrix} (1 - \alpha) 900,000 + \phi 100,000 \\ \alpha 900,000 + (1-\phi) 100,000\end{bmatrix} = A' x_0 +$$ + +However, since the transitions do not change over time, we can use this to describe the evolution +from any arbitrary time $t$, so that + +$$ +x_{t+1} = A' x_t +$$ + +Let's code up a python function that will let us track the evolution of unemployment over time. + +```{code-cell} python +phi = 0.1 +alpha = 0.05 + +x0 = np.array([900_000, 100_000]) + +A = np.array([[1-alpha, alpha], [phi, 1-phi]]) + +def simulate(x0, A, T=10): + """ + Simulate the dynamics of unemployment for T periods starting from x0 + and using values of A for probabilities of moving between employment + and unemployment + """ + nX = x0.shape[0] + out = np.zeros((T, nX)) + out[0, :] = x0 + + for t in range(1, T): + out[t, :] = A.T @ out[t-1, :] + + return out +``` + +Let's use this function to plot unemployment and employment levels for 10 periods. + +```{code-cell} python +def plot_simulation(x0, A, T=100): + X = simulate(x0, A, T) + fig, ax = plt.subplots() + ax.plot(X[:, 0]) + ax.plot(X[:, 1]) + ax.set_xlabel("t") + ax.legend(["Employed", "Unemployed"]) + return ax + +plot_simulation(x0, A, 50) +``` + +Notice that the levels of unemployed an employed workers seem to be heading to constant numbers. + +We refer to this phenomenon as *convergence* because the values appear to converge to a constant +number. + +Let's check that the values are permanently converging. 
+ +```{code-cell} python +plot_simulation(x0, A, 5000) +``` + +The convergence of this system is a property determined by the matrix $A$. + +The long-run distribution of employed and unemployed workers is equal to the largest [eigenvector](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) +of $A'$, corresponding to the eigenvalue equal to 1. An eigenvalue of $A'$ is also known as a "left-eigenvector" of A. + +Let's have numpy compute the eigenvalues and eigenvectors and compare the results to our simulated results above: + +```{code-cell} python +eigvals, eigvecs = np.linalg.eig(A.T) +for i in range(len(eigvals)): + if eigvals[i] == 1: + which_eig = i + break + +print(f"We are looking for eigenvalue {which_eig}") +``` + +Now let's look at the corresponding eigenvector: + +```{code-cell} python +dist = eigvecs[:, which_eig] + +# need to divide by sum so it adds to 1 +dist /= dist.sum() + +print(f"The distribution of workers is given by {dist}") +``` + + +````{admonition} Exercise +:name: dir3-3-3 + +See exercise 3 in the {ref}`exercise list `. +```` + +(ex3-3)= +## Exercises + +### Exercise 1 + +Alice is a stock broker who owns two types of assets: A and B. She owns 100 +units of asset A and 50 units of asset B. The current interest rate is 5%. +Each of the A assets have a remaining duration of 6 years and pay +\$1500 each year, while each of the B assets have a remaining duration +of 4 years and pay \$500 each year. Alice would like to retire if she +can sell her assets for more than \$500,000. Use vector addition, scalar +multiplication, and dot products to determine whether she can retire. + +({ref}`back to text `) + +### Exercise 2 + +Which of the following operations will work and which will +create errors because of size issues? + +Test out your intuitions in the code cell below + +```{code-block} python +x1 @ x2 +x2 @ x1 +x2 @ x3 +x3 @ x2 +x1 @ x3 +x4 @ y1 +x4 @ y2 +y1 @ x4 +y2 @ x4 +``` + +```{code-cell} python +# testing area +``` + +({ref}`back to text `) + +### Exercise 3 + +Compare the distribution above to the final values of a long simulation. + +If you multiply the distribution by 1,000,000 (the number of workers), do you get (roughly) the same number as the simulation? + +```{code-cell} python +# your code here +``` + +({ref}`back to text `) diff --git a/lectures/scientific/index.md b/lectures/scientific/index.md index d341f041..2cf55671 100644 --- a/lectures/scientific/index.md +++ b/lectures/scientific/index.md @@ -1,46 +1,46 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# Scientific Computing - -This section discusses several key aspects of scientific computing that enable modern economics, data science, and statistics. - -As the size of our data and the complexity of our models have increased (and continue doing so), we have become more reliant on computers to perform computations that we simply cannot do by hand. - -In this section, we will cover - -- Python's main numerical library numpy and how to work with its array type. -- A basic introduction to visualizing data with matplotlib. -- A refresher on some key linear algebra concepts. -- A review of basic probability concepts and how to use simulation in learning economics. -- Using a computer to perform optimization. - -Many of the tools learned in this section will continue to show up throughout the -{doc}`pandas <../pandas/index>` and {doc}`applications <../applications/index>` sections. 
- -```{warning} -This section has more formal math than the previous material (and there will be more -math as you cover certain methods). - -We expect that students' mathematical backgrounds will range widely, so for those who have slightly less preparation, please don't let this scare you. - -We have found that although understanding these tools will require some extra effort, it will give you a leg up in almost any career you might consider. -``` - -## [Introduction to Numpy](../scientific/numpy_arrays.md) - -## [Plotting](../scientific/plotting.md) - -## [Applied Linear Algebra](../scientific/applied_linalg.md) - -## [Randomness](../scientific/randomness.md) - +--- +jupytext: + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# Scientific Computing + +This section discusses several key aspects of scientific computing that enable modern economics, data science, and statistics. + +As the size of our data and the complexity of our models have increased (and continue doing so), we have become more reliant on computers to perform computations that we simply cannot do by hand. + +In this section, we will cover + +- Python's main numerical library numpy and how to work with its array type. +- A basic introduction to visualizing data with matplotlib. +- A refresher on some key linear algebra concepts. +- A review of basic probability concepts and how to use simulation in learning economics. +- Using a computer to perform optimization. + +Many of the tools learned in this section will continue to show up throughout the +{doc}`pandas <../pandas/index>` and {doc}`applications <../applications/index>` sections. + +```{warning} +This section has more formal math than the previous material (and there will be more +math as you cover certain methods). + +We expect that students' mathematical backgrounds will range widely, so for those who have slightly less preparation, please don't let this scare you. + +We have found that although understanding these tools will require some extra effort, it will give you a leg up in almost any career you might consider. +``` + +## [Introduction to Numpy](../scientific/numpy_arrays.md) + +## [Plotting](../scientific/plotting.md) + +## [Applied Linear Algebra](../scientific/applied_linalg.md) + +## [Randomness](../scientific/randomness.md) + ## [Optimization](../scientific/optimization.md) \ No newline at end of file diff --git a/lectures/scientific/numpy_arrays.md b/lectures/scientific/numpy_arrays.md index 65883f80..a01addc6 100644 --- a/lectures/scientific/numpy_arrays.md +++ b/lectures/scientific/numpy_arrays.md @@ -1,552 +1,552 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# Introduction to Numpy - -**Prerequisites** - -- {doc}`Python Fundamentals <../python_fundamentals/index>` - -**Outcomes** - -- Understand basics about numpy arrays -- Index into multi-dimensional arrays -- Use universal functions/broadcasting to do element-wise operations on arrays - - -## Numpy Arrays - -Now that we have learned the fundamentals of programming in Python, we will learn how we can use Python -to perform the computations required in data science and economics. We call these the "scientific Python tools". - -The foundational library that helps us perform these computations is known as `numpy` (numerical -Python). - -Numpy's core contribution is a new data-type called an *array*. 
- -An array is similar to a list, but numpy imposes some additional restrictions on how the data inside is organized. - -These restrictions allow numpy to - -1. Be more efficient in performing mathematical and scientific computations. -1. Expose functions that allow numpy to do the necessary linear algebra for machine learning and statistics. - -Before we get started, please note that the convention for importing the numpy package is to use the -nickname `np`: - -```{code-cell} python -import numpy as np -``` - -### What is an Array? - -An array is a multi-dimensional grid of values. - -What does this mean? It is easier to demonstrate than to explain. - -In this block of code, we build a 1-dimensional array. - -```{code-cell} python -# create an array from a list -x_1d = np.array([1, 2, 3]) -print(x_1d) -``` - -You can think of a 1-dimensional array as a list of numbers. - -```{code-cell} python -# We can index like we did with lists -print(x_1d[0]) -print(x_1d[0:2]) -``` - -Note that the range of indices does not include the end-point, that -is - -```{code-cell} python -print(x_1d[0:3] == x_1d[:]) -print(x_1d[0:2]) -``` - -The differences emerge as we move into higher dimensions. - -Next, we define a 2-dimensional array (a matrix) - -```{code-cell} python -x_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) -print(x_2d) -``` - -Notice that the data is no longer represented as something flat, but rather, -as three rows and three columns of numbers. - -The first question that you might ask yourself is: "how do I access the values in this array?" - -You access each element by specifying a row first and then a column. For -example, if we wanted to access the `6`, we would ask for the (1, 2) element. - -```{code-cell} python -print(x_2d[1, 2]) # Indexing into two dimensions! -``` - -Or to get the top left corner... - -```{code-cell} python -print(x_2d[0, 0]) # Indexing into two dimensions! -``` - -To get the first, and then second rows... - -```{code-cell} python -print(x_2d[0, :]) -print(x_2d[1, :]) -``` - -Or the columns... - -```{code-cell} python -print(x_2d[:, 0]) -print(x_2d[:, 1]) -``` - -This continues to generalize, since numpy gives us as many dimensions as we want in an array. - -For example, we build a 3-dimensional array below. - -```{code-cell} python -x_3d_list = [[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]] -x_3d = np.array(x_3d_list) -print(x_3d) -``` - -#### Array Indexing - -Now that there are multiple dimensions, indexing might feel somewhat non-obvious. - -Do the rows or columns come first? In higher dimensions, what is the order of -the index? - -Notice that the array is built using a list of lists (you could also use tuples!). - -Indexing into the array will correspond to choosing elements from each list. - -First, notice that the dimensions give two stacked matrices, which we can access with - -```{code-cell} python -print(x_3d[0]) -print(x_3d[1]) -``` - -In the case of the first, it is synonymous with - -```{code-cell} python -print(x_3d[0, :, :]) -``` - -Let's work through another example to further clarify this concept with our -3-dimensional array. - -Our goal will be to find the index that retrieves the `4` out of `x_3d`. - -Recall that when we created `x_3d`, we used the list `[[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]]`. - -Notice that the 0 element of that list is `[[1, 2, 3], [4, 5, 6]]`. This is the -list that contains the `4` so the first index we would use is a 0. 
- -```{code-cell} python -print(f"The 0 element is {x_3d_list[0]}") -print(f"The 1 element is {x_3d_list[1]}") -``` - -We then move to the next lists which were the 0 element of the inner-most dimension. Notice that -the two lists at this level `[1, 2, 3]` and `[3, 4, 5]`. - -The 4 is in the second 1 element (index `1`), so the second index we would choose is 1. - -```{code-cell} python -print(f"The 0 element of the 0 element is {x_3d_list[0][0]}") -print(f"The 1 element of the 0 element is {x_3d_list[0][1]}") -``` - -Finally, we move to the outer-most dimension, which has a list of numbers -`[4, 5, 6]`. - -The 4 is element 0 of this list, so the third, or outer-most index, would be `0`. - -```{code-cell} python -print(f"The 0 element of the 1 element of the 0 element is {x_3d_list[0][1][0]}") -``` - -Now we can use these same indices to index into the array. With an array, we can index using a single operation rather than repeated indexing as we did with the list `x_3d_list[0][1][0]`. - -Let's test it to see whether we did it correctly! - -```{code-cell} python -print(x_3d[0, 1, 0]) -``` - -Success! - -````{admonition} Exercise -:name: dir3-1-1 - -See exercise 1 in the {ref}`exercise list `. -```` - -````{admonition} Exercise -:name: dir3-1-2 - -See exercise 2 in the {ref}`exercise list `. -```` - -We can also select multiple elements at a time -- this is called slicing. - -If we wanted to have an array with just `[1, 2, 3]` then we would do - -```{code-cell} python -print(x_3d[0, 0, :]) -``` - -Notice that we put a `:` on the dimension where we want to select all of the elements. We can also -slice out subsets of the elements by doing `start:stop+1`. - -Notice how the following arrays differ. - -```{code-cell} python -print(x_3d[:, 0, :]) -print(x_3d[:, 0, 0:2]) -print(x_3d[:, 0, :2]) # the 0 in 0:2 is optional -``` - -````{admonition} Exercise -:name: dir3-1-3 - -See exercise 3 in the {ref}`exercise list `. -```` - - -### Array Functionality - -#### Array Properties - -All numpy arrays have various useful properties. - -Properties are similar to methods in that they're accessed through -the "dot notation." However, they aren't a function so we don't need parentheses. - -The two most frequently used properties are `shape` and `dtype`. - -`shape` tells us how many elements are in each array dimension. - -`dtype` tells us the types of an array's elements. - -Let's do some examples to see these properties in action. - -```{code-cell} python -x = np.array([[1, 2, 3], [4, 5, 6]]) -print(x.shape) -print(x.dtype) -``` - -We'll use this to practice unpacking a tuple, like `x.shape`, directly into variables. - -```{code-cell} python -rows, columns = x.shape -print(f"rows = {rows}, columns = {columns}") -``` - -```{code-cell} python -x = np.array([True, False, True]) -print(x.shape) -print(x.dtype) -``` - -Note that in the above, the `(3,)` represents a tuple of length 1, distinct from a scalar integer `3`. - -```{code-cell} python -x = np.array([ - [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], - [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]] -]) -print(x.shape) -print(x.dtype) -``` - -#### Creating Arrays - -It's usually impractical to define arrays by hand as we have done so far. - -We'll often need to create an array with default values and then fill it -with other values. - -We can create arrays with the functions `np.zeros` and `np.ones`. - -Both functions take a tuple that denotes the shape of an array and creates an -array filled with 0s or 1s respectively. 
- -```{code-cell} python -sizes = (2, 3, 4) -x = np.zeros(sizes) # note, a tuple! -x -``` - -```{code-cell} python -y = np.ones((4)) -y -``` - -#### Broadcasting Operations - -Two types of operations that will be useful for arrays of any dimension are: - -1. Operations between an array and a single number. -1. Operations between two arrays of the same shape. - -When we perform operations on an array by using a single number, we simply apply that operation to every element of the array. - -```{code-cell} python -# Using np.ones to create an array -x = np.ones((2, 2)) -print("x = ", x) -print("2 + x = ", 2 + x) -print("2 - x = ", 2 - x) -print("2 * x = ", 2 * x) -print("x / 2 = ", x / 2) -``` - - -````{admonition} Exercise -:name: dir3-1-4 - -See exercise 4 in the {ref}`exercise list `. -```` - -Operations between two arrays of the same size, in this case `(2, 2)`, simply apply the operation -element-wise between the arrays. - -```{code-cell} python -x = np.array([[1.0, 2.0], [3.0, 4.0]]) -y = np.ones((2, 2)) -print("x = ", x) -print("y = ", y) -print("x + y = ", x + y) -print("x - y", x - y) -print("(elementwise) x * y = ", x * y) -print("(elementwise) x / y = ", x / y) -``` - -### Universal Functions - -We will often need to transform data by applying a function to every element of an array. - -Numpy has good support for these operations, called *universal functions* or ufuncs for short. - -The -[numpy documentation](https://docs.scipy.org/doc/numpy/reference/ufuncs.html?highlight=ufunc#available-ufuncs) -has a list of all available ufuncs. - -```{note} -You should think of operations between a single number and an array, as we -just saw, as a ufunc. -``` - -Below, we will create an array that contains 10 points between 0 and 25. - -```{code-cell} python -# This is similar to range -- but spits out 50 evenly spaced points from 0.5 -# to 25. -x = np.linspace(0.5, 25, 10) -``` - -We will experiment with some ufuncs below: - -```{code-cell} python -# Applies the sin function to each element of x -np.sin(x) -``` - -Of course, we could do the same thing with a comprehension, but -the code would be both less readable and less efficient. - -```{code-cell} python -np.array([np.sin(xval) for xval in x]) -``` - -You can use the inspector or the docstrings with `np.` to see other available functions, such as - -```{code-cell} python -# Takes log of each element of x -np.log(x) -``` - -A benefit of using the numpy arrays is that numpy has succinct code for combining vectorized operations. - -```{code-cell} python -# Calculate log(z) * z elementwise -z = np.array([1,2,3]) -np.log(z) * z -``` - -````{admonition} Exercise -:name: dir3-1-5 - -See exercise 5 in the {ref}`exercise list `. -```` - -### Other Useful Array Operations - -We have barely scratched the surface of what is possible using numpy arrays. - -We hope you will experiment with other functions from numpy and see how they -work. - -Below, we demonstrate a few more array operations that we find most useful -- just to give you an idea -of what else you might find. - -When you're attempting to do an operation that you feel should be common, the numpy library probably has it. - -Use Google and tab completion to check this. - -```{code-cell} python -x = np.linspace(0, 25, 10) -``` - -```{code-cell} python -np.mean(x) -``` - -```{code-cell} python -np.std(x) -``` - -```{code-cell} python -# np.min, np.median, etc... 
are also defined -np.max(x) -``` - -```{code-cell} python -np.diff(x) -``` - -```{code-cell} python -np.reshape(x, (5, 2)) -``` - -Note that many of these operations can be called as methods on `x`: - -```{code-cell} python -print(x.mean()) -print(x.std()) -print(x.max()) -# print(x.diff()) # this one is not a method... -print(x.reshape((5, 2))) -``` - -Finally, `np.vectorize` can be conveniently used with numpy broadcasting and any functions. - -```{code-cell} python -np.random.seed(42) -x = np.random.rand(10) -print(x) - -def f(val): - if val < 0.3: - return "low" - else: - return "high" - -print(f(0.1)) # scalar, no problem -# f(x) # array, fails since f() is scalar -f_vec = np.vectorize(f) -print(f_vec(x)) -``` - -Caution: `np.vectorize` is convenient for numpy broadcasting with any function -but is not intended to be high performance. - -When speed matters, directly write a `f` function to work on arrays. - -(ex3-1)= -## Exercises - -### Exercise 1 - -Try indexing into another element of your choice from the -3-dimensional array. - -Building an understanding of indexing means working through this -type of operation several times -- without skipping steps! - -({ref}`back to text `) - -### Exercise 2 - -Look at the 2-dimensional array `x_2d`. - -Does the inner-most index correspond to rows or columns? What does the -outer-most index correspond to? - -Write your thoughts. - -({ref}`back to text `) - -### Exercise 3 - -What would you do to extract the array `[[5, 6], [50, 60]]`? - -({ref}`back to text `) - -### Exercise 4 - -Do you recall what multiplication by an integer did for lists? - -How does this differ? - -({ref}`back to text `) - -### Exercise 5 - -Let's revisit a bond pricing example we saw in {doc}`Control flow <../python_fundamentals/control_flow>`. - -Recall that the equation for pricing a bond with coupon payment $C$, -face value $M$, yield to maturity $i$, and periods to maturity -$N$ is - -$$ -\begin{align*} - P &= \left(\sum_{n=1}^N \frac{C}{(i+1)^n}\right) + \frac{M}{(1+i)^N} \\ - &= C \left(\frac{1 - (1+i)^{-N}}{i} \right) + M(1+i)^{-N} -\end{align*} -$$ - -In the code cell below, we have defined variables for `i`, `M` and `C`. - -You have two tasks: - -1. Define a numpy array `N` that contains all maturities between 1 and 10 - - ```{hint} - look at the `np.arange` function. - ``` - -1. Using the equation above, determine the bond prices of all maturity levels in your array. - -```{code-cell} python -i = 0.03 -M = 100 -C = 5 - -# Define array here - -# price bonds here -``` - +--- +jupytext: + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# Introduction to Numpy + +**Prerequisites** + +- {doc}`Python Fundamentals <../python_fundamentals/index>` + +**Outcomes** + +- Understand basics about numpy arrays +- Index into multi-dimensional arrays +- Use universal functions/broadcasting to do element-wise operations on arrays + + +## Numpy Arrays + +Now that we have learned the fundamentals of programming in Python, we will learn how we can use Python +to perform the computations required in data science and economics. We call these the "scientific Python tools". + +The foundational library that helps us perform these computations is known as `numpy` (numerical +Python). + +Numpy's core contribution is a new data-type called an *array*. + +An array is similar to a list, but numpy imposes some additional restrictions on how the data inside is organized. 
+ +These restrictions allow numpy to + +1. Be more efficient in performing mathematical and scientific computations. +1. Expose functions that allow numpy to do the necessary linear algebra for machine learning and statistics. + +Before we get started, please note that the convention for importing the numpy package is to use the +nickname `np`: + +```{code-cell} python +import numpy as np +``` + +### What is an Array? + +An array is a multi-dimensional grid of values. + +What does this mean? It is easier to demonstrate than to explain. + +In this block of code, we build a 1-dimensional array. + +```{code-cell} python +# create an array from a list +x_1d = np.array([1, 2, 3]) +print(x_1d) +``` + +You can think of a 1-dimensional array as a list of numbers. + +```{code-cell} python +# We can index like we did with lists +print(x_1d[0]) +print(x_1d[0:2]) +``` + +Note that the range of indices does not include the end-point, that +is + +```{code-cell} python +print(x_1d[0:3] == x_1d[:]) +print(x_1d[0:2]) +``` + +The differences emerge as we move into higher dimensions. + +Next, we define a 2-dimensional array (a matrix) + +```{code-cell} python +x_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) +print(x_2d) +``` + +Notice that the data is no longer represented as something flat, but rather, +as three rows and three columns of numbers. + +The first question that you might ask yourself is: "how do I access the values in this array?" + +You access each element by specifying a row first and then a column. For +example, if we wanted to access the `6`, we would ask for the (1, 2) element. + +```{code-cell} python +print(x_2d[1, 2]) # Indexing into two dimensions! +``` + +Or to get the top left corner... + +```{code-cell} python +print(x_2d[0, 0]) # Indexing into two dimensions! +``` + +To get the first, and then second rows... + +```{code-cell} python +print(x_2d[0, :]) +print(x_2d[1, :]) +``` + +Or the columns... + +```{code-cell} python +print(x_2d[:, 0]) +print(x_2d[:, 1]) +``` + +This continues to generalize, since numpy gives us as many dimensions as we want in an array. + +For example, we build a 3-dimensional array below. + +```{code-cell} python +x_3d_list = [[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]] +x_3d = np.array(x_3d_list) +print(x_3d) +``` + +#### Array Indexing + +Now that there are multiple dimensions, indexing might feel somewhat non-obvious. + +Do the rows or columns come first? In higher dimensions, what is the order of +the index? + +Notice that the array is built using a list of lists (you could also use tuples!). + +Indexing into the array will correspond to choosing elements from each list. + +First, notice that the dimensions give two stacked matrices, which we can access with + +```{code-cell} python +print(x_3d[0]) +print(x_3d[1]) +``` + +In the case of the first, it is synonymous with + +```{code-cell} python +print(x_3d[0, :, :]) +``` + +Let's work through another example to further clarify this concept with our +3-dimensional array. + +Our goal will be to find the index that retrieves the `4` out of `x_3d`. + +Recall that when we created `x_3d`, we used the list `[[[1, 2, 3], [4, 5, 6]], [[10, 20, 30], [40, 50, 60]]]`. + +Notice that the 0 element of that list is `[[1, 2, 3], [4, 5, 6]]`. This is the +list that contains the `4` so the first index we would use is a 0. 
+ +```{code-cell} python +print(f"The 0 element is {x_3d_list[0]}") +print(f"The 1 element is {x_3d_list[1]}") +``` + +We then move to the next lists which were the 0 element of the inner-most dimension. Notice that +the two lists at this level `[1, 2, 3]` and `[3, 4, 5]`. + +The 4 is in the second 1 element (index `1`), so the second index we would choose is 1. + +```{code-cell} python +print(f"The 0 element of the 0 element is {x_3d_list[0][0]}") +print(f"The 1 element of the 0 element is {x_3d_list[0][1]}") +``` + +Finally, we move to the outer-most dimension, which has a list of numbers +`[4, 5, 6]`. + +The 4 is element 0 of this list, so the third, or outer-most index, would be `0`. + +```{code-cell} python +print(f"The 0 element of the 1 element of the 0 element is {x_3d_list[0][1][0]}") +``` + +Now we can use these same indices to index into the array. With an array, we can index using a single operation rather than repeated indexing as we did with the list `x_3d_list[0][1][0]`. + +Let's test it to see whether we did it correctly! + +```{code-cell} python +print(x_3d[0, 1, 0]) +``` + +Success! + +````{admonition} Exercise +:name: dir3-1-1 + +See exercise 1 in the {ref}`exercise list `. +```` + +````{admonition} Exercise +:name: dir3-1-2 + +See exercise 2 in the {ref}`exercise list `. +```` + +We can also select multiple elements at a time -- this is called slicing. + +If we wanted to have an array with just `[1, 2, 3]` then we would do + +```{code-cell} python +print(x_3d[0, 0, :]) +``` + +Notice that we put a `:` on the dimension where we want to select all of the elements. We can also +slice out subsets of the elements by doing `start:stop+1`. + +Notice how the following arrays differ. + +```{code-cell} python +print(x_3d[:, 0, :]) +print(x_3d[:, 0, 0:2]) +print(x_3d[:, 0, :2]) # the 0 in 0:2 is optional +``` + +````{admonition} Exercise +:name: dir3-1-3 + +See exercise 3 in the {ref}`exercise list `. +```` + + +### Array Functionality + +#### Array Properties + +All numpy arrays have various useful properties. + +Properties are similar to methods in that they're accessed through +the "dot notation." However, they aren't a function so we don't need parentheses. + +The two most frequently used properties are `shape` and `dtype`. + +`shape` tells us how many elements are in each array dimension. + +`dtype` tells us the types of an array's elements. + +Let's do some examples to see these properties in action. + +```{code-cell} python +x = np.array([[1, 2, 3], [4, 5, 6]]) +print(x.shape) +print(x.dtype) +``` + +We'll use this to practice unpacking a tuple, like `x.shape`, directly into variables. + +```{code-cell} python +rows, columns = x.shape +print(f"rows = {rows}, columns = {columns}") +``` + +```{code-cell} python +x = np.array([True, False, True]) +print(x.shape) +print(x.dtype) +``` + +Note that in the above, the `(3,)` represents a tuple of length 1, distinct from a scalar integer `3`. + +```{code-cell} python +x = np.array([ + [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], + [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]] +]) +print(x.shape) +print(x.dtype) +``` + +#### Creating Arrays + +It's usually impractical to define arrays by hand as we have done so far. + +We'll often need to create an array with default values and then fill it +with other values. + +We can create arrays with the functions `np.zeros` and `np.ones`. + +Both functions take a tuple that denotes the shape of an array and creates an +array filled with 0s or 1s respectively. 
+ +```{code-cell} python +sizes = (2, 3, 4) +x = np.zeros(sizes) # note, a tuple! +x +``` + +```{code-cell} python +y = np.ones((4)) +y +``` + +#### Broadcasting Operations + +Two types of operations that will be useful for arrays of any dimension are: + +1. Operations between an array and a single number. +1. Operations between two arrays of the same shape. + +When we perform operations on an array by using a single number, we simply apply that operation to every element of the array. + +```{code-cell} python +# Using np.ones to create an array +x = np.ones((2, 2)) +print("x = ", x) +print("2 + x = ", 2 + x) +print("2 - x = ", 2 - x) +print("2 * x = ", 2 * x) +print("x / 2 = ", x / 2) +``` + + +````{admonition} Exercise +:name: dir3-1-4 + +See exercise 4 in the {ref}`exercise list `. +```` + +Operations between two arrays of the same size, in this case `(2, 2)`, simply apply the operation +element-wise between the arrays. + +```{code-cell} python +x = np.array([[1.0, 2.0], [3.0, 4.0]]) +y = np.ones((2, 2)) +print("x = ", x) +print("y = ", y) +print("x + y = ", x + y) +print("x - y", x - y) +print("(elementwise) x * y = ", x * y) +print("(elementwise) x / y = ", x / y) +``` + +### Universal Functions + +We will often need to transform data by applying a function to every element of an array. + +Numpy has good support for these operations, called *universal functions* or ufuncs for short. + +The +[numpy documentation](https://docs.scipy.org/doc/numpy/reference/ufuncs.html?highlight=ufunc#available-ufuncs) +has a list of all available ufuncs. + +```{note} +You should think of operations between a single number and an array, as we +just saw, as a ufunc. +``` + +Below, we will create an array that contains 10 points between 0 and 25. + +```{code-cell} python +# This is similar to range -- but spits out 50 evenly spaced points from 0.5 +# to 25. +x = np.linspace(0.5, 25, 10) +``` + +We will experiment with some ufuncs below: + +```{code-cell} python +# Applies the sin function to each element of x +np.sin(x) +``` + +Of course, we could do the same thing with a comprehension, but +the code would be both less readable and less efficient. + +```{code-cell} python +np.array([np.sin(xval) for xval in x]) +``` + +You can use the inspector or the docstrings with `np.` to see other available functions, such as + +```{code-cell} python +# Takes log of each element of x +np.log(x) +``` + +A benefit of using the numpy arrays is that numpy has succinct code for combining vectorized operations. + +```{code-cell} python +# Calculate log(z) * z elementwise +z = np.array([1,2,3]) +np.log(z) * z +``` + +````{admonition} Exercise +:name: dir3-1-5 + +See exercise 5 in the {ref}`exercise list `. +```` + +### Other Useful Array Operations + +We have barely scratched the surface of what is possible using numpy arrays. + +We hope you will experiment with other functions from numpy and see how they +work. + +Below, we demonstrate a few more array operations that we find most useful -- just to give you an idea +of what else you might find. + +When you're attempting to do an operation that you feel should be common, the numpy library probably has it. + +Use Google and tab completion to check this. + +```{code-cell} python +x = np.linspace(0, 25, 10) +``` + +```{code-cell} python +np.mean(x) +``` + +```{code-cell} python +np.std(x) +``` + +```{code-cell} python +# np.min, np.median, etc... 
are also defined +np.max(x) +``` + +```{code-cell} python +np.diff(x) +``` + +```{code-cell} python +np.reshape(x, (5, 2)) +``` + +Note that many of these operations can be called as methods on `x`: + +```{code-cell} python +print(x.mean()) +print(x.std()) +print(x.max()) +# print(x.diff()) # this one is not a method... +print(x.reshape((5, 2))) +``` + +Finally, `np.vectorize` can be conveniently used with numpy broadcasting and any functions. + +```{code-cell} python +np.random.seed(42) +x = np.random.rand(10) +print(x) + +def f(val): + if val < 0.3: + return "low" + else: + return "high" + +print(f(0.1)) # scalar, no problem +# f(x) # array, fails since f() is scalar +f_vec = np.vectorize(f) +print(f_vec(x)) +``` + +Caution: `np.vectorize` is convenient for numpy broadcasting with any function +but is not intended to be high performance. + +When speed matters, directly write a `f` function to work on arrays. + +(ex3-1)= +## Exercises + +### Exercise 1 + +Try indexing into another element of your choice from the +3-dimensional array. + +Building an understanding of indexing means working through this +type of operation several times -- without skipping steps! + +({ref}`back to text `) + +### Exercise 2 + +Look at the 2-dimensional array `x_2d`. + +Does the inner-most index correspond to rows or columns? What does the +outer-most index correspond to? + +Write your thoughts. + +({ref}`back to text `) + +### Exercise 3 + +What would you do to extract the array `[[5, 6], [50, 60]]`? + +({ref}`back to text `) + +### Exercise 4 + +Do you recall what multiplication by an integer did for lists? + +How does this differ? + +({ref}`back to text `) + +### Exercise 5 + +Let's revisit a bond pricing example we saw in {doc}`Control flow <../python_fundamentals/control_flow>`. + +Recall that the equation for pricing a bond with coupon payment $C$, +face value $M$, yield to maturity $i$, and periods to maturity +$N$ is + +$$ +\begin{align*} + P &= \left(\sum_{n=1}^N \frac{C}{(i+1)^n}\right) + \frac{M}{(1+i)^N} \\ + &= C \left(\frac{1 - (1+i)^{-N}}{i} \right) + M(1+i)^{-N} +\end{align*} +$$ + +In the code cell below, we have defined variables for `i`, `M` and `C`. + +You have two tasks: + +1. Define a numpy array `N` that contains all maturities between 1 and 10 + + ```{hint} + look at the `np.arange` function. + ``` + +1. Using the equation above, determine the bond prices of all maturity levels in your array. + +```{code-cell} python +i = 0.03 +M = 100 +C = 5 + +# Define array here + +# price bonds here +``` + ({ref}`back to text `) \ No newline at end of file diff --git a/lectures/scientific/optimization.md b/lectures/scientific/optimization.md index 47a73ff7..18f1016d 100644 --- a/lectures/scientific/optimization.md +++ b/lectures/scientific/optimization.md @@ -1,464 +1,464 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# Optimization - -**Prerequisites** - -- {doc}`Introduction to Numpy ` -- {doc}`Applied Linear Algebra ` - -**Outcomes** - -- Perform optimization by hand using derivatives -- Understand ideas from gradient descent - - -```{literalinclude} ../_static/colab_light.raw -``` - -```{code-cell} python -# imports for later -import numpy as np -import matplotlib.pyplot as plt -%matplotlib inline -``` - -## What is Optimization? - -Optimization is the branch of mathematics focused on finding extreme values (max or min) of -functions. 
- -Optimization tools will appear in many places throughout this course, including: - -- Building economic models in which individuals make decisions that maximize their utility. -- Building statistical models and maximizing the fit of these models by optimizing certain fit - functions. - -In this lecture, we will focus mostly on the first to limit the moving pieces, but in other lectures, we'll discuss the second in detail. - -### Derivatives and Optima - -Here, we revisit some of the theory that you have already learned in your calculus class. - -Consider function $f(x)$ which maps a number into another number. We can say that any point -where $f'(x) = 0$ is a local extremum of $f$. - -Let's work through an example. Consider the function - -$$ -f(x) = x^4 - 3 x^2 -$$ - -Its derivative is given by - -$$ -\frac{\partial f}{\partial x} = 4 x^3 - 6 x -$$ - -Let's plot the function and its derivative to pick out the local extremum by hand. - -```{code-cell} python -def f(x): - return x**4 - 3*x**2 - - -def fp(x): - return 4*x**3 - 6*x - -# Create 100 evenly spaced points between -2 and 2 -x = np.linspace(-2., 2., 100) - -# Evaluate the functions at x values -fx = f(x) -fpx = fp(x) - -# Create plot -fig, ax = plt.subplots(1, 2) - -ax[0].plot(x, fx) -ax[0].set_title("Function") - -ax[1].plot(x, fpx) -ax[1].hlines(0.0, -2.5, 2.5, color="k", linestyle="--") -ax[1].set_title("Derivative") - -for _ax in ax: - _ax.spines["right"].set_visible(False) - _ax.spines["top"].set_visible(False) -``` - -If you stare at this picture, you can probably determine the the local maximum is at -$x = 0$ and the local minima at $x \approx -1$ and $x \approx 1$. - -To properly determine the minima and maxima, we find the solutions to $f'(x) = 0$ below: - -$$ -f'(x) = 4 x^3 - 6 x = 0 -$$ - -$$ -\rightarrow x = \left\{0, \frac{\sqrt{6}}{2}, \frac{-\sqrt{6}}{2} \right\} -$$ - -Let's check whether we can get the same answers with Python! To do this, we import a new -package that we haven't seen yet. - -```{code-cell} python -import scipy.optimize as opt -``` - -Then using the function definitions from earlier, we search for the minimum and maximum values. - -```{code-cell} python -# For a scalar problem, we give it the function and the bounds between -# which we want to search -neg_min = opt.minimize_scalar(f, [-2, -0.5]) -pos_min = opt.minimize_scalar(f, [0.5, 2.0]) -print("The negative minimum is: \n", neg_min) -print("The positive minimum is: \n", pos_min) -``` - -The scipy optimize package only has functions that find minimums... You might be wondering, then, how we -will verify our maximum value. - -It turns out that finding the maximum is equivalent to simply finding the minimum of the negative function. - -```{code-cell} python -# Create a function that evaluates to negative f -def neg_f(x): - return -f(x) - -max_out = opt.minimize_scalar(neg_f, [-0.35, 0.35]) -print("The maximum is: \n", max_out) -``` - -We won't dive into the details of optimization algorithms in this lecture, but we'll impart some brief -intuition to help you understand the types of problems these algorithms are good at solving and -the types of problems they will struggle with: - -The general intuition is that when you're finding a maximum, an algorithm takes a step -in the direction of the derivative... (Conversely, to find a minimum, the algorithm takes a step opposite the direction of the derivative.) -This requires the function to be relatively smooth and continuous. 
The algorithm also has an easier time if there is only one (or very few) extremum to be found... - -For minimization, you can imagine the algorithm as a marble in a bowl. - -The marble will keep rolling down the slope of the bowl until it finds the bottom. - -It may overshoot, but once it hits the slope on the other side, it will continue to roll back -and forth until it comes to rest. - -Thus, when deciding whether numerical optimization is an effective method for a -particular problem, you could try visualizing the function to determine whether a marble -would be able to come to rest at the extreme values you are looking for. - -### Application: Consumer Theory - -A common use of maximization in economics is to model -optimal consumption decisions . - -#### Preferences and Utility Functions - -To summarize introductory economics, take a set of -[preferences](https://en.wikipedia.org/wiki/Preference_%28economics%29) of consumers over "bundles" -of goods (e.g. 2 apples and 3 oranges is preferred to 3 apples and 2 oranges, or a 100% chance to -win $1$ dollar is preferred to a 50% chance to win $2.10$ dollars). - -Under certain assumptions, you rationalize the preferences as a utility function over the different -goods (always remembering that the utility is simply a tool to order preferences and the numbers are -usually not meaningful themselves). - -For example, consider a utility function over bundles of bananas (B) and apples (A) - -$$ -U(B, A) = B^{\alpha}A^{1-\alpha} -$$ - -Where $\alpha \in [0,1]$. - -First, let's take a look at this particular utility function. - -```{code-cell} python -def U(A, B, alpha=1/3): - return B**alpha * A**(1-alpha) - -fig, ax = plt.subplots() -B = 1.5 -A = np.linspace(1, 10, 100) -ax.plot(A, U(A, B)) -ax.set_xlabel("A") -ax.set_ylabel("U(B=1.5, A)") -``` - -We note that - -- $U(B,1)$ is always higher with more B, hence, consuming more bananas has a -: positive marginal utility i.e. $\frac{d U(B,1)}{d B} > 0$. -- The more bananas we consume, the smaller the change in marginal utility, i.e. - $\frac{d^2 U(B,1)}{d B^2} < 0$. - -If we plot both the $B$ and the $A$, we can see how the utility changes with different -bundles. - -```{code-cell} python -fig, ax = plt.subplots() -B = np.linspace(1, 20, 100).reshape((100, 1)) -contours = ax.contourf(A, B.flatten(), U(A, B)) -fig.colorbar(contours) -ax.set_xlabel("A") -ax.set_ylabel("B") -ax.set_title("U(A,B)") -``` - -We can find the bundles between which the consumer would be indifferent by fixing a -utility $\bar{U}$ and by determining all combinations of $A$ and $B$ where -$\bar{U} = U(B, A)$. - -In this example, we can implement this calculation by letting $B$ be the variable on the -x-axis and solving for $A(\bar{U}, B)$ - -$$ -A(B, \bar{U}) = U^{\frac{1}{1-\alpha}}B^{\frac{-\alpha}{1-\alpha}} -$$ - -```{code-cell} python -def A_indifference(B, ubar, alpha=1/3): - return ubar**(1/(1-alpha)) * B**(-alpha/(1-alpha)) - -def plot_indifference_curves(ax, alpha=1/3): - ubar = np.arange(1, 11, 2) - ax.plot(B, A_indifference(B, ubar, alpha)) - ax.legend([r"$\bar{U}$" + " = {}".format(i) for i in ubar]) - ax.set_xlabel("B") - ax.set_ylabel(r"$A(B, \bar{U}$)") - -fig, ax = plt.subplots() -plot_indifference_curves(ax) -``` - -Note that in every case, if you increase either the number of apples or bananas (holding the other -fixed), you reach a higher indifference curve. - -Consequently, in a world without scarcity or budgets, consumers would consume -an arbitrarily high number of both to maximize their utility. 
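-
-Before turning to budgets, we can back up that claim numerically. The cell below is a small sketch (it reuses the `U` function defined earlier and is not part of the original example): it verifies that utility rises in each good when the other is held fixed, and that the marginal gains shrink as consumption grows.
-
-```{code-cell} python
-# Sketch: utility is increasing in each good, with diminishing marginal gains
-grid = np.linspace(1, 20, 200)
-
-u_in_A = U(grid, 1.5)    # vary A, hold B = 1.5
-u_in_B = U(1.5, grid)    # vary B, hold A = 1.5
-
-for u in (u_in_A, u_in_B):
-    marginal = np.diff(u)
-    print("increasing:", (marginal > 0).all(),
-          "| diminishing:", (np.diff(marginal) < 0).all())
-```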
- -#### Budget Constraints - -While the above example plots consumer preferences, it says nothing about what the consumers can afford. - -The simplest sort of constraint is a budget constraint where bananas and apples both have a price -and the consumer has a limited amount of funds. - -If the prices per banana and per apple are identical, no matter how many you consume, then the -affordable bundles are simply all pairs of apples and bananas below the line. -$p_a A + p_b B \leq W$. - -For example, if consumer has a budget of $W$, the price of apples is $p_A = 2$ dollars per -apple, and the price of bananas is normalized to be $p_B = 1$ dollar per banana, then the consumer -can afford anything below the line. - -$$ -2 A + B \leq W -$$ - -Or, letting $W = 20$ and plotting - -```{code-cell} python -def A_bc(B, W=20, pa=2): - "Given B, W, and pa return the max amount of A our consumer can afford" - return (W - B) / pa - -def plot_budget_constraint(ax, W=20, pa=2): - B_bc = np.array([0, W]) - A = A_bc(B_bc, W, pa) - ax.plot(B_bc, A) - ax.fill_between(B_bc, 0, A, alpha=0.2) - ax.set_xlabel("B") - ax.set_ylabel("A") - return ax - -fig, ax = plt.subplots() -plot_budget_constraint(ax, 20, 2) -``` - -While the consumer can afford any of the bundles in that area, most will not be optimal. - -#### Optimal Choice - -Putting the budget constraints and the utility functions together lets us visualize the optimal -decision of a consumer. Choose the bundle with the highest possible indifference curve within its -budget set. - -```{code-cell} python -fig, ax = plt.subplots() -plot_indifference_curves(ax) -plot_budget_constraint(ax) -``` - -We have several ways to find the particular point $A, B$ of maximum utility, such as -finding the point where the indifference curve and the budget constraint have the same slope, but a -simple approach is to just solve the direct maximization problem. - -$$ -\begin{aligned} -\max_{A, B} & B^{\alpha}A^{1-\alpha}\\ -\text{s.t. } & p_A A + B \leq W -\end{aligned} -$$ - -Solving this problem directly requires solving a multi-dimensional constrained optimization problem, -where scipy -has several options. - -For this particular problem, we notice two things: (1) The utility function is increasing in both -$A$ and $B$, and (2) there are only 2 goods. - -This allows us 1) to assume that the budget constraint holds at equality, $p_a A + B = W$, 2) to -form a new function $A(B) = (W - B) / p_a$ by rearranging the budget constraint at equality, and -3) to substitute that function directly to form: - -$$ -\max_{B} B^{\alpha}A(B)^{1-\alpha} -$$ - -Compared to before, this problem has been turned into an unconstrained univariate optimization -problem. - -To implement this in code, notice that the $A(B)$ function is what we defined before -as `A_bc`. - -We will solve this by using the function `scipy.optimize.minimize_scalar`, which takes a function -`f(x)` and returns the value of `x` that minimizes `f`. 
- -```{code-cell} python -from scipy.optimize import minimize_scalar - -def objective(B, W=20, pa=2): - """ - Return value of -U for a given B, when we consume as much A as possible - - Note that we return -U because scipy wants to minimize functions, - and the value of B that minimizes -U will maximize U - """ - A = A_bc(B, W, pa) - return -U(A, B) - -result = minimize_scalar(objective) -optimal_B = result.x -optimal_A = A_bc(optimal_B, 20, 2) -optimal_U = U(optimal_A, optimal_B) - -print("The optimal U is ", optimal_U) -print("and was found at (A,B) =", (optimal_A, optimal_B)) -``` - -This allows us to do experiments, such as examining how consumption patterns change as prices or -wealth levels change. - -```{code-cell} python -# Create various prices -n_pa = 50 -prices_A = np.linspace(0.5, 5.0, n_pa) -W = 20 - -# Create lists to store the results of the optimal A and B calculation -optimal_As = [] -optimal_Bs = [] -for pa in prices_A: - result = minimize_scalar(objective, args=(W, pa)) - opt_B_val = result.x - - optimal_Bs.append(opt_B_val) - optimal_As.append(A_bc(opt_B_val, W, pa)) - -fig, ax = plt.subplots() - -ax.plot(prices_A, optimal_As, label="Purchased Apples") -ax.plot(prices_A, optimal_Bs, label="Purchased Bananas") -ax.set_xlabel("Price of Apples") -ax.legend() -``` - -````{admonition} Exercise -:name: dir3-5-1 - -See exercise 1 in the {ref}`exercise list `. -```` - -#### Satiation Point - -The above example is a particular utility function where consumers prefer to "eat" as much as -possible of every good available, but that may not be the case for all preferences. - -When an optimum exists for the unconstrained problem (e.g. with an infinite budget), it is called a -bliss point, or satiation. - -Instead of bananas and apples, consider a utility function for potato chips (`P`) and chocolate -bars (`C`). - -$$ -U(P, C) = -(P - 20)^2 - 2 * (C - 1)^2 -$$ - -To numerically calculate the maximum (which you can probably see through inspection), one must directly solve the constrained maximization problem. - - -````{admonition} Exercise -:name: dir3-5-2 - -See exercise 2 in the {ref}`exercise list `. -```` - -(ex3-5)= -## Exercises - -### Exercise 1 - -Try solving the constrained maximization problem by hand via the Lagrangian method. - -Is it surprising that the demand for bananas is unaffected by the change in apple prices? - -Why might this be? - -({ref}`back to text `) - -### Exercise 2 - -Using a similar approach to that of the apples/bananas example above, solve for the optimal -basket of potato chips and chocolate bars when `W = 10`, `p_P = 1`, and `p_C = 2`. - -```{code-cell} python -W = 10 -p_P = 1 -p_C = 2 - -# Your code here -``` - -What is the optimal basket if we expand the budget constraint to have `W = 50`? - -```{code-cell} python -# Your code here -``` - -What is the optimal basket if we expand the budget constraint to have `W = 150`? - -```{code-cell} python -# Your code here -``` - -```{hint} -You can no longer assume that the `A_bc` function is always binding, as we did before, and will need to check results more carefully. - -While not required, you can take this opportunity to play around with other scipy functions such as Scipy optimize . 
-``` - -({ref}`back to text `) +--- +jupytext: + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# Optimization + +**Prerequisites** + +- {doc}`Introduction to Numpy ` +- {doc}`Applied Linear Algebra ` + +**Outcomes** + +- Perform optimization by hand using derivatives +- Understand ideas from gradient descent + + +```{literalinclude} ../_static/colab_light.raw +``` + +```{code-cell} python +# imports for later +import numpy as np +import matplotlib.pyplot as plt +%matplotlib inline +``` + +## What is Optimization? + +Optimization is the branch of mathematics focused on finding extreme values (max or min) of +functions. + +Optimization tools will appear in many places throughout this course, including: + +- Building economic models in which individuals make decisions that maximize their utility. +- Building statistical models and maximizing the fit of these models by optimizing certain fit + functions. + +In this lecture, we will focus mostly on the first to limit the moving pieces, but in other lectures, we'll discuss the second in detail. + +### Derivatives and Optima + +Here, we revisit some of the theory that you have already learned in your calculus class. + +Consider function $f(x)$ which maps a number into another number. We can say that any point +where $f'(x) = 0$ is a local extremum of $f$. + +Let's work through an example. Consider the function + +$$ +f(x) = x^4 - 3 x^2 +$$ + +Its derivative is given by + +$$ +\frac{\partial f}{\partial x} = 4 x^3 - 6 x +$$ + +Let's plot the function and its derivative to pick out the local extremum by hand. + +```{code-cell} python +def f(x): + return x**4 - 3*x**2 + + +def fp(x): + return 4*x**3 - 6*x + +# Create 100 evenly spaced points between -2 and 2 +x = np.linspace(-2., 2., 100) + +# Evaluate the functions at x values +fx = f(x) +fpx = fp(x) + +# Create plot +fig, ax = plt.subplots(1, 2) + +ax[0].plot(x, fx) +ax[0].set_title("Function") + +ax[1].plot(x, fpx) +ax[1].hlines(0.0, -2.5, 2.5, color="k", linestyle="--") +ax[1].set_title("Derivative") + +for _ax in ax: + _ax.spines["right"].set_visible(False) + _ax.spines["top"].set_visible(False) +``` + +If you stare at this picture, you can probably determine the the local maximum is at +$x = 0$ and the local minima at $x \approx -1$ and $x \approx 1$. + +To properly determine the minima and maxima, we find the solutions to $f'(x) = 0$ below: + +$$ +f'(x) = 4 x^3 - 6 x = 0 +$$ + +$$ +\rightarrow x = \left\{0, \frac{\sqrt{6}}{2}, \frac{-\sqrt{6}}{2} \right\} +$$ + +Let's check whether we can get the same answers with Python! To do this, we import a new +package that we haven't seen yet. + +```{code-cell} python +import scipy.optimize as opt +``` + +Then using the function definitions from earlier, we search for the minimum and maximum values. + +```{code-cell} python +# For a scalar problem, we give it the function and the bounds between +# which we want to search +neg_min = opt.minimize_scalar(f, [-2, -0.5]) +pos_min = opt.minimize_scalar(f, [0.5, 2.0]) +print("The negative minimum is: \n", neg_min) +print("The positive minimum is: \n", pos_min) +``` + +The scipy optimize package only has functions that find minimums... You might be wondering, then, how we +will verify our maximum value. + +It turns out that finding the maximum is equivalent to simply finding the minimum of the negative function. 
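+
+Before doing that, it is worth a quick sanity check that the numerical minimizers above agree with the analytical roots we derived, $\pm \frac{\sqrt{6}}{2} \approx \pm 1.22$. This is a minimal sketch that assumes the `neg_min` and `pos_min` results from the previous cell are still in scope.
+
+```{code-cell} python
+# Compare the numerical minimizers with the analytical critical points
+analytical_root = np.sqrt(6) / 2
+print(np.isclose(abs(neg_min.x), analytical_root, atol=1e-4))
+print(np.isclose(abs(pos_min.x), analytical_root, atol=1e-4))
+```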
+ +```{code-cell} python +# Create a function that evaluates to negative f +def neg_f(x): + return -f(x) + +max_out = opt.minimize_scalar(neg_f, [-0.35, 0.35]) +print("The maximum is: \n", max_out) +``` + +We won't dive into the details of optimization algorithms in this lecture, but we'll impart some brief +intuition to help you understand the types of problems these algorithms are good at solving and +the types of problems they will struggle with: + +The general intuition is that when you're finding a maximum, an algorithm takes a step +in the direction of the derivative... (Conversely, to find a minimum, the algorithm takes a step opposite the direction of the derivative.) +This requires the function to be relatively smooth and continuous. The algorithm also has an easier time if there is only one (or very few) extremum to be found... + +For minimization, you can imagine the algorithm as a marble in a bowl. + +The marble will keep rolling down the slope of the bowl until it finds the bottom. + +It may overshoot, but once it hits the slope on the other side, it will continue to roll back +and forth until it comes to rest. + +Thus, when deciding whether numerical optimization is an effective method for a +particular problem, you could try visualizing the function to determine whether a marble +would be able to come to rest at the extreme values you are looking for. + +### Application: Consumer Theory + +A common use of maximization in economics is to model +optimal consumption decisions . + +#### Preferences and Utility Functions + +To summarize introductory economics, take a set of +[preferences](https://en.wikipedia.org/wiki/Preference_%28economics%29) of consumers over "bundles" +of goods (e.g. 2 apples and 3 oranges is preferred to 3 apples and 2 oranges, or a 100% chance to +win $1$ dollar is preferred to a 50% chance to win $2.10$ dollars). + +Under certain assumptions, you rationalize the preferences as a utility function over the different +goods (always remembering that the utility is simply a tool to order preferences and the numbers are +usually not meaningful themselves). + +For example, consider a utility function over bundles of bananas (B) and apples (A) + +$$ +U(B, A) = B^{\alpha}A^{1-\alpha} +$$ + +Where $\alpha \in [0,1]$. + +First, let's take a look at this particular utility function. + +```{code-cell} python +def U(A, B, alpha=1/3): + return B**alpha * A**(1-alpha) + +fig, ax = plt.subplots() +B = 1.5 +A = np.linspace(1, 10, 100) +ax.plot(A, U(A, B)) +ax.set_xlabel("A") +ax.set_ylabel("U(B=1.5, A)") +``` + +We note that + +- $U(B,1)$ is always higher with more B, hence, consuming more bananas has a +: positive marginal utility i.e. $\frac{d U(B,1)}{d B} > 0$. +- The more bananas we consume, the smaller the change in marginal utility, i.e. + $\frac{d^2 U(B,1)}{d B^2} < 0$. + +If we plot both the $B$ and the $A$, we can see how the utility changes with different +bundles. + +```{code-cell} python +fig, ax = plt.subplots() +B = np.linspace(1, 20, 100).reshape((100, 1)) +contours = ax.contourf(A, B.flatten(), U(A, B)) +fig.colorbar(contours) +ax.set_xlabel("A") +ax.set_ylabel("B") +ax.set_title("U(A,B)") +``` + +We can find the bundles between which the consumer would be indifferent by fixing a +utility $\bar{U}$ and by determining all combinations of $A$ and $B$ where +$\bar{U} = U(B, A)$. 
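+
+The expression used below follows from rearranging $\bar{U} = B^{\alpha}A^{1-\alpha}$:
+
+$$
+B^{\alpha}A^{1-\alpha} = \bar{U}
+\quad \Rightarrow \quad A^{1-\alpha} = \bar{U} B^{-\alpha}
+\quad \Rightarrow \quad A = \bar{U}^{\frac{1}{1-\alpha}} B^{\frac{-\alpha}{1-\alpha}}
+$$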
+ +In this example, we can implement this calculation by letting $B$ be the variable on the +x-axis and solving for $A(\bar{U}, B)$ + +$$ +A(B, \bar{U}) = U^{\frac{1}{1-\alpha}}B^{\frac{-\alpha}{1-\alpha}} +$$ + +```{code-cell} python +def A_indifference(B, ubar, alpha=1/3): + return ubar**(1/(1-alpha)) * B**(-alpha/(1-alpha)) + +def plot_indifference_curves(ax, alpha=1/3): + ubar = np.arange(1, 11, 2) + ax.plot(B, A_indifference(B, ubar, alpha)) + ax.legend([r"$\bar{U}$" + " = {}".format(i) for i in ubar]) + ax.set_xlabel("B") + ax.set_ylabel(r"$A(B, \bar{U}$)") + +fig, ax = plt.subplots() +plot_indifference_curves(ax) +``` + +Note that in every case, if you increase either the number of apples or bananas (holding the other +fixed), you reach a higher indifference curve. + +Consequently, in a world without scarcity or budgets, consumers would consume +an arbitrarily high number of both to maximize their utility. + +#### Budget Constraints + +While the above example plots consumer preferences, it says nothing about what the consumers can afford. + +The simplest sort of constraint is a budget constraint where bananas and apples both have a price +and the consumer has a limited amount of funds. + +If the prices per banana and per apple are identical, no matter how many you consume, then the +affordable bundles are simply all pairs of apples and bananas below the line. +$p_a A + p_b B \leq W$. + +For example, if consumer has a budget of $W$, the price of apples is $p_A = 2$ dollars per +apple, and the price of bananas is normalized to be $p_B = 1$ dollar per banana, then the consumer +can afford anything below the line. + +$$ +2 A + B \leq W +$$ + +Or, letting $W = 20$ and plotting + +```{code-cell} python +def A_bc(B, W=20, pa=2): + "Given B, W, and pa return the max amount of A our consumer can afford" + return (W - B) / pa + +def plot_budget_constraint(ax, W=20, pa=2): + B_bc = np.array([0, W]) + A = A_bc(B_bc, W, pa) + ax.plot(B_bc, A) + ax.fill_between(B_bc, 0, A, alpha=0.2) + ax.set_xlabel("B") + ax.set_ylabel("A") + return ax + +fig, ax = plt.subplots() +plot_budget_constraint(ax, 20, 2) +``` + +While the consumer can afford any of the bundles in that area, most will not be optimal. + +#### Optimal Choice + +Putting the budget constraints and the utility functions together lets us visualize the optimal +decision of a consumer. Choose the bundle with the highest possible indifference curve within its +budget set. + +```{code-cell} python +fig, ax = plt.subplots() +plot_indifference_curves(ax) +plot_budget_constraint(ax) +``` + +We have several ways to find the particular point $A, B$ of maximum utility, such as +finding the point where the indifference curve and the budget constraint have the same slope, but a +simple approach is to just solve the direct maximization problem. + +$$ +\begin{aligned} +\max_{A, B} & B^{\alpha}A^{1-\alpha}\\ +\text{s.t. } & p_A A + B \leq W +\end{aligned} +$$ + +Solving this problem directly requires solving a multi-dimensional constrained optimization problem, +where scipy +has several options. + +For this particular problem, we notice two things: (1) The utility function is increasing in both +$A$ and $B$, and (2) there are only 2 goods. 
+ +This allows us 1) to assume that the budget constraint holds at equality, $p_a A + B = W$, 2) to +form a new function $A(B) = (W - B) / p_a$ by rearranging the budget constraint at equality, and +3) to substitute that function directly to form: + +$$ +\max_{B} B^{\alpha}A(B)^{1-\alpha} +$$ + +Compared to before, this problem has been turned into an unconstrained univariate optimization +problem. + +To implement this in code, notice that the $A(B)$ function is what we defined before +as `A_bc`. + +We will solve this by using the function `scipy.optimize.minimize_scalar`, which takes a function +`f(x)` and returns the value of `x` that minimizes `f`. + +```{code-cell} python +from scipy.optimize import minimize_scalar + +def objective(B, W=20, pa=2): + """ + Return value of -U for a given B, when we consume as much A as possible + + Note that we return -U because scipy wants to minimize functions, + and the value of B that minimizes -U will maximize U + """ + A = A_bc(B, W, pa) + return -U(A, B) + +result = minimize_scalar(objective) +optimal_B = result.x +optimal_A = A_bc(optimal_B, 20, 2) +optimal_U = U(optimal_A, optimal_B) + +print("The optimal U is ", optimal_U) +print("and was found at (A,B) =", (optimal_A, optimal_B)) +``` + +This allows us to do experiments, such as examining how consumption patterns change as prices or +wealth levels change. + +```{code-cell} python +# Create various prices +n_pa = 50 +prices_A = np.linspace(0.5, 5.0, n_pa) +W = 20 + +# Create lists to store the results of the optimal A and B calculation +optimal_As = [] +optimal_Bs = [] +for pa in prices_A: + result = minimize_scalar(objective, args=(W, pa)) + opt_B_val = result.x + + optimal_Bs.append(opt_B_val) + optimal_As.append(A_bc(opt_B_val, W, pa)) + +fig, ax = plt.subplots() + +ax.plot(prices_A, optimal_As, label="Purchased Apples") +ax.plot(prices_A, optimal_Bs, label="Purchased Bananas") +ax.set_xlabel("Price of Apples") +ax.legend() +``` + +````{admonition} Exercise +:name: dir3-5-1 + +See exercise 1 in the {ref}`exercise list `. +```` + +#### Satiation Point + +The above example is a particular utility function where consumers prefer to "eat" as much as +possible of every good available, but that may not be the case for all preferences. + +When an optimum exists for the unconstrained problem (e.g. with an infinite budget), it is called a +bliss point, or satiation. + +Instead of bananas and apples, consider a utility function for potato chips (`P`) and chocolate +bars (`C`). + +$$ +U(P, C) = -(P - 20)^2 - 2 * (C - 1)^2 +$$ + +To numerically calculate the maximum (which you can probably see through inspection), one must directly solve the constrained maximization problem. + + +````{admonition} Exercise +:name: dir3-5-2 + +See exercise 2 in the {ref}`exercise list `. +```` + +(ex3-5)= +## Exercises + +### Exercise 1 + +Try solving the constrained maximization problem by hand via the Lagrangian method. + +Is it surprising that the demand for bananas is unaffected by the change in apple prices? + +Why might this be? + +({ref}`back to text `) + +### Exercise 2 + +Using a similar approach to that of the apples/bananas example above, solve for the optimal +basket of potato chips and chocolate bars when `W = 10`, `p_P = 1`, and `p_C = 2`. + +```{code-cell} python +W = 10 +p_P = 1 +p_C = 2 + +# Your code here +``` + +What is the optimal basket if we expand the budget constraint to have `W = 50`? 
+ +```{code-cell} python +# Your code here +``` + +What is the optimal basket if we expand the budget constraint to have `W = 150`? + +```{code-cell} python +# Your code here +``` + +```{hint} +You can no longer assume that the `A_bc` function is always binding, as we did before, and will need to check results more carefully. + +While not required, you can take this opportunity to play around with other scipy functions such as Scipy optimize . +``` + +({ref}`back to text `) diff --git a/lectures/scientific/plotting.md b/lectures/scientific/plotting.md index 7816ed0b..f20c0d57 100644 --- a/lectures/scientific/plotting.md +++ b/lectures/scientific/plotting.md @@ -1,206 +1,206 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# Plotting - -**Prerequisites** - -- {doc}`Introduction to Numpy ` - -**Outcomes** - -- Understand components of matplotlib plots -- Make basic plots - - -```{literalinclude} ../_static/colab_light.raw -``` - -## Visualization - -One of the most important outputs of your analysis will be the visualizations that you choose to -communicate what you've discovered. - -Here are what some people -- whom we think have earned the right to an opinion on this -material -- have said with respect to data visualizations. - -> I spend hours thinking about how to get the story across in my visualizations. I don't mind taking that long because it's that five minutes of presenting it or someone getting it that can make or break a deal -- Goldman Sachs executive - - - - - -We won't have time to cover "how to make a compelling data visualization" in this lecture. - -Instead, we will focus on the basics of creating visualizations in Python. - -This will be a fast introduction, but this material appears in almost every -lecture going forward, which will help the concepts sink in. - -In almost any profession that you pursue, much of what you do involves communicating ideas to others. - -Data visualization can help you communicate these ideas effectively, and we encourage you to learn -more about what makes a useful visualization. - -We include some references that we have found useful below. - -* [The Functional Art: An introduction to information graphics and visualization](https://www.amazon.com/The-Functional-Art-introduction-visualization/dp/0321834739/) by Alberto Cairo -* [The Visual Display of Quantitative Information](https://www.amazon.com/Visual-Display-Quantitative-Information/dp/1930824130) by Edward Tufte -* [The Wall Street Journal Guide to Information Graphics: The Dos and Don'ts of Presenting Data, Facts, and Figures](https://www.amazon.com/Street-Journal-Guide-Information-Graphics/dp/0393347281) by Dona M Wong -* [Introduction to Data Visualization](http://paldhous.github.io/ucb/2016/dataviz/index.html) - -## `matplotlib` - -The most widely used plotting package in Python is matplotlib. - -The standard import alias is - -```{code-cell} python -import matplotlib.pyplot as plt -import numpy as np -``` - -Note above that we are using `matplotlib.pyplot` rather than just `matplotlib`. - -`pyplot` is a sub-module found in some large packages to further organize functions and types. We are able to give the `plt` alias to this sub-module. - -Additionally, when we are working in the notebook, we need tell matplotlib to display our images -inside of the notebook itself instead of creating new windows with the image. 
- -This is done by - -```{code-cell} python -%matplotlib inline -``` - -The commands with `%` before them are called [Magics](https://ipython.readthedocs.io/en/stable/interactive/magics.html). - -### First Plot - -Let's create our first plot! - -After creating it, we will walk through the steps one-by-one to understand what they do. - -```{code-cell} python -# Step 1 -fig, ax = plt.subplots() - -# Step 2 -x = np.linspace(0, 2*np.pi, 100) -y = np.sin(x) - -# Step 3 -ax.plot(x, y) -``` - -1. Create a figure and axis object which stores the information from our graph. -1. Generate data that we will plot. -1. Use the `x` and `y` data, and make a line plot on our axis, `ax`, by calling the `plot` method. - -### Difference between Figure and Axis - -We've found that the easiest way for us to distinguish between the figure and axis objects is to -think about them as a framed painting. - -The axis is the canvas; it is where we "draw" our plots. - -The figure is the entire framed painting (which inclues the axis itself!). - -We can also see this by setting certain elements of the figure to different colors. - -```{code-cell} python -fig, ax = plt.subplots() - -fig.set_facecolor("red") -ax.set_facecolor("blue") -``` - -This difference also means that you can place more than one axis on a figure. - -```{code-cell} python -# We specified the shape of the axes -- It means we will have two rows and three columns -# of axes on our figure -fig, axes = plt.subplots(2, 3) - -fig.set_facecolor("gray") - -# Can choose hex colors -colors = ["#065535", "#89ecda", "#ffd1dc", "#ff0000", "#6897bb", "#9400d3"] - -# axes is a numpy array and we want to iterate over a flat version of it -for (ax, c) in zip(axes.flat, colors): - ax.set_facecolor(c) - -fig.tight_layout() -``` - -### Functionality - -The matplotlib library is versatile and very flexible. - -You can see various examples of what it can do on the -[matplotlib example gallery](https://matplotlib.org/gallery.html). - -We work though a few examples to quickly introduce some possibilities. 
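-
-One extra example of our own first (the data below are just random draws, used purely for illustration): a histogram.
-
-**Histogram**
-
-```{code-cell} python
-np.random.seed(0)
-data = np.random.randn(1_000)   # 1,000 standard normal draws
-
-fig, ax = plt.subplots()
-ax.hist(data, bins=30, color="gray", edgecolor="black")
-ax.set_title("Histogram of 1,000 standard normal draws")
-```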
- -**Bar** - -```{code-cell} python -countries = ["CAN", "MEX", "USA"] -populations = [36.7, 129.2, 325.700] -land_area = [3.850, 0.761, 3.790] - -fig, ax = plt.subplots(2) - -ax[0].bar(countries, populations, align="center") -ax[0].set_title("Populations (in millions)") - -ax[1].bar(countries, land_area, align="center") -ax[1].set_title("Land area (in millions miles squared)") - -fig.tight_layout() -``` - -**Scatter and annotation** - -```{code-cell} python -N = 50 - -np.random.seed(42) - -x = np.random.rand(N) -y = np.random.rand(N) -colors = np.random.rand(N) -area = np.pi * (15 * np.random.rand(N))**2 # 0 to 15 point radii - -fig, ax = plt.subplots() - -ax.scatter(x, y, s=area, c=colors, alpha=0.5) - -ax.annotate( - "First point", xy=(x[0], y[0]), xycoords="data", - xytext=(25, -25), textcoords="offset points", - arrowprops=dict(arrowstyle="->", connectionstyle="arc3,rad=0.6") -) -``` - -**Fill between** - -```{code-cell} python -x = np.linspace(0, 1, 500) -y = np.sin(4 * np.pi * x) * np.exp(-5 * x) - -fig, ax = plt.subplots() - -ax.grid(True) -ax.fill(x, y) -``` - +--- +jupytext: + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# Plotting + +**Prerequisites** + +- {doc}`Introduction to Numpy ` + +**Outcomes** + +- Understand components of matplotlib plots +- Make basic plots + + +```{literalinclude} ../_static/colab_light.raw +``` + +## Visualization + +One of the most important outputs of your analysis will be the visualizations that you choose to +communicate what you've discovered. + +Here are what some people -- whom we think have earned the right to an opinion on this +material -- have said with respect to data visualizations. + +> I spend hours thinking about how to get the story across in my visualizations. I don't mind taking that long because it's that five minutes of presenting it or someone getting it that can make or break a deal -- Goldman Sachs executive + + + + + +We won't have time to cover "how to make a compelling data visualization" in this lecture. + +Instead, we will focus on the basics of creating visualizations in Python. + +This will be a fast introduction, but this material appears in almost every +lecture going forward, which will help the concepts sink in. + +In almost any profession that you pursue, much of what you do involves communicating ideas to others. + +Data visualization can help you communicate these ideas effectively, and we encourage you to learn +more about what makes a useful visualization. + +We include some references that we have found useful below. + +* [The Functional Art: An introduction to information graphics and visualization](https://www.amazon.com/The-Functional-Art-introduction-visualization/dp/0321834739/) by Alberto Cairo +* [The Visual Display of Quantitative Information](https://www.amazon.com/Visual-Display-Quantitative-Information/dp/1930824130) by Edward Tufte +* [The Wall Street Journal Guide to Information Graphics: The Dos and Don'ts of Presenting Data, Facts, and Figures](https://www.amazon.com/Street-Journal-Guide-Information-Graphics/dp/0393347281) by Dona M Wong +* [Introduction to Data Visualization](http://paldhous.github.io/ucb/2016/dataviz/index.html) + +## `matplotlib` + +The most widely used plotting package in Python is matplotlib. 
+ +The standard import alias is + +```{code-cell} python +import matplotlib.pyplot as plt +import numpy as np +``` + +Note above that we are using `matplotlib.pyplot` rather than just `matplotlib`. + +`pyplot` is a sub-module found in some large packages to further organize functions and types. We are able to give the `plt` alias to this sub-module. + +Additionally, when we are working in the notebook, we need tell matplotlib to display our images +inside of the notebook itself instead of creating new windows with the image. + +This is done by + +```{code-cell} python +%matplotlib inline +``` + +The commands with `%` before them are called [Magics](https://ipython.readthedocs.io/en/stable/interactive/magics.html). + +### First Plot + +Let's create our first plot! + +After creating it, we will walk through the steps one-by-one to understand what they do. + +```{code-cell} python +# Step 1 +fig, ax = plt.subplots() + +# Step 2 +x = np.linspace(0, 2*np.pi, 100) +y = np.sin(x) + +# Step 3 +ax.plot(x, y) +``` + +1. Create a figure and axis object which stores the information from our graph. +1. Generate data that we will plot. +1. Use the `x` and `y` data, and make a line plot on our axis, `ax`, by calling the `plot` method. + +### Difference between Figure and Axis + +We've found that the easiest way for us to distinguish between the figure and axis objects is to +think about them as a framed painting. + +The axis is the canvas; it is where we "draw" our plots. + +The figure is the entire framed painting (which inclues the axis itself!). + +We can also see this by setting certain elements of the figure to different colors. + +```{code-cell} python +fig, ax = plt.subplots() + +fig.set_facecolor("red") +ax.set_facecolor("blue") +``` + +This difference also means that you can place more than one axis on a figure. + +```{code-cell} python +# We specified the shape of the axes -- It means we will have two rows and three columns +# of axes on our figure +fig, axes = plt.subplots(2, 3) + +fig.set_facecolor("gray") + +# Can choose hex colors +colors = ["#065535", "#89ecda", "#ffd1dc", "#ff0000", "#6897bb", "#9400d3"] + +# axes is a numpy array and we want to iterate over a flat version of it +for (ax, c) in zip(axes.flat, colors): + ax.set_facecolor(c) + +fig.tight_layout() +``` + +### Functionality + +The matplotlib library is versatile and very flexible. + +You can see various examples of what it can do on the +[matplotlib example gallery](https://matplotlib.org/gallery.html). + +We work though a few examples to quickly introduce some possibilities. 
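+
+We also add one small example of our own (not part of the original gallery list): a line plot with axis labels, a title, and a legend.
+
+**Line plot with labels and a legend**
+
+```{code-cell} python
+x = np.linspace(0, 2 * np.pi, 100)
+
+fig, ax = plt.subplots()
+ax.plot(x, np.sin(x), label="sin(x)")
+ax.plot(x, np.cos(x), label="cos(x)", linestyle="--")
+ax.set_xlabel("x")
+ax.set_ylabel("value")
+ax.set_title("Two labeled lines")
+ax.legend()
+```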
+ +**Bar** + +```{code-cell} python +countries = ["CAN", "MEX", "USA"] +populations = [36.7, 129.2, 325.700] +land_area = [3.850, 0.761, 3.790] + +fig, ax = plt.subplots(2) + +ax[0].bar(countries, populations, align="center") +ax[0].set_title("Populations (in millions)") + +ax[1].bar(countries, land_area, align="center") +ax[1].set_title("Land area (in millions miles squared)") + +fig.tight_layout() +``` + +**Scatter and annotation** + +```{code-cell} python +N = 50 + +np.random.seed(42) + +x = np.random.rand(N) +y = np.random.rand(N) +colors = np.random.rand(N) +area = np.pi * (15 * np.random.rand(N))**2 # 0 to 15 point radii + +fig, ax = plt.subplots() + +ax.scatter(x, y, s=area, c=colors, alpha=0.5) + +ax.annotate( + "First point", xy=(x[0], y[0]), xycoords="data", + xytext=(25, -25), textcoords="offset points", + arrowprops=dict(arrowstyle="->", connectionstyle="arc3,rad=0.6") +) +``` + +**Fill between** + +```{code-cell} python +x = np.linspace(0, 1, 500) +y = np.sin(4 * np.pi * x) * np.exp(-5 * x) + +fig, ax = plt.subplots() + +ax.grid(True) +ax.fill(x, y) +``` + diff --git a/lectures/scientific/randomness.md b/lectures/scientific/randomness.md index 2be87018..419375da 100644 --- a/lectures/scientific/randomness.md +++ b/lectures/scientific/randomness.md @@ -1,641 +1,641 @@ ---- -jupytext: - text_representation: - extension: .md - format_name: myst -kernelspec: - display_name: Python 3 - language: python - name: python3 ---- - -# Randomness - -**Prerequisites** - -- {doc}`Introduction to Numpy ` -- {doc}`Applied Linear Algebra ` - -**Outcomes** - -- Recall basic probability -- Draw random numbers from numpy -- Understand why simulation is useful -- Understand the basics of Markov chains and using the `quantecon` library to study them -- Simulate discrete and continuous random variables and processes - - -```{literalinclude} ../_static/colab_light.raw -``` - -## Randomness - -We will use the `numpy.random` package to simulate randomness in Python. - -This lecture will present various probability distributions and then use -numpy.random to numerically verify some of the facts associated with them. - -We import `numpy` as usual - -```{code-cell} python -import numpy as np -import matplotlib.pyplot as plt -%matplotlib inline -``` - -### Probability - -Before we learn how to use Python to generate randomness, we should make sure -that we all agree on some basic concepts of probability. - -To think about the probability of some event occurring, we must understand what possible -events could occur -- mathematicians refer to this as the *event space*. - -Some examples are - -* For a coin flip, the coin could either come up heads, tails, or land on its side. -* The inches of rain falling in a certain location on a given day could be any real - number between 0 and $\infty$. -* The change in an S&P500 stock price could be any real number between - $-$ opening price and $\infty$. -* An individual's employment status tomorrow could either be employed or unemployed. -* And the list goes on... - -Notice that in some of these cases, the event space can be counted (coin flip and employment status) -while in others, the event space cannot be counted (rain and stock prices). - -We refer to random variables with countable event spaces as *discrete random variables* and -random variables with uncountable event spaces as *continuous random variables*. - -We then call certain numbers 'probabilities' and associate them with events from the event space. - -The following is true about probabilities. - -1. 
The probability of any event must be greater than or equal to 0. -1. The probability of all events from the event space must sum (or integrate) to 1. -1. If two events cannot occur at same time, then the probability that at least one of them occurs is - the sum of the probabilities that each event occurs (known as independence). - -We won't rely on these for much of what we learn in this class, but occasionally, these facts will -help us reason through what is happening. - -### Simulating Randomness in Python - -One of the most basic random numbers is a variable that has equal probability of being any value -between 0 and 1. - -You may have previously learned about this probability distribution as the Uniform(0, 1). - -Let's dive into generating some random numbers. - -Run the code below multiple times and see what numbers you get. - -```{code-cell} python -np.random.rand() -``` - -We can also generate arrays of random numbers. - -```{code-cell} python -np.random.rand(25) -``` - -```{code-cell} python -np.random.rand(5, 5) -``` - -```{code-cell} python -np.random.rand(2, 3, 4) -``` - -### Why Do We Need Randomness? - -As economists and data scientists, we study complex systems. - -These systems have inherent randomness, but they do not readily reveal their underlying distribution -to us. - -In cases where we face this difficulty, we turn to a set of tools known as Monte Carlo -methods. - -These methods effectively boil down to repeatedly simulating some event (or events) and looking at -the outcome distribution. - -This tool is used to inform decisions in search and rescue missions, election predictions, sports, -and even by the Federal Reserve. - -The reasons that Monte Carlo methods work is a mathematical theorem known as the *Law of Large -Numbers*. - -The Law of Large Numbers basically says that under relatively general conditions, the distribution of simulated outcomes will mimic the true distribution as the number of simulated events goes to infinity. - -We already know how the uniform distribution looks, so let's demonstrate the Law of Large Numbers by approximating the uniform distribution. - -```{code-cell} python -# Draw various numbers of uniform[0, 1] random variables -draws_10 = np.random.rand(10) -draws_200 = np.random.rand(200) -draws_10000 = np.random.rand(10_000) - -# Plot their histograms -fig, ax = plt.subplots(3) - -ax[0].set_title("Histogram with 10 draws") -ax[0].hist(draws_10) - -ax[1].set_title("Histogram with 200 draws") -ax[1].hist(draws_200) - -ax[2].set_title("Histogram with 10,000 draws") -ax[2].hist(draws_10000) - -fig.tight_layout() -``` - - -````{admonition} Exercise -:name: dir3-4-1 - -See exercise 1 in the {ref}`exercise list `. -```` - - -### Discrete Distributions - -Sometimes we will encounter variables that can only take one of a -few possible values. - -We refer to this type of random variable as a discrete distribution. - -For example, consider a small business loan company. - -Imagine that the company's loan requires a repayment of $\$25,000$ and must be repaid 1 year -after the loan was made. - -The company discounts the future at 5%. - -Additionally, the loans made are repaid in full with 75% probability, while -$\$12,500$ of loans is repaid with probability 20%, and no repayment with 5% -probability. - -How much would the small business loan company be willing to loan if they'd like to --- on average -- break even? - -In this case, we can compute this by hand: - -The amount repaid, on average, is: $0.75(25,000) + 0.2(12,500) + 0.05(0) = 21,250$. 
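-
-As a quick check of that arithmetic, we can compute the same expected repayment with a dot product (the arrays below simply restate the probabilities and payoffs from the text):
-
-```{code-cell} python
-# Expected repayment as a probability-weighted average of the outcomes
-probs = np.array([0.75, 0.20, 0.05])
-payoffs = np.array([25_000.0, 12_500.0, 0.0])
-print(probs @ payoffs)   # 21,250
-```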
- -Since we'll receive that amount in one year, we have to discount it: -$\frac{1}{1+0.05} 21,250 \approx 20238$. - -We can now verify by simulating the outcomes of many loans. - -```{code-cell} python -# You'll see why we call it `_slow` soon :) -def simulate_loan_repayments_slow(N, r=0.05, repayment_full=25_000.0, - repayment_part=12_500.0): - repayment_sims = np.zeros(N) - for i in range(N): - x = np.random.rand() # Draw a random number - - # Full repayment 75% of time - if x < 0.75: - repaid = repayment_full - elif x < 0.95: - repaid = repayment_part - else: - repaid = 0.0 - - repayment_sims[i] = (1 / (1 + r)) * repaid - - return repayment_sims - -print(np.mean(simulate_loan_repayments_slow(25_000))) -``` - -#### Aside: Vectorized Computations - -The code above illustrates the concepts we were discussing but is much slower than -necessary. - -Below is a version of our function that uses numpy arrays to perform computations -instead of only storing the values. - -```{code-cell} python -def simulate_loan_repayments(N, r=0.05, repayment_full=25_000.0, - repayment_part=12_500.0): - """ - Simulate present value of N loans given values for discount rate and - repayment values - """ - random_numbers = np.random.rand(N) - - # start as 0 -- no repayment - repayment_sims = np.zeros(N) - - # adjust for full and partial repayment - partial = random_numbers <= 0.20 - repayment_sims[partial] = repayment_part - - full = ~partial & (random_numbers <= 0.95) - repayment_sims[full] = repayment_full - - repayment_sims = (1 / (1 + r)) * repayment_sims - - return repayment_sims - -np.mean(simulate_loan_repayments(25_000)) -``` - -We'll quickly demonstrate the time difference in running both function versions. - -```{code-cell} python -%timeit simulate_loan_repayments_slow(250_000) -``` - -```{code-cell} python -%timeit simulate_loan_repayments(250_000) -``` - -The timings for my computer were 167 ms for `simulate_loan_repayments_slow` and 5.05 ms for -`simulate_loan_repayments`. - -This function is simple enough that both times are acceptable, but the 33x time difference could -matter in a more complicated operation. - -This illustrates a concept called *vectorization*, which is when computations -operate on an entire array at a time. - -In general, numpy code that is *vectorized* will perform better than numpy code that operates on one -element at a time. - -For more information see the -[QuantEcon lecture on performance Python](https://python-programming.quantecon.org/numba.html) code. - -#### Profitability Threshold - -Rather than looking for the break even point, we might be interested in the largest loan size that -ensures we still have a 95% probability of profitability in a year we make 250 loans. - -This is something that could be computed by hand, but it is much easier to answer through -simulation! - -If we simulate 250 loans many times and keep track of what the outcomes look like, then we can look -at the the 5th percentile of total repayment to find the loan size needed for 95% probability of -being profitable. 
- -```{code-cell} python -def simulate_year_of_loans(N=250, K=1000): - - # Create array where we store the values - avg_repayments = np.zeros(K) - for year in range(K): - - repaid_year = 0.0 - n_loans = simulate_loan_repayments(N) - avg_repayments[year] = n_loans.mean() - - return avg_repayments - -loan_repayment_outcomes = simulate_year_of_loans(N=250) - -# Think about why we use the 5th percentile of outcomes to -# compute when we are profitable 95% of time -lro_5 = np.percentile(loan_repayment_outcomes, 5) - -print("The largest loan size such that we were profitable 95% of time is") -print(lro_5) -``` - -Now let's consider what we could learn if our loan company had even more detailed information about -how the life of their loans progressed. - -#### Loan States - -Loans can have 3 potential statuses (or states): - -1. Repaying: Payments are being made on loan. -1. Delinquency: No payments are currently being made, but they might be made in the future. -1. Default: No payments are currently being made and no more payments will be made in future. - -The small business loans company knows the following: - -* If a loan is currently in repayment, then it has an 85% probability of continuing being repaid, a - 10% probability of going into delinquency, and a 5% probability of going into default. -* If a loan is currently in delinquency, then it has a 25% probability of returning to repayment, a - 60% probability of staying delinquent, and a 15% probability of going into default. -* If a loan is currently in default, then it remains in default with 100% probability. - -For simplicity, let's imagine that 12 payments are made during the life of a loan, even though -this means people who experience delinquency won't be required to repay their remaining balance. - -Let's write the code required to perform this dynamic simulation. - -```{code-cell} python -def simulate_loan_lifetime(monthly_payment): - - # Create arrays to store outputs - payments = np.zeros(12) - # Note: dtype 'U12' means a string with no more than 12 characters - statuses = np.array(4*["repaying", "delinquency", "default"], dtype="U12") - - # Everyone is repaying during their first month - payments[0] = monthly_payment - statuses[0] = "repaying" - - for month in range(1, 12): - rn = np.random.rand() - - if (statuses[month-1] == "repaying"): - if rn < 0.85: - payments[month] = monthly_payment - statuses[month] = "repaying" - elif rn < 0.95: - payments[month] = 0.0 - statuses[month] = "delinquency" - else: - payments[month] = 0.0 - statuses[month] = "default" - elif (statuses[month-1] == "delinquency"): - if rn < 0.25: - payments[month] = monthly_payment - statuses[month] = "repaying" - elif rn < 0.85: - payments[month] = 0.0 - statuses[month] = "delinquency" - else: - payments[month] = 0.0 - statuses[month] = "default" - else: # Default -- Stays in default after it gets there - payments[month] = 0.0 - statuses[month] = "default" - - return payments, statuses -``` - -We can use this model of the world to answer even more questions than the last model! - -For example, we can think about things like - -* For the defaulted loans, how many payments did they make before going into default? -* For those who partially repaid, how much was repaid before the 12 months was over? - -Unbeknownst to you, we have just introduced a well-known mathematical concept known as a Markov -chain. 
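-
-Before formalizing that idea, here is one way to attack the first question above with the simulator we just wrote. This is a rough sketch: the 1,000 repetitions and the monthly payment of 2,500 are arbitrary choices made purely for illustration.
-
-```{code-cell} python
-# Sketch: among simulated loans that default, count the payments made beforehand
-payments_before_default = []
-for _ in range(1_000):
-    payments, statuses = simulate_loan_lifetime(2_500.0)
-    if "default" in statuses:
-        payments_before_default.append(np.sum(payments > 0))
-
-print("Share of simulated loans that defaulted:", len(payments_before_default) / 1_000)
-print("Average payments made before default:", np.mean(payments_before_default))
-```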
- -A Markov chain is a random process (Note: Random process is a sequence of random variables -observed over time) where the probability of something happening tomorrow only depends on what we -can observe today. - -In our small business loan example, this just means that the small business loan's repayment status -tomorrow only depended on what its repayment status was today. - -Markov chains often show up in economics and statistics, so we decided a simple introduction would -be helpful, but we leave out many details for the interested reader to find. - -A Markov chain is defined by three objects: - -1. A description of the possible states and their associated value. -1. A complete description of the probability of moving from one state to all other states. -1. An initial distribution over the states (often a vector of all zeros except for a single 1 for - some particular state). - -For the example above, we'll define each of these three things in the Python code below. - -```{code-cell} python -# 1. State description -state_values = ["repaying", "delinquency", "default"] - -# 2. Transition probabilities: encoded in a matrix (2d-array) where element [i, j] -# is the probability of moving from state i to state j -P = np.array([[0.85, 0.1, 0.05], [0.25, 0.6, 0.15], [0, 0, 1]]) - -# 3. Initial distribution: assume loans start in repayment -x0 = np.array([1, 0, 0]) -``` - -Now that we have these objects defined, we can use the a `MarkovChain` class from the -[quantecon python library](https://github.com/QuantEcon/QuantEcon.py/) to analyze this model. - -```{code-cell} python -import quantecon as qe - -mc = qe.markov.MarkovChain(P, state_values) -``` - -We can use the `mc` object to do common Markov chain operations. - -The `simulate` method will simulate the Markov chain for a specified number of steps: - -```{code-cell} python -mc.simulate(12, init="repaying") -``` - -Suppose we were to simulate the Markov chain for an infinite number of steps. - -Given the random nature of transitions, we might end up taking different paths at any given moment. - -We can summarize all possible paths over time by keeping track of a distribution. - -Below, we will print out the distribution for the first 10 time steps, -starting from a distribution where the debtor is repaying in the first step. - -```{code-cell} python -x = x0 -for t in range(10): - print(f"At time {t} the distribution is {x}") - x = mc.P.T @ x -``` - -````{admonition} Exercise -:name: dir3-4-2 - -See exercise 2 in the {ref}`exercise list `. -```` - -````{admonition} Exercise -:name: dir3-4-3 - -See exercise 3 in the {ref}`exercise list `. -```` - -### Continuous Distributions - -Recall that a continuous distribution is one where the value can take on an uncountable number of values. - -It differs from a discrete distribution in that the events are not -countable. - -We can use simulation to learn things about continuous distributions as we did with discrete -distributions. - -Let's use simulation to study what is arguably the most commonly encountered -distributions -- the normal distribution. - -The Normal (sometimes referred to as the Gaussian distribution) is bell-shaped and completely -described by the mean and variance of that distribution. - -The mean is often referred to as $\mu$ and the variance as $\sigma^2$. - -Let's take a look at the normal distribution. 
- -```{code-cell} python -# scipy is an extension of numpy, and the stats -# subpackage has tools for working with various probability distributions -import scipy.stats as st - -x = np.linspace(-5, 5, 100) - -# NOTE: first argument to st.norm is mean, second is standard deviation sigma (not sigma^2) -pdf_x = st.norm(0.0, 1.0).pdf(x) - -fig, ax = plt.subplots() - -ax.set_title(r"Normal Distribution ($\mu = 0, \sigma = 1$)") -ax.plot(x, pdf_x) -``` - -Another common continuous distribution used in economics is the gamma distribution. - -A gamma distribution is defined for all positive numbers and described by both a shape -parameter $k$ and a scale parameter $\theta$. - -Let's see what the distribution looks like for various choices of $k$ and $\theta$. - -```{code-cell} python -def plot_gamma(k, theta, x, ax=None): - if ax is None: - _, ax = plt.subplots() - - # scipy refers to the rate parameter beta as a scale parameter - pdf_x = st.gamma(k, scale=theta).pdf(x) - ax.plot(x, pdf_x, label=f"k = {k} theta = {theta}") - - return ax - -fig, ax = plt.subplots(figsize=(10, 6)) -x = np.linspace(0.1, 20, 130) -plot_gamma(2.0, 1.0, x, ax) -plot_gamma(3.0, 1.0, x, ax) -plot_gamma(3.0, 2.0, x, ax) -plot_gamma(3.0, 0.5, x, ax) -ax.set_ylim((0, 0.6)) -ax.set_xlim((0, 20)) -ax.legend(); -``` - -````{admonition} Exercise -:name: dir3-4-4 - -See exercise 4 in the {ref}`exercise list `. -```` - - -(ex3-4)= -## Exercises - -### Exercise 1 - -Wikipedia and other credible statistics sources tell us that the mean and -variance of the Uniform(0, 1) distribution are (1/2, 1/12) respectively. - -How could we check whether the numpy random numbers approximate these -values? - -({ref}`back to text `) - -### Exercise 2 - -In this exercise, we explore the long-run, or stationary, distribution of the Markov chain. - -The stationary distribution of a Markov chain is the probability distribution that would -result after an infinite number of steps *for any initial distribution*. - -Mathematically, a stationary distribution $x$ is a distribution where $x = P'x$. - -In the code cell below, use the `stationary_distributions` property of `mc` to -determine the stationary distribution of our Markov chain. - -After doing your computation, think about the answer... think about why our transition -probabilities must lead to this outcome. - - -```{code-cell} python -# your code here -``` - -({ref}`back to text `) - -### Exercise 3 - -Let's revisit the unemployment example from the {doc}`linear algebra lecture `. - -We'll repeat necessary details here. - -Consider an economy where in any given year, $\alpha = 5\%$ of workers lose their jobs, and -$\phi = 10\%$ of unemployed workers find jobs. - -Initially, 90% of the 1,000,000 workers are employed. - -Also suppose that the average employed worker earns 10 dollars, while an unemployed worker -earns 1 dollar per period. - -You now have four tasks: - -1. Represent this problem as a Markov chain by defining the three components defined above. -1. Construct an instance of the quantecon MarkovChain by using the objects defined in part 1. -1. Simulate the Markov chain 30 times for 50 time periods, and plot each chain over time (see - helper code below). -1. Determine the average long run payment for a worker in this setting - -```{hint} -Think about the stationary distribution. 
-``` - -```{code-cell} python -# define components here - -# construct Markov chain - -# simulate (see docstring for how to do many repetitions of -# the simulation in one function call) -# uncomment the lines below and fill in the blanks -# sim = XXXXX.simulate(XXXX) -# fig, ax = plt.subplots(figsize=(10, 8)) -# ax.plot(range(50), sim.T, alpha=0.4) - -# Long-run average payment -``` - -({ref}`back to text `) - - -### Exercise 4 - -Assume you have been given the opportunity to choose between one of three financial assets: - -You will be given the asset for free, allowed to hold it indefinitely, and keeping all payoffs. - -Also assume the assets' payoffs are distributed as follows: - -1. Normal with $\mu = 10, \sigma = 5$ -1. Gamma with $k = 5.3, \theta = 2$ -1. Gamma with $k = 5, \theta = 2$ - -Use `scipy.stats` to answer the following questions: - -- Which asset has the highest average returns? -- Which asset has the highest median returns? -- Which asset has the lowest coefficient of variation (standard deviation divided by mean)? -- Which asset would you choose? Why? - -```{hint} -There is not a single right answer here. Be creative -and express your preferences. -``` - -```{code-cell} python -# your code here -``` - -({ref}`back to text `) +--- +jupytext: + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +# Randomness + +**Prerequisites** + +- {doc}`Introduction to Numpy ` +- {doc}`Applied Linear Algebra ` + +**Outcomes** + +- Recall basic probability +- Draw random numbers from numpy +- Understand why simulation is useful +- Understand the basics of Markov chains and using the `quantecon` library to study them +- Simulate discrete and continuous random variables and processes + + +```{literalinclude} ../_static/colab_light.raw +``` + +## Randomness + +We will use the `numpy.random` package to simulate randomness in Python. + +This lecture will present various probability distributions and then use +numpy.random to numerically verify some of the facts associated with them. + +We import `numpy` as usual + +```{code-cell} python +import numpy as np +import matplotlib.pyplot as plt +%matplotlib inline +``` + +### Probability + +Before we learn how to use Python to generate randomness, we should make sure +that we all agree on some basic concepts of probability. + +To think about the probability of some event occurring, we must understand what possible +events could occur -- mathematicians refer to this as the *event space*. + +Some examples are + +* For a coin flip, the coin could either come up heads, tails, or land on its side. +* The inches of rain falling in a certain location on a given day could be any real + number between 0 and $\infty$. +* The change in an S&P500 stock price could be any real number between + $-$ opening price and $\infty$. +* An individual's employment status tomorrow could either be employed or unemployed. +* And the list goes on... + +Notice that in some of these cases, the event space can be counted (coin flip and employment status) +while in others, the event space cannot be counted (rain and stock prices). + +We refer to random variables with countable event spaces as *discrete random variables* and +random variables with uncountable event spaces as *continuous random variables*. + +We then call certain numbers 'probabilities' and associate them with events from the event space. + +The following is true about probabilities. + +1. 
The probability of any event must be greater than or equal to 0.
+1. The probability of all events from the event space must sum (or integrate) to 1.
+1. If two events cannot occur at the same time, then the probability that at least one of them occurs is
+   the sum of the probabilities that each event occurs (such events are called *mutually exclusive*).
+
+We won't rely on these for much of what we learn in this class, but occasionally, these facts will
+help us reason through what is happening.
+
+### Simulating Randomness in Python
+
+One of the most basic random variables is one that is equally likely to take any value
+between 0 and 1.
+
+You may have previously learned about this probability distribution as the Uniform(0, 1).
+
+Let's dive into generating some random numbers.
+
+Run the code below multiple times and see what numbers you get.
+
+```{code-cell} python
+np.random.rand()
+```
+
+We can also generate arrays of random numbers.
+
+```{code-cell} python
+np.random.rand(25)
+```
+
+```{code-cell} python
+np.random.rand(5, 5)
+```
+
+```{code-cell} python
+np.random.rand(2, 3, 4)
+```
+
+### Why Do We Need Randomness?
+
+As economists and data scientists, we study complex systems.
+
+These systems have inherent randomness, but they do not readily reveal their underlying distribution
+to us.
+
+In cases where we face this difficulty, we turn to a set of tools known as Monte Carlo
+methods.
+
+These methods effectively boil down to repeatedly simulating some event (or events) and looking at
+the outcome distribution.
+
+They are used to inform decisions in search and rescue missions, election predictions, sports,
+and even by the Federal Reserve.
+
+The reason that Monte Carlo methods work is a mathematical theorem known as the *Law of Large
+Numbers*.
+
+The Law of Large Numbers basically says that under relatively general conditions, the distribution of simulated outcomes will mimic the true distribution as the number of simulated events goes to infinity.
+
+We already know how the uniform distribution looks, so let's demonstrate the Law of Large Numbers by approximating the uniform distribution.
+
+```{code-cell} python
+# Draw various numbers of uniform[0, 1] random variables
+draws_10 = np.random.rand(10)
+draws_200 = np.random.rand(200)
+draws_10000 = np.random.rand(10_000)
+
+# Plot their histograms
+fig, ax = plt.subplots(3)
+
+ax[0].set_title("Histogram with 10 draws")
+ax[0].hist(draws_10)
+
+ax[1].set_title("Histogram with 200 draws")
+ax[1].hist(draws_200)
+
+ax[2].set_title("Histogram with 10,000 draws")
+ax[2].hist(draws_10000)
+
+fig.tight_layout()
+```
+
+
+````{admonition} Exercise
+:name: dir3-4-1
+
+See exercise 1 in the {ref}`exercise list `.
+````
+
+
+### Discrete Distributions
+
+Sometimes we will encounter variables that can only take one of a
+few possible values.
+
+We say that this type of random variable has a discrete distribution.
+
+For example, consider a small business loan company.
+
+Imagine that the company's loan requires a repayment of $\$25,000$ and must be repaid 1 year
+after the loan was made.
+
+The company discounts the future at 5%.
+
+Additionally, the loans are repaid in full with 75% probability, partially repaid
+($\$12,500$) with 20% probability, and not repaid at all with 5%
+probability.
+
+How much would the small business loan company be willing to loan if they'd like to
+-- on average -- break even?
+
+In this case, we can compute the answer by hand:
+
+The amount repaid, on average, is: $0.75(25,000) + 0.2(12,500) + 0.05(0) = 21,250$.
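+
+As a quick check on that arithmetic, the expected repayment can also be computed directly in
+Python. This is only a sketch of the hand calculation above, using the repayment amounts and
+probabilities stated in this example; the variable names are just for illustration.
+
+```{code-cell} python
+# Possible repayment amounts and their probabilities (from the example above)
+repayment_amounts = np.array([25_000.0, 12_500.0, 0.0])
+repayment_probs = np.array([0.75, 0.20, 0.05])
+
+# Expected repayment is the probability-weighted sum of the outcomes
+expected_repayment = repayment_amounts @ repayment_probs
+print(expected_repayment)
+```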
+ +Since we'll receive that amount in one year, we have to discount it: +$\frac{1}{1+0.05} 21,250 \approx 20238$. + +We can now verify by simulating the outcomes of many loans. + +```{code-cell} python +# You'll see why we call it `_slow` soon :) +def simulate_loan_repayments_slow(N, r=0.05, repayment_full=25_000.0, + repayment_part=12_500.0): + repayment_sims = np.zeros(N) + for i in range(N): + x = np.random.rand() # Draw a random number + + # Full repayment 75% of time + if x < 0.75: + repaid = repayment_full + elif x < 0.95: + repaid = repayment_part + else: + repaid = 0.0 + + repayment_sims[i] = (1 / (1 + r)) * repaid + + return repayment_sims + +print(np.mean(simulate_loan_repayments_slow(25_000))) +``` + +#### Aside: Vectorized Computations + +The code above illustrates the concepts we were discussing but is much slower than +necessary. + +Below is a version of our function that uses numpy arrays to perform computations +instead of only storing the values. + +```{code-cell} python +def simulate_loan_repayments(N, r=0.05, repayment_full=25_000.0, + repayment_part=12_500.0): + """ + Simulate present value of N loans given values for discount rate and + repayment values + """ + random_numbers = np.random.rand(N) + + # start as 0 -- no repayment + repayment_sims = np.zeros(N) + + # adjust for full and partial repayment + partial = random_numbers <= 0.20 + repayment_sims[partial] = repayment_part + + full = ~partial & (random_numbers <= 0.95) + repayment_sims[full] = repayment_full + + repayment_sims = (1 / (1 + r)) * repayment_sims + + return repayment_sims + +np.mean(simulate_loan_repayments(25_000)) +``` + +We'll quickly demonstrate the time difference in running both function versions. + +```{code-cell} python +%timeit simulate_loan_repayments_slow(250_000) +``` + +```{code-cell} python +%timeit simulate_loan_repayments(250_000) +``` + +The timings for my computer were 167 ms for `simulate_loan_repayments_slow` and 5.05 ms for +`simulate_loan_repayments`. + +This function is simple enough that both times are acceptable, but the 33x time difference could +matter in a more complicated operation. + +This illustrates a concept called *vectorization*, which is when computations +operate on an entire array at a time. + +In general, numpy code that is *vectorized* will perform better than numpy code that operates on one +element at a time. + +For more information see the +[QuantEcon lecture on performance Python](https://python-programming.quantecon.org/numba.html) code. + +#### Profitability Threshold + +Rather than looking for the break even point, we might be interested in the largest loan size that +ensures we still have a 95% probability of profitability in a year we make 250 loans. + +This is something that could be computed by hand, but it is much easier to answer through +simulation! + +If we simulate 250 loans many times and keep track of what the outcomes look like, then we can look +at the the 5th percentile of total repayment to find the loan size needed for 95% probability of +being profitable. 
+ +```{code-cell} python +def simulate_year_of_loans(N=250, K=1000): + + # Create array where we store the values + avg_repayments = np.zeros(K) + for year in range(K): + + repaid_year = 0.0 + n_loans = simulate_loan_repayments(N) + avg_repayments[year] = n_loans.mean() + + return avg_repayments + +loan_repayment_outcomes = simulate_year_of_loans(N=250) + +# Think about why we use the 5th percentile of outcomes to +# compute when we are profitable 95% of time +lro_5 = np.percentile(loan_repayment_outcomes, 5) + +print("The largest loan size such that we were profitable 95% of time is") +print(lro_5) +``` + +Now let's consider what we could learn if our loan company had even more detailed information about +how the life of their loans progressed. + +#### Loan States + +Loans can have 3 potential statuses (or states): + +1. Repaying: Payments are being made on loan. +1. Delinquency: No payments are currently being made, but they might be made in the future. +1. Default: No payments are currently being made and no more payments will be made in future. + +The small business loans company knows the following: + +* If a loan is currently in repayment, then it has an 85% probability of continuing being repaid, a + 10% probability of going into delinquency, and a 5% probability of going into default. +* If a loan is currently in delinquency, then it has a 25% probability of returning to repayment, a + 60% probability of staying delinquent, and a 15% probability of going into default. +* If a loan is currently in default, then it remains in default with 100% probability. + +For simplicity, let's imagine that 12 payments are made during the life of a loan, even though +this means people who experience delinquency won't be required to repay their remaining balance. + +Let's write the code required to perform this dynamic simulation. + +```{code-cell} python +def simulate_loan_lifetime(monthly_payment): + + # Create arrays to store outputs + payments = np.zeros(12) + # Note: dtype 'U12' means a string with no more than 12 characters + statuses = np.array(4*["repaying", "delinquency", "default"], dtype="U12") + + # Everyone is repaying during their first month + payments[0] = monthly_payment + statuses[0] = "repaying" + + for month in range(1, 12): + rn = np.random.rand() + + if (statuses[month-1] == "repaying"): + if rn < 0.85: + payments[month] = monthly_payment + statuses[month] = "repaying" + elif rn < 0.95: + payments[month] = 0.0 + statuses[month] = "delinquency" + else: + payments[month] = 0.0 + statuses[month] = "default" + elif (statuses[month-1] == "delinquency"): + if rn < 0.25: + payments[month] = monthly_payment + statuses[month] = "repaying" + elif rn < 0.85: + payments[month] = 0.0 + statuses[month] = "delinquency" + else: + payments[month] = 0.0 + statuses[month] = "default" + else: # Default -- Stays in default after it gets there + payments[month] = 0.0 + statuses[month] = "default" + + return payments, statuses +``` + +We can use this model of the world to answer even more questions than the last model! + +For example, we can think about things like + +* For the defaulted loans, how many payments did they make before going into default? +* For those who partially repaid, how much was repaid before the 12 months was over? + +Unbeknownst to you, we have just introduced a well-known mathematical concept known as a Markov +chain. 
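+
+Before formalizing that idea, here is a minimal sketch of how the first of those questions could
+be explored with the `simulate_loan_lifetime` function above. The number of simulations and the
+monthly payment below are arbitrary choices made purely for illustration.
+
+```{code-cell} python
+n_simulations = 1_000
+payments_before_default = []
+
+for i in range(n_simulations):
+    payments, statuses = simulate_loan_lifetime(2_500.0)
+
+    # Keep only the loans that ended the year in default
+    if statuses[-1] == "default":
+        # Count the months in which a payment was actually made
+        payments_before_default.append(np.sum(payments > 0))
+
+print("Average number of payments made by loans that defaulted:")
+print(np.mean(payments_before_default))
+```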
+ +A Markov chain is a random process (Note: Random process is a sequence of random variables +observed over time) where the probability of something happening tomorrow only depends on what we +can observe today. + +In our small business loan example, this just means that the small business loan's repayment status +tomorrow only depended on what its repayment status was today. + +Markov chains often show up in economics and statistics, so we decided a simple introduction would +be helpful, but we leave out many details for the interested reader to find. + +A Markov chain is defined by three objects: + +1. A description of the possible states and their associated value. +1. A complete description of the probability of moving from one state to all other states. +1. An initial distribution over the states (often a vector of all zeros except for a single 1 for + some particular state). + +For the example above, we'll define each of these three things in the Python code below. + +```{code-cell} python +# 1. State description +state_values = ["repaying", "delinquency", "default"] + +# 2. Transition probabilities: encoded in a matrix (2d-array) where element [i, j] +# is the probability of moving from state i to state j +P = np.array([[0.85, 0.1, 0.05], [0.25, 0.6, 0.15], [0, 0, 1]]) + +# 3. Initial distribution: assume loans start in repayment +x0 = np.array([1, 0, 0]) +``` + +Now that we have these objects defined, we can use the a `MarkovChain` class from the +[quantecon python library](https://github.com/QuantEcon/QuantEcon.py/) to analyze this model. + +```{code-cell} python +import quantecon as qe + +mc = qe.markov.MarkovChain(P, state_values) +``` + +We can use the `mc` object to do common Markov chain operations. + +The `simulate` method will simulate the Markov chain for a specified number of steps: + +```{code-cell} python +mc.simulate(12, init="repaying") +``` + +Suppose we were to simulate the Markov chain for an infinite number of steps. + +Given the random nature of transitions, we might end up taking different paths at any given moment. + +We can summarize all possible paths over time by keeping track of a distribution. + +Below, we will print out the distribution for the first 10 time steps, +starting from a distribution where the debtor is repaying in the first step. + +```{code-cell} python +x = x0 +for t in range(10): + print(f"At time {t} the distribution is {x}") + x = mc.P.T @ x +``` + +````{admonition} Exercise +:name: dir3-4-2 + +See exercise 2 in the {ref}`exercise list `. +```` + +````{admonition} Exercise +:name: dir3-4-3 + +See exercise 3 in the {ref}`exercise list `. +```` + +### Continuous Distributions + +Recall that a continuous distribution is one where the value can take on an uncountable number of values. + +It differs from a discrete distribution in that the events are not +countable. + +We can use simulation to learn things about continuous distributions as we did with discrete +distributions. + +Let's use simulation to study what is arguably the most commonly encountered +distributions -- the normal distribution. + +The Normal (sometimes referred to as the Gaussian distribution) is bell-shaped and completely +described by the mean and variance of that distribution. + +The mean is often referred to as $\mu$ and the variance as $\sigma^2$. + +Let's take a look at the normal distribution. 
+ +```{code-cell} python +# scipy is an extension of numpy, and the stats +# subpackage has tools for working with various probability distributions +import scipy.stats as st + +x = np.linspace(-5, 5, 100) + +# NOTE: first argument to st.norm is mean, second is standard deviation sigma (not sigma^2) +pdf_x = st.norm(0.0, 1.0).pdf(x) + +fig, ax = plt.subplots() + +ax.set_title(r"Normal Distribution ($\mu = 0, \sigma = 1$)") +ax.plot(x, pdf_x) +``` + +Another common continuous distribution used in economics is the gamma distribution. + +A gamma distribution is defined for all positive numbers and described by both a shape +parameter $k$ and a scale parameter $\theta$. + +Let's see what the distribution looks like for various choices of $k$ and $\theta$. + +```{code-cell} python +def plot_gamma(k, theta, x, ax=None): + if ax is None: + _, ax = plt.subplots() + + # scipy refers to the rate parameter beta as a scale parameter + pdf_x = st.gamma(k, scale=theta).pdf(x) + ax.plot(x, pdf_x, label=f"k = {k} theta = {theta}") + + return ax + +fig, ax = plt.subplots(figsize=(10, 6)) +x = np.linspace(0.1, 20, 130) +plot_gamma(2.0, 1.0, x, ax) +plot_gamma(3.0, 1.0, x, ax) +plot_gamma(3.0, 2.0, x, ax) +plot_gamma(3.0, 0.5, x, ax) +ax.set_ylim((0, 0.6)) +ax.set_xlim((0, 20)) +ax.legend(); +``` + +````{admonition} Exercise +:name: dir3-4-4 + +See exercise 4 in the {ref}`exercise list `. +```` + + +(ex3-4)= +## Exercises + +### Exercise 1 + +Wikipedia and other credible statistics sources tell us that the mean and +variance of the Uniform(0, 1) distribution are (1/2, 1/12) respectively. + +How could we check whether the numpy random numbers approximate these +values? + +({ref}`back to text `) + +### Exercise 2 + +In this exercise, we explore the long-run, or stationary, distribution of the Markov chain. + +The stationary distribution of a Markov chain is the probability distribution that would +result after an infinite number of steps *for any initial distribution*. + +Mathematically, a stationary distribution $x$ is a distribution where $x = P'x$. + +In the code cell below, use the `stationary_distributions` property of `mc` to +determine the stationary distribution of our Markov chain. + +After doing your computation, think about the answer... think about why our transition +probabilities must lead to this outcome. + + +```{code-cell} python +# your code here +``` + +({ref}`back to text `) + +### Exercise 3 + +Let's revisit the unemployment example from the {doc}`linear algebra lecture `. + +We'll repeat necessary details here. + +Consider an economy where in any given year, $\alpha = 5\%$ of workers lose their jobs, and +$\phi = 10\%$ of unemployed workers find jobs. + +Initially, 90% of the 1,000,000 workers are employed. + +Also suppose that the average employed worker earns 10 dollars, while an unemployed worker +earns 1 dollar per period. + +You now have four tasks: + +1. Represent this problem as a Markov chain by defining the three components defined above. +1. Construct an instance of the quantecon MarkovChain by using the objects defined in part 1. +1. Simulate the Markov chain 30 times for 50 time periods, and plot each chain over time (see + helper code below). +1. Determine the average long run payment for a worker in this setting + +```{hint} +Think about the stationary distribution. 
+```
+
+```{code-cell} python
+# define components here
+
+# construct Markov chain
+
+# simulate (see docstring for how to do many repetitions of
+# the simulation in one function call)
+# uncomment the lines below and fill in the blanks
+# sim = XXXXX.simulate(XXXX)
+# fig, ax = plt.subplots(figsize=(10, 8))
+# ax.plot(range(50), sim.T, alpha=0.4)
+
+# Long-run average payment
+```
+
+({ref}`back to text `)
+
+
+### Exercise 4
+
+Assume you have been given the opportunity to choose one of three financial assets.
+
+You will be given the asset for free, will be allowed to hold it indefinitely, and will keep all of its payoffs.
+
+Also assume the assets' payoffs are distributed as follows:
+
+1. Normal with $\mu = 10, \sigma = 5$
+1. Gamma with $k = 5.3, \theta = 2$
+1. Gamma with $k = 5, \theta = 2$
+
+Use `scipy.stats` to answer the following questions:
+
+- Which asset has the highest average returns?
+- Which asset has the highest median returns?
+- Which asset has the lowest coefficient of variation (standard deviation divided by mean)?
+- Which asset would you choose? Why?
+
+```{hint}
+There is not a single right answer here. Be creative
+and express your preferences.
+```
+
+```{code-cell} python
+# your code here
+```
+
+({ref}`back to text `)
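+
+If you have not used `scipy.stats` beyond the plotting above, the sketch below shows the kind of
+methods a frozen distribution object exposes; it is only a pointer to the interface, not an answer
+to the exercise, and the normal asset is used purely as an example.
+
+```{code-cell} python
+import scipy.stats as st
+
+# A frozen distribution for the first asset: Normal with mu = 10, sigma = 5
+normal_asset = st.norm(10, 5)
+
+# Frozen distributions expose mean(), median(), and std() methods
+print(normal_asset.mean(), normal_asset.median(), normal_asset.std())
+```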