diff --git a/preparatory_notebooks/F2_linear_regression.ipynb b/preparatory_notebooks/F2_linear_regression.ipynb new file mode 100644 index 0000000..e0fa5c7 --- /dev/null +++ b/preparatory_notebooks/F2_linear_regression.ipynb @@ -0,0 +1,469 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "czDam0lOe-Xx" + }, + "source": [ + "# Notebook: F2 -- Linear Regression\n", + "\n", + "This notebook complements lecture F2 on linear regression. It highlights the key concepts so that you can refresh your knowledge and gain intuition. The focus will be on\n", + "1. **Generating data** for supervised machine learning problems\n", + "2. **Fitting linear models** to this data\n", + "3. **Evaluating** the fitted models to see how they perform on new data\n", + "\n", + "Please read the instructions and play around with the notebook where it is described.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "okNobqIurFRS" + }, + "source": [ + "---\n", + "\n", + "We start by importing the necessary libraries. These libraries will be used throughout the course. If you are unfamiliar with them or need to refresh your knowledge, we recommend taking a look at the \"Introduction to Python\" material available on Studium." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "C8P18Lggc55a" + }, + "outputs": [], + "source": [ + "# Import necessary libraries\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "from sklearn.model_selection import train_test_split" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Tn5lIINVc9bc" + }, + "source": [ + "---\n", + "\n", + "## 1. Data Generation\n", + "\n", + "The first step for a supervised machine learning problem is to **get a dataset**. Each input $x_i$ comes with a corresponding output or label $y_i$. Here, $i$ denotes the index of a particular sample, and we collect $n$ samples in total. 
Compactly, we denote our dataset as $\\mathcal{T} = \\{x_i, y_i\\}_{i=1}^{n}$.\n", + "\n", + "Now we:\n", + "1. Generate a synthetic dataset $\\mathcal{T}$.\n", + "2. Split the dataset into one train dataset and one test dataset. The train dataset will be used to fit a model to the data, and the test dataset will be used to evaluate our model. \n", + "\n", + "The **goal** of our supervised machine learning method is to find a model that performs well on the unseen test data. So it is important to leave out a part of the data (the test dataset) from the training process to be able to evaluate how well our model will perform on new input datapoints $x$ in the future.\n", + "\n", + "Below, we have some helper functions to generate synthetic data, split the data and then plot it. Skip over and go to the next box." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "HHjHft5Iga81" + }, + "outputs": [], + "source": [ + "# Generate synthetic data\n", + "def generate_synthetic_data():\n", + " np.random.seed(0)\n", + " X = np.random.rand(100, 1) # Feature (independent variable)\n", + " y = np.e * X + np.pi/2 + np.random.normal(0, 0.1, (100,1)) # Target (dependent variable)\n", + "\n", + " # Split the data into training and testing sets\n", + " X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n", + "\n", + "\n", + " # Print table header\n", + " print(\" | #x | #y |\")\n", + " print(\"-----------|-----|-----|\")\n", + " # Print train data row\n", + " print(f\"Train Data | {np.shape(X_train)[0]}{' ' * (3 - len(str(np.shape(X_train)[0])))} | {np.shape(y_train)[0]}{' ' * (3 - len(str(np.shape(y_train)[0])))} |\")\n", + " # Print test data row\n", + " print(f\"Test Data | {np.shape(X_test)[0]}{' ' * (3 - len(str(np.shape(X_test)[0])))} | {np.shape(y_test)[0]}{' ' * (3 - len(str(np.shape(y_test)[0])))} |\")\n", + " \n", + " return X_train, y_train, X_test, y_test\n", + "\n", + "# Plot the train data and test 
data\n", + "def plot_data():\n", + " plt.scatter(X_train, y_train, label='Training Data', alpha=0.5)\n", + " plt.scatter(X_test, y_test, label='Testing Data', alpha=0.5)\n", + " plt.xlabel('X')\n", + " plt.ylabel('y')\n", + " plt.legend()\n", + " plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "W9oSQ59uhW7U" + }, + "source": [ + "Now we can generate our dataset and plot it to get an understanding of what our data looks like. We plot both our train data (in blue) and our test data (in orange). \n", + "\n", + "Task:\n", + "- Run the cell below to visualize the synthetic train and test datasets.\n", + "- Check if the test data are representative of the train data.\n", + "- Is there some relationship between $x$ and $y$?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "AzuGzLTZfsrk" + }, + "outputs": [], + "source": [ + "# generate data\n", + "X_train,y_train, X_test, y_test = generate_synthetic_data()\n", + "\n", + "# plot the data\n", + "plot_data()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HoYjDXdJPlwK" + }, + "source": [ + "## Explore the Family of Linear Models\n", + "From our plot, we notice that there seems to be a linear pattern: $y$ increases linearly with $x$. Hence, a **linear model** of the following form might be suitable:\n", + "\n", + "$$\n", + "y=θ_0+θ_1x + ϵ\n", + "$$\n", + "\n", + "We call $θ_0$ and $θ_1$ the **parameters** of our model, and $ϵ$ is a noise term capturing random errors in our data that our model does not account for.\n", + "\n", + "Finding a **good model**: This amounts to fitting our model to the data, that is, finding good values of $θ_0$ and $θ_1$ so that $y_i\\approxθ_0+θ_1x_i$ holds for the samples in our training dataset $\\mathcal{T}_{train} = \\{x_i, y_i\\}_{i=1}^{m}$. Here, $m$ denotes the number of samples in our train set, i.e. $m=80$. \n", + "\n", + "Below is a helper function that plots the linear models. 
Skip over and go to the next box." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def plot_linear_models(\n", + " X, y, label='Training',\n", + " model1_params=[],\n", + " model2_params=[],\n", + " model3_params=[],\n", + "):\n", + " # model 1\n", + " if not all(element is None for element in model1_params):\n", + " y_model1 = model1_params[0] + X * model1_params[1]\n", + " plt.plot(X, y_model1, 'r', label='Model 1', alpha=0.5)\n", + " \n", + " # model 2\n", + " if not all(element is None for element in model2_params):\n", + " # skipped while both parameters are still None\n", + " y_model2 = model2_params[0] + X * model2_params[1]\n", + " plt.plot(X, y_model2, 'm', label='Model 2', alpha=0.5)\n", + " \n", + " # model 3\n", + " if not all(element is None for element in model3_params):\n", + " y_model3 = model3_params[0] + X * model3_params[1]\n", + " plt.plot(X, y_model3, 'g', label='Model 3', alpha=0.5)\n", + "\n", + " # Plot the training data\n", + " plt.scatter(X, y, label=label+' Data', alpha=0.5)\n", + " plt.xlabel('X')\n", + " plt.ylabel('y')\n", + " plt.legend()\n", + " plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this section we want to find a good linear model. We plot the training data, as well as the linear models which are fully described by $θ_0$ and $θ_1$. If a model fits the data, we can use our parameters along with the inputs of the data (variable $\\mathtt{X\\_train}$) to calculate predicted y-values which are close to the true y-values.\n", + "\n", + "Tasks:\n", + "\n", + "1. Run the code below and visualize model 1 with the given parameters. Does it fit the data?\n", + "2. Try to optimize the parameters of model 2 and model 3 to obtain better fits to the data. Replace the $\\mathtt{None}$ values with what you think are better parameters.\n", + "3. Which set of parameters fits the data best? \n", + "4. What do $θ_0$ and $θ_1$ stand for?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# model 1:\n", + "theta0 = 3\n", + "theta1 = -4\n", + "model1_params = [theta0, theta1]\n", + "\n", + "# model 2\n", + "theta0 = None\n", + "theta1 = None\n", + "model2_params = [theta0, theta1]\n", + "\n", + "\n", + "# model 3:\n", + "theta0 = None\n", + "theta1 = None\n", + "model3_params = [theta0, theta1]\n", + "\n", + "# plot model fits\n", + "plot_linear_models(X_train, y_train, 'Training',\n", + " model1_params, model2_params, model3_params)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "o7GqnLWy7dm0" + }, + "source": [ + "---\n", + "\n", + "## 2. Model evaluation\n", + "\n", + "Above, you fitted the linear model to the data visually. But how can we determine quantitatively which model is better? A common metric is the mean squared error (MSE):\n", + "\n", + "$$\n", + "\\frac{1}{m} \\sum_{i=1}^{m} {(y_i - f_{\\theta}(x_i))}^2\n", + "$$\n", + "\n", + "Here, $y_i$ denotes the true value for each input $x_i$ in the train dataset, and $f_{\\theta}(x_i) = \\theta_0 + \\theta_{1}x_i$ is the output of the model parameterized by our particular choice of $\\theta_0$ and $\\theta_1$.\n", + "\n", + "Below are helper functions which compute the model predictions and the mean squared error of a model on the given data points. Skip over and go to the next box." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def model_prediction(X, model_params):\n", + " return model_params[0] + X * model_params[1]\n", + "\n", + "def MSE(y, pred):\n", + " m = len(y)\n", + " return np.sum((y - pred)**2)/m" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The two code cells below do the following:\n", + "- Compare the MSE of the three models on the **train dataset**.\n", + "- Compare the MSE of the three models on the **test dataset** and plot the models together with the test data.\n", + "\n", + "Tasks:\n", + "1. Do you want to minimize or maximize the MSE?\n", + "2. Given the train MSE, which model would you choose? Does this align with your visual impression from above?\n", + "3. Does the model generalize to unseen test data? Or more specifically: Does the MSE on train data and test data match?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "WNDAkUdSnaPz" + }, + "outputs": [], + "source": [ + "print('MSE on train data')\n", + "\n", + "pred1 = model_prediction(X_train, model1_params)\n", + "mse1 = MSE(y_train, pred1)\n", + "print(f'Model 1: {mse1:.3f}')\n", + "\n", + "pred2 = model_prediction(X_train, model2_params)\n", + "mse2 = MSE(y_train, pred2)\n", + "print(f'Model 2: {mse2:.3f}')\n", + "\n", + "pred3 = model_prediction(X_train, model3_params)\n", + "mse3 = MSE(y_train, pred3)\n", + "print(f'Model 3: {mse3:.3f}')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "HXPpgsGvpJbk" + }, + "outputs": [], + "source": [ + "plot_linear_models(X_test, y_test, 'Test',\n", + " model1_params, model2_params, model3_params)\n", + "\n", + "print('MSE on test data')\n", + "\n", + "pred1 = model_prediction(X_test, model1_params)\n", + "mse1 = MSE(y_test, pred1)\n", + "print(f'Model 1: {mse1:.3f}')\n", + "\n", + "pred2 = model_prediction(X_test, model2_params)\n", + "mse2 = 
MSE(y_test, pred2)\n", + "print(f'Model 2: {mse2:.3f}')\n", + "\n", + "pred3 = model_prediction(X_test, model3_params)\n", + "mse3 = MSE(y_test, pred3)\n", + "print(f'Model 3: {mse3:.3f}')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ET6GtSDwrXGm" + }, + "source": [ + "---\n", + "\n", + "## 3. Finding the \"optimal\" linear model:\n", + "\n", + "Now we can not only fit a model by visual inspection but also select the model with the lowest MSE.\n", + "\n", + "But is there a **systematic way** to select the model parameters $\\theta_0$ and $\\theta_1$? We define the \"best possible linear model\" as the model generating the smallest MSE. Finding the model parameters that minimize the MSE is equivalent to finding the parameters that minimize the squared L2-norm of the residual vector. Thus, to find the best linear model we want to solve the following optimization problem with respect to $\\theta=[\\theta_0, \\theta_1]^\\top$:\n", + "\n", + "$$\n", + "\\hat{\\mathbf{\\theta}} = \\text{arg}\\min_{\\mathbf{\\theta}} \\frac{1}{m} \\sum_{i=1}^{m} {(y_i - f_{\\theta}(x_i))}^2 = \\text{arg}\\min_{\\mathbf{\\theta}} ||{(\\mathbf{y} - \\mathbf{X}\\mathbf{\\theta})}||_2^2\n", + "$$\n", + "\n", + "where\n", + "\\begin{align*}\n", + "\\mathbf{y} &= \\begin{bmatrix}\n", + " y_1 \\\\\n", + " y_2 \\\\\n", + " \\vdots\\\\\n", + " y_m\n", + "\\end{bmatrix}\n", + "& \\mathbf{X} &= \\begin{bmatrix}\n", + " 1 & x_1 \\\\\n", + " 1 & x_2 \\\\\n", + " \\vdots & \\vdots \\\\\n", + " 1 & x_m\n", + "\\end{bmatrix}\n", + "& \\mathbf{θ} &= \\begin{bmatrix}\n", + " θ_0 \\\\\n", + " θ_1 \\\\\n", + "\\end{bmatrix}\n", + "\\end{align*}\n", + "\n", + "We use $\\hat{\\mathbf{\\theta}}$ to denote our estimates of the true parameters. The solution to this optimization problem is the least squares solution $\\hat{\\mathbf{\\theta}}$ to the following (overdetermined) linear system of equations:\n", + "\n", + "$$\n", + "\\mathbf{y}=\\mathbf{X}\\mathbf{\\theta}\n", + 
"$$\n", + "\n", + "\n", + "We say that we find the solution that minimizes the **least squares cost**. So when we say that our model is the **optimal** linear model, we mean that it is optimal in a least squares sense given the data.\n", + "\n", + "When working with linear models, the optimization problem above has a closed-form solution that can be found by solving the normal equations for $\\hat{\\mathbf{\\theta}}$:\n", + "\n", + "$$\n", + "\\mathbf{X}^T\\mathbf{X}\\hat{\\mathbf{\\theta}}=\\mathbf{X}^T\\mathbf{y}\n", + "$$\n", + "\n", + "In the cell below, we solve the normal equations using our train data:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "cVJhyY80Ie1K" + }, + "outputs": [], + "source": [ + "# construct the matrix X\n", + "n = len(X_train) # number of samples in our training data\n", + "X = np.ones((n, 2))\n", + "X[:,1] = X_train[:,0]\n", + "\n", + "# solve the normal equations\n", + "theta_ls = np.linalg.inv(X.T@X)@(X.T@y_train)\n", + "print('Least Squares Solution:')\n", + "print(f'theta_0 = {theta_ls[0][0]}')\n", + "print(f'theta_1 = {theta_ls[1][0]}')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "aFfFDLDZJ9o8" + }, + "source": [ + "Task:\n", + "1. Compare the optimal linear model with the best one that you found\n", + "2. What is the train and test MSE of this optimal model?" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "gu5bHt0zMOa9" + }, + "source": [ + "---\n", + "\n", + "# Take-home message\n", + "\n", + "\n", + "* Collect a dataset and split the dataset into a train set (for training your model) and a test set (for evaluating your model).\n", + "* Define the model $f_{\\theta}(x)$ you want to fit to the data. In the case of linear regression, we let $f_{\\theta}(x)$ be the family of linear models parameterized by $\\theta$.\n", + "* Choose an error metric and set up an optimization problem to find the optimal parameters $\\theta$. 
Here, we choose the MSE.\n", + "* Solve the optimization problem using the train dataset.\n", + "* Evaluate your model on the test dataset.\n", + "\n", + "**Recommendation for further reading:** The material covered in this notebook is well-covered in the beginning of Chapter 3.1 in the course book.\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + } + }, + "nbformat": 4, + "nbformat_minor": 1 +} diff --git a/preparatory_notebooks/F3_logistic_regression.ipynb b/preparatory_notebooks/F3_logistic_regression.ipynb new file mode 100644 index 0000000..413d3ff --- /dev/null +++ b/preparatory_notebooks/F3_logistic_regression.ipynb @@ -0,0 +1,351 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Notebook: F3 -- Logistic Regression\n", + "\n", + "This notebook is complementary to lecture F3 about Logistic Regression in order to highlight the key concepts. The focus will be on\n", + "1. Understanding and visualizing different loss functions: **Misclassification** and **Logistic Loss**\n", + "2. A basic classifier and its **Misclassification Loss** and modifying the parameters to see the effects on the loss.\n", + "3. Finally, the same classifier with its **Logistic Loss** and visualizing the loss surface.\n", + "\n", + "Please read the instructions and play around with the notebook where it is described." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "pip install -q ipywidgets" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# imports necessary libraries\n", + "%matplotlib inline\n", + "\n", + "import scipy.stats\n", + "import numpy as np\n", + "import matplotlib\n", + "import matplotlib.pyplot as plt\n", + "from matplotlib import cm\n", + "from mpl_toolkits.mplot3d import Axes3D\n", + "from ipywidgets import interact, widgets\n", + "\n", + "np.random.seed(42) # fix the random seed" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 1. Loss Functions: Misclassification and Logistic Loss\n", + "\n", + "In this section we will look at two loss functions: the misclassification loss and the logistic loss.\n", + "\n", + "The **misclassification loss** is defined as\n", + "\n", + "$$\n", + "\\ell_{\\text{misclass}}(y, \\hat{y}) = \\begin{cases}\n", + "0 & y\\hat{y} \\ge 0 \\\\\n", + "1 & y\\hat{y} \\lt 0\n", + "\\end{cases}\n", + "$$\n", + "\n", + "where $y$ is the true label and $\\hat{y}$ is the predicted label.\n", + "\n", + "Moreover, the **logistic loss** is defined as\n", + "\n", + "$$\n", + "\\ell_{\\text{logistic}}(y, \\hat{y}) = \\ln(1 + \\exp(-y\\hat{y}))\n", + "$$\n", + "\n", + "where $y$ is the true label and $\\hat{y}$ is the predicted label. \n", + "\n", + "The logistic loss is basically a *continuous* approximation to the misclassification loss, taking into account also **how far away** our predictions are from the real labels.\n", + "\n", + "Below, we have some helper functions for visualizing each of these loss functions. Skip over and go to the next box." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def plot_misclassification_loss():\n", + " yyhat = np.linspace(-5, 5, 100)\n", + " loss = np.where(yyhat < 0, 1, 0)\n", + " plt.figure(figsize=(5, 3))\n", + " plt.plot(yyhat, loss)\n", + " plt.xlabel(r'$y \\cdot \\hat{y}$')\n", + " plt.ylabel('Misclassification Loss')\n", + " plt.show()\n", + "\n", + "def plot_logistic_loss():\n", + " yyhat = np.linspace(-5, 5, 100)\n", + " loss = np.log(1 + np.exp(-yyhat))\n", + " plt.figure(figsize=(5, 3))\n", + " plt.plot(yyhat, loss, c='orange') # try also with semilogy\n", + " plt.xlabel(r'$y \\cdot \\hat{y}$')\n", + " plt.ylabel('Logistic Loss')\n", + " plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Below, we visualize each of the loss functions in a 2D plot. The x-axis is the product of the real value $y$ and the predicted one $\\hat{y}$, and the y-axis is the loss $\\ell(y, \\hat{y})$." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Plot the misclassification loss\n", + "plot_misclassification_loss()\n", + "\n", + "# Plot the logistic loss\n", + "plot_logistic_loss()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 2. A basic classifier + the Misclassification Loss\n", + "\n", + "In this section, we will look at a basic classifier and its misclassification loss. We will also modify the parameters to see the effects on the loss and the decision boundary.\n", + "\n", + "We model our basic classifier as follows:\n", + "\n", + "$$\n", + "\\hat{y} = \\text{sign}(\\theta_1 x_1 + \\theta_2 x_2)\n", + "$$\n", + "\n", + "where $\\theta_1$ and $\\theta_2$ are the weights. 
If we consider $\\theta= [\\theta_1, \\theta_2]$, then we can rewrite the above equation in a more compact form:\n", + "\n", + "$$\n", + "\\hat{y} = \\text{sign}(\\theta^T x)\n", + "$$\n", + "\n", + "Using this model, we can compute the average misclassification loss given a set of parameters $\\theta$. This will be our cost function:\n", + "\n", + "$$\n", + "J_{\\text{misclass}}(\\theta) = \\frac{1}{N} \\sum_{i=1}^N \\ell_{\\text{misclass}}(y_i, \\hat{y}_i ; \\theta)\n", + "$$\n", + "\n", + "where $N$ is the number of samples in the dataset.\n", + "\n", + "Below we generate our dataset and define some helper functions to visualize the decision boundary and calculate the misclassification loss. Skip over and go to the next box." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# generate synthetic dataset\n", + "x = np.random.rand(100, 2) * 4\n", + "y = np.where(x[:, 1] > x[:, 0], 1, -1)\n", + "\n", + "def calculate_misclassification_cost(x, y, theta):\n", + " return np.sum(np.where(y * np.dot(x, theta) < 0, 1, 0)) / len(y)\n", + "\n", + "def plot_decision_boundary(x, y, theta):\n", + " plt.figure(figsize=(5, 3))\n", + " plt.scatter(x[:, 0], x[:, 1], c=y, cmap=cm.coolwarm)\n", + " x1 = np.linspace(0, 4, 100)\n", + " x2 = -theta[0] / theta[1] * x1\n", + " plt.plot(x1 , x2, c='black')\n", + "\n", + " mesh = np.meshgrid(np.linspace(0, 4, 100),\n", + " np.linspace(0, 4, 100))\n", + "\n", + " Z = np.sign(np.dot(np.c_[mesh[0].ravel(), mesh[1].ravel()], theta))\n", + " Z = Z.reshape(mesh[0].shape)\n", + " plt.pcolormesh(mesh[0], mesh[1], Z, cmap=cm.coolwarm, alpha=0.2)\n", + "\n", + " plt.xlim([0, 4])\n", + " plt.ylim([0, 4])\n", + " plt.xlabel('$x_1$')\n", + " plt.ylabel('$x_2$')\n", + " plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Below, we plot the decision boundary for a classifier with the initialized $\\theta$ parameters, alongside our data points which are colored 
according to their true label. Moreover, the misclassification cost is also calculated for the classifier and printed.\n", + "\n", + "Tasks:\n", + "1. Play around with the parameters $\\theta_1$ and $\\theta_2$ to: \n", + " - Observe how the decision boundary changes. \n", + " - Observe how the misclassification cost changes. \n", + "2. Try to minimize the cost by changing $\\theta_1$ and $\\theta_2$ in order to separate the data points as best as possible." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Our initial weight vector, i.e. theta1 and theta2\n", + "theta = [1, -0.3]\n", + "\n", + "# Plot the decision boundary\n", + "plot_decision_boundary(x, y, theta)\n", + "\n", + "# Calculate the misclassification cost\n", + "print(\"The misclassification rate: \", calculate_misclassification_cost(x, y, theta))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 3. The same classifier + the Logistic Loss\n", + "\n", + "In this section, we will look at the same classifier as before, but this time we will use the logistic loss instead of the misclassification loss. We will also visualize the loss surface in addition to the decision boundary.\n", + "\n", + "Remembering our definition of the logistic loss, we can compute the average logistic loss given a set of parameters $\\theta$. This will be our cost function:\n", + "\n", + "$$\n", + "J_{\\text{logistic}}(\\theta) = \\frac{1}{N} \\sum_{i=1}^N \\ell_{\\text{logistic}}(y_i, \\hat{y}_i ; \\theta)\n", + "$$\n", + "\n", + "where $N$ is the number of samples in the dataset.\n", + "\n", + "Below are some helper functions to calculate and visualize the logistic loss function. Skip over and go to the next box." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def calculate_logistic_cost(x, y, theta):\n", + " return np.sum(np.log(1 + np.exp(-y * np.dot(x, theta)))) / len(y)\n", + "\n", + "def plot_logistic_cost_surface(x, y, azimuth, elevation):\n", + " theta1 = np.linspace(-1, 1, 50)\n", + " theta2 = np.linspace(-1, 1, 50)\n", + " theta1, theta2 = np.meshgrid(theta1, theta2)\n", + " loss = np.zeros(theta1.shape)\n", + " for i in range(len(theta1)):\n", + " for j in range(len(theta2)):\n", + " theta = [theta1[i, j], theta2[i, j]]\n", + " loss[i, j] = calculate_logistic_cost(x, y, theta)\n", + " fig = plt.figure(figsize=(8, 6))\n", + " ax = fig.add_subplot(projection='3d')\n", + " ax.plot_surface(theta1, theta2, loss, cmap=cm.viridis)\n", + " ax.set_xlabel(r'$\\theta_1$')\n", + " ax.set_ylabel(r'$\\theta_2$')\n", + " ax.set_zlabel('Logistic Loss')\n", + " ax.view_init(elevation, azimuth)\n", + " ax.tick_params(axis='x', which='major', pad=3)\n", + " ax.tick_params(axis='y', which='major', pad=3)\n", + " ax.set_xticks(np.linspace(-1, 1, 5))\n", + " ax.set_yticks(np.linspace(-1, 1, 5))\n", + " plt.show()\n", + "\n", + "def plot_log_loss_interactive(x, y):\n", + " interact(plot_logistic_cost_surface, x=widgets.fixed(x), y=widgets.fixed(y), \n", + " azimuth=widgets.FloatSlider(min=0, max=360, step=10, value=0), \n", + " elevation=widgets.FloatSlider(min=0, max=90, step=10, value=20))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Below, we again draw the decision boundary of our classifier for the same dataset. However, this time we calculate and print the logistic loss instead of the misclassification loss. Moreover, the loss surface is also plotted, where you can see how the loss changes for different values of $\\theta_1$ and $\\theta_2$, for this specific dataset.\n", + "\n", + "Task:\n", + "1. 
Try again to minimize the cost by changing $\\theta_1$ and $\\theta_2$ in order to separate the data points as best as possible. Note how the best decision boundary does not yield a cost of 0, but rather a small value now. What does this mean for the classifier?\n", + "2. Inspect the loss surface and see how the loss changes for different values of $\\theta_1$ and $\\theta_2$. What parameters yield the lowest loss? Is it the same as the one you found?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Our initial weight vector, i.e. theta1 and theta2\n", + "theta = [1, -0.3]\n", + "\n", + "# Plot the decision boundary\n", + "plot_decision_boundary(x, y, theta)\n", + "print(\"The logistic loss: \", calculate_logistic_cost(x, y, theta))\n", + "\n", + "# Plot the logistic loss\n", + "plot_log_loss_interactive(x, y)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "# Take-home message\n", + "\n", + "* Logistic regression is used for classification problems.\n", + "* Logistic regression is a linear model with a certain decision boundary.\n", + "* We can use the misclassification loss or the logistic loss. The latter gives a better notion of the distance of a sample to the decision boundary.\n", + "\n", + "**Recommendation for further reading:** The material covered in this notebook is well-covered in the beginning of Chapter 3.2 in the course book." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/preparatory_notebooks/F4_lda_qda.ipynb b/preparatory_notebooks/F4_lda_qda.ipynb new file mode 100644 index 0000000..257b3d2 --- /dev/null +++ b/preparatory_notebooks/F4_lda_qda.ipynb @@ -0,0 +1,783 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "wZut3XSZ0ryK" + }, + "source": [ + "# Notebook: F4 -- LDA, QDA\n", + "\n", + "This notebook is complementary to lecture F4 about LDA/QDA in order to highlight its key concepts. The focus will be on\n", + "1. Visualizing **multivariate Gaussian distributions**\n", + "2. **LDA**: Fitting Gaussians with the same covariance to data\n", + "3. **QDA**: Fitting Gaussians with varying covariance to data\n", + "\n", + "Please read the instructions and play around with the notebook where it is described." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "6B-KxYey_GUo", + "outputId": "85d9ce60-cdc4-45fe-b569-b1cd61eaa4f1" + }, + "outputs": [], + "source": [ + "pip install -q ipywidgets" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "NgfYHQxpoPvo" + }, + "outputs": [], + "source": [ + "# import necessary libraries\n", + "%matplotlib inline\n", + "\n", + "import scipy.stats\n", + "import numpy as np\n", + "import matplotlib\n", + "import matplotlib.pyplot as plt\n", + "from matplotlib import cm\n", + "from mpl_toolkits.mplot3d import Axes3D\n", + "from ipywidgets import interact, widgets\n", + "\n", + "np.random.seed(42) # fix the random seed" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "YghGdL_lD0XT" + }, + "source": [ + "---\n", + "\n", + "## 1. Multivariate Gaussians\n", + "\n", + "A multivariate Gaussian distribution in $k$ dimensions is given by the following probability density function (PDF):\n", + "\n", + "$f(x) = \\frac{1}{\\sqrt{(2 \\pi)^k \\det \\Sigma}}\\exp\\left( -\\frac{1}{2} (x - \\mu)^T \\Sigma^{-1} (x - \\mu) \\right)$\n", + "\n", + "with $\\Sigma$ as the covariance matrix and $\\mu$ as the mean vector.\n", + "Here, we will investigate what such PDFs look like and how $\\mu$ and $\\Sigma$ influence them.\n", + "\n", + "Below are helper functions to plot the results. Skip over those and go to the next text box." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "cMXLIHQ9_W04" + }, + "outputs": [], + "source": [ + "def plot_multivariate_gaussian(mu,Sigma,elevation=0, azimuth=90):\n", + " N = 100\n", + " X = np.linspace(-5, 5, N)\n", + " Y = np.linspace(-5, 5, N)\n", + " X, Y = np.meshgrid(X, Y)\n", + "\n", + " pos = np.empty(X.shape + (2,))\n", + " pos[:, :, 0] = X\n", + " pos[:, :, 1] = Y\n", + "\n", + " Z = scipy.stats.multivariate_normal.pdf(pos, mu, Sigma)\n", + "\n", + " fig = plt.figure()\n", + " ax = fig.add_subplot(projection='3d')\n", + " ax.plot_surface(X, Y, Z, rstride=3, cstride=3, linewidth=1, antialiased=True,\n", + " cmap=cm.viridis)\n", + " cset = ax.contourf(X, Y, Z, zdir='z', offset=-0.15, cmap=cm.viridis)\n", + " ax.set_xlabel('x1')\n", + " ax.set_ylabel('x2')\n", + " ax.set_title('Multivariate Gaussians')\n", + "\n", + " ax.set_zlim(-0.15,0.1)\n", + " ax.set_zticks(np.linspace(0,0.1,5))\n", + " ax.view_init(elev=elevation,azim=azimuth)\n", + "\n", + " plt.show()\n", + "\n", + "def plot_multivariate_gaussian_interactive(mu,Sigma, initial_elevation=90, initial_azimuth=0):\n", + " assert mu.shape == (2,), 'mu must be of shape (2,)'\n", + " assert Sigma.shape == (2,2), 'Sigma must be of shape (2,2)'\n", + " assert np.allclose(Sigma, Sigma.T), 'Sigma must be symmetric'\n", + " assert np.all(np.linalg.eigvals(Sigma) > 0), 'Sigma must be positive definite'\n", + "\n", + " interact(plot_multivariate_gaussian,\n", + " mu=widgets.fixed(mu),\n", + " Sigma=widgets.fixed(Sigma),\n", + " elevation=widgets.FloatSlider(min=0, max=90, step=1, value=initial_elevation),\n", + " azimuth=widgets.FloatSlider(min=0, max=360, step=1, value=initial_azimuth)\n", + " )\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "lqzuMaB12gmj" + }, + "source": [ + "Below, we plot a multivariate Gaussian for a specific mean vector $\\mu$ and covariance matrix $\\Sigma$. 
We visualize both the PDF and a contour plot.\n", + "\n", + "Tasks:\n", + "1. Change the elevation/azimuth using the sliders to familiarize yourself with the PDF and contour plot. Note: you must first run all code cells up to this point before the sliders can change the view.\n", + "2. Understand the effect of the mean vector $\\mu$: change $\\texttt{mu}$ and observe the change in the PDF.\n", + "3. Understand the effect of the covariance matrix $\\Sigma$: change $\\texttt{Sigma}$ and observe the change. Some ideas:\n", + "- What happens with a diagonal $\\Sigma$?\n", + "- What happens if $\\Sigma_{11}>\\Sigma_{22}$ or vice versa?\n", + "- What happens when you add/remove off-diagonal values? Note that $\\Sigma$ has to be symmetric, i.e. $\\Sigma_{12}=\\Sigma_{21}$." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 492, + "referenced_widgets": [ + "eb475a8cde92418d97e8604866d0a587", + "0632b737574b46d1968bd198902f0c35", + "1463125f5dd04e80a394067acb5f0462", + "4030a6224a1f4e629c0b8c764f68c80b", + "57980998a5874ce19178b2464c67584d", + "97184c81f3674bcc9d5d9146cfebbe01", + "5b57540dd7284c82a69e1282eda41ce0", + "924cd572d4cc4e9e9751cf4f82ab349c", + "288b525666bf40e7aa241b1a32d249a6", + "083a18e7851d43909c146404ec6f253d" + ] + }, + "id": "TOvRWhgkoUHm", + "outputId": "958de684-2fd3-48b5-c98e-f77cdc7a53c6" + }, + "outputs": [], + "source": [ + "# mean mu is a vector of size 2\n", + "mu = np.array([0., 0.])\n", + "# covariance Sigma is a matrix of size 2x2\n", + "Sigma = np.array([[1. , 0.5],\n", + " [0.5 , 1.]])\n", + "\n", + "# plot the multivariate gaussian\n", + "plot_multivariate_gaussian_interactive(\n", + " mu,Sigma,\n", + " initial_elevation=90, initial_azimuth=90,\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "V7wN-NU-D9VP" + }, + "source": [ + "\n", + "\n", + "---\n", + "\n", + "\n", + "## 2. 
Linear discriminant analysis (LDA)\n", + "\n", + "In linear discriminant analysis we fit $m$ Gaussians with the same covariance matrix to data with $m$ classes. Here we focus on $m=2$, the binary case.\n", + "\n", + "Below is a helper function to plot the results. Skip over it and go to the next text box." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "dbx8Vo4iBEsf" + }, + "outputs": [], + "source": [ + "def plot_lda(Sigma=None):\n", + " N = 150\n", + " limit = 4.5\n", + " X = np.linspace(-limit, limit, N)\n", + " Y = np.linspace(-limit, limit, N)\n", + " X, Y = np.meshgrid(X, Y)\n", + "\n", + " pos = np.empty(X.shape + (2,))\n", + " pos[:, :, 0] = X\n", + " pos[:, :, 1] = Y\n", + "\n", + " pi = np.array([0.5, 0.5])\n", + " mu_true = np.array([[-1., -1.],\n", + " [1., 1.]])\n", + " Sigma_true = np.array([[[ 1. , -0.7], [-0.7, 1.]],\n", + " [[ 1. , 0.8], [ 0.8, 1.3]]])\n", + "\n", + "\n", + " fig = plt.figure()\n", + " ax = fig.add_subplot()\n", + " if Sigma is not None:\n", + " assert Sigma.shape == (2,2), 'Sigma must be of shape (2,2)'\n", + " assert np.allclose(Sigma, Sigma.T), 'Sigma must be symmetric'\n", + " assert np.all(np.linalg.eigvals(Sigma) > 0), 'Sigma must be positive definite'\n", + " Z = pi[0]*scipy.stats.multivariate_normal.pdf(pos, mu_true[0], Sigma) + \\\n", + " pi[1]*scipy.stats.multivariate_normal.pdf(pos, mu_true[1], Sigma)\n", + " cset = ax.contourf(X, Y, Z, cmap=cm.viridis)\n", + "\n", + " s1 = scipy.stats.multivariate_normal.rvs(mu_true[0], Sigma_true[0], int(0.6*N), random_state=42)\n", + " s2 = scipy.stats.multivariate_normal.rvs(mu_true[1], Sigma_true[1], int(0.4*N), random_state=42)\n", + "\n", + " ax.scatter(s1[:, 0], s1[:, 1], marker='x', color='red')\n", + " ax.scatter(s2[:, 0], s2[:, 1], marker='o', facecolors='none', edgecolors='blue')\n", + "\n", + " ax.set_xlabel('x1')\n", + " ax.set_ylabel('x2')\n", + " ax.set_title('Linear discriminant analysis')\n", + " ax.set_aspect('equal')" + ] + }, + 
{ + "cell_type": "markdown", + "metadata": { + "id": "vprBT1r44y-g" + }, + "source": [ + "Below we have some data for two classes: red 'x' and blue 'o' represent each class. Tasks:\n", + "1. View the data by running the code cell below.\n", + "2. Comment out line 5, uncomment line 8, and run the cell again. You will see that two Gaussians with the same covariance specified by $\\texttt{Sigma}$ are fit to the two data clusters.\n", + "3. Try to modify $\\texttt{Sigma}$ such that the Gaussians fit the data as well as possible.\n", + "\n", + "LDA will use the two Gaussians and then add a **linear** decision boundary between the two fitted Gaussians to classify the data." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 507 + }, + "id": "iCqd-5GdMXsz", + "outputId": "79d6f711-b9ed-4280-9b18-df39504e0e42" + }, + "outputs": [], + "source": [ + "Sigma = np.array([[1., 0.],\n", + " [0., 1.]])\n", + "\n", + "# First: plot the data with the following line to visualize it\n", + "plot_lda(Sigma=None)\n", + "\n", + "# Second: comment out the line above and uncomment the one below to fit your Sigma\n", + "# plot_lda(Sigma=Sigma)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "xz7LXT-fS_IS" + }, + "source": [ + "---\n", + "\n", + "## 3. Quadratic discriminant analysis (QDA)\n", + "\n", + "\n", + "Quadratic discriminant analysis extends LDA by giving the Gaussian of each class its own covariance matrix $\\Sigma$.\n", + "\n", + "Below is a helper function to plot the results. Skip over it and go to the next text box." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "jk2-BR2CNA8K" + }, + "outputs": [], + "source": [ + "def plot_qda(mu, Sigma):\n", + " assert mu.shape == (2,), 'mu must be of shape (2,)'\n", + " assert Sigma.shape == (2,2), 'Sigma must be of shape (2,2)'\n", + " assert np.allclose(Sigma, Sigma.T), 'Sigma must be symmetric'\n", + " assert np.all(np.linalg.eigvals(Sigma) > 0), 'Sigma must be positive definite'\n", + "\n", + " N = 150\n", + " limit = 4.5\n", + " X = np.linspace(-limit, limit, N)\n", + " Y = np.linspace(-limit, limit, N)\n", + " X, Y = np.meshgrid(X, Y)\n", + "\n", + " pos = np.empty(X.shape + (2,))\n", + " pos[:, :, 0] = X\n", + " pos[:, :, 1] = Y\n", + "\n", + " pi = np.array([0.5, 0.5])\n", + " mu_true = np.array([[-1., -1.],\n", + " [1., 1.]])\n", + " Sigma_true = np.array([[[ 1. , -0.7], [-0.7, 1.]],\n", + " [[ 1. , 0.8], [ 0.8, 1.3]]])\n", + "\n", + "\n", + " Z = pi[0]*scipy.stats.multivariate_normal.pdf(pos, mu, Sigma) + \\\n", + " pi[1]*scipy.stats.multivariate_normal.pdf(pos, mu_true[1], Sigma_true[1])\n", + "\n", + " fig = plt.figure()\n", + " ax = fig.add_subplot()\n", + " cset = ax.contourf(X, Y, Z, cmap=cm.viridis)\n", + "\n", + " s1 = scipy.stats.multivariate_normal.rvs(mu_true[0], Sigma_true[0], int(0.6*N), random_state=42)\n", + " s2 = scipy.stats.multivariate_normal.rvs(mu_true[1], Sigma_true[1], int(0.4*N), random_state=42)\n", + "\n", + " ax.scatter(s1[:, 0], s1[:, 1], marker='x', color='red')\n", + " ax.scatter(s2[:, 0], s2[:, 1], marker='o', facecolors='none', edgecolors='blue')\n", + "\n", + " ax.set_xlabel('x1')\n", + " ax.set_ylabel('x2')\n", + " ax.set_title('Quadratic discriminant analysis')\n", + " ax.set_aspect('equal')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "eC7EcI4_6EOs" + }, + "source": [ + "We use the same data as for LDA above. Now the Gaussian for the blue class is well fit already. Your task:\n", + "1. 
Change the mean $\\texttt{mu}$ to place the second Gaussian well for the red class.\n", + "2. Change the covariance $\\texttt{Sigma}$ such that the second Gaussian fits the red class well.\n", + "\n", + "QDA will use the two Gaussians and then add a **quadratic** decision boundary between the two fitted Gaussians to classify the data. It is therefore more flexible than LDA." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 507 + }, + "id": "5YxMtvb8UZQw", + "outputId": "bac19a34-5abc-4841-aab3-cbfa40a3aa0c" + }, + "outputs": [], + "source": [ + "mu = np.array([-3., -3.])\n", + "Sigma = np.array([[1., 0.],\n", + " [0., 1.]])\n", + "\n", + "\n", + "plot_qda(mu, Sigma)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "# Take-home message\n", + "\n", + "* Multivariate Gaussians are determined by the mean vector and a symmetric, positive-definite covariance matrix.\n", + "* Understand how the mean vector and covariance matrix influence the shape of the Gaussian.\n", + "* LDA fits one Gaussian to each class, with all classes sharing the same covariance matrix.\n", + "* QDA fits a separate Gaussian, with its own covariance matrix, to each class.\n", + "* Both LDA and QDA are generative models, since we can in principle sample from the fitted Gaussians to obtain new samples.\n", + "\n", + "**Recommendation for further reading:** The material in this notebook is covered at the beginning of Chapter 10.1 in the course book." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + }, + "widgets": { + "application/vnd.jupyter.widget-state+json": { + "0632b737574b46d1968bd198902f0c35": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatSliderModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatSliderModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "FloatSliderView", + "continuous_update": true, + "description": "elevation", + "description_tooltip": null, + "disabled": false, + "layout": "IPY_MODEL_97184c81f3674bcc9d5d9146cfebbe01", + "max": 90, + "min": 0, + "orientation": "horizontal", + "readout": true, + "readout_format": ".2f", + "step": 1, + "style": "IPY_MODEL_5b57540dd7284c82a69e1282eda41ce0", + "value": 90 + } + }, + "083a18e7851d43909c146404ec6f253d": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + 
"grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "1463125f5dd04e80a394067acb5f0462": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatSliderModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatSliderModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "FloatSliderView", + "continuous_update": true, + "description": "azimuth", + "description_tooltip": null, + "disabled": false, + "layout": "IPY_MODEL_924cd572d4cc4e9e9751cf4f82ab349c", + "max": 360, + "min": 0, + "orientation": "horizontal", + "readout": true, + "readout_format": ".2f", + "step": 1, + "style": "IPY_MODEL_288b525666bf40e7aa241b1a32d249a6", + "value": 91 + } + }, + "288b525666bf40e7aa241b1a32d249a6": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "SliderStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "SliderStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "", + "handle_color": null + } + }, + "4030a6224a1f4e629c0b8c764f68c80b": { + 
"model_module": "@jupyter-widgets/output", + "model_module_version": "1.0.0", + "model_name": "OutputModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/output", + "_model_module_version": "1.0.0", + "_model_name": "OutputModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/output", + "_view_module_version": "1.0.0", + "_view_name": "OutputView", + "layout": "IPY_MODEL_083a18e7851d43909c146404ec6f253d", + "msg_id": "", + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": "
" + }, + "metadata": {}, + "output_type": "display_data" + } + ] + } + }, + "57980998a5874ce19178b2464c67584d": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "5b57540dd7284c82a69e1282eda41ce0": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "SliderStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "SliderStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "", + "handle_color": null + } + }, + "924cd572d4cc4e9e9751cf4f82ab349c": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": 
"@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "97184c81f3674bcc9d5d9146cfebbe01": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + 
"max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "eb475a8cde92418d97e8604866d0a587": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "VBoxModel", + "state": { + "_dom_classes": [ + "widget-interact" + ], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "VBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "VBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_0632b737574b46d1968bd198902f0c35", + "IPY_MODEL_1463125f5dd04e80a394067acb5f0462", + "IPY_MODEL_4030a6224a1f4e629c0b8c764f68c80b" + ], + "layout": "IPY_MODEL_57980998a5874ce19178b2464c67584d" + } + } + } + } + }, + "nbformat": 4, + "nbformat_minor": 1 +}