From e022c471cd6d77b5fea52ae13eb898787e77a0e2 Mon Sep 17 00:00:00 2001 From: Bryon Tjanaka <38124174+btjanaka@users.noreply.github.com> Date: Thu, 21 Sep 2023 17:28:19 -0700 Subject: [PATCH] Revert tutorial links to latest (#381) ## Description This is part of the release process in CONTRIBUTING.md. ## TODO ## Questions ## Status - [x] I have read the guidelines in [CONTRIBUTING.md](https://github.com/icaros-usc/pyribs/blob/master/CONTRIBUTING.md) - [x] I have formatted my code using `yapf` - [x] I have tested my code by running `pytest` - [x] I have linted my code with `pylint` - [N/A] I have added a one-line description of my change to the changelog in `HISTORY.md` - [x] This PR is ready to go --- tutorials/arm_repertoire.ipynb | 18 +++++++++--------- tutorials/cma_mae.ipynb | 16 ++++++++-------- tutorials/fooling_mnist.ipynb | 4 ++-- tutorials/lsi_mnist.ipynb | 4 ++-- tutorials/lunar_lander.ipynb | 34 +++++++++++++++++----------------- tutorials/tom_cruise_dqd.ipynb | 12 ++++++------ 6 files changed, 44 insertions(+), 44 deletions(-) diff --git a/tutorials/arm_repertoire.ipynb b/tutorials/arm_repertoire.ipynb index 609c93561..bb4cfcd03 100644 --- a/tutorials/arm_repertoire.ipynb +++ b/tutorials/arm_repertoire.ipynb @@ -6,7 +6,7 @@ "source": [ "# Learning a Repertoire of Robot Arm Configurations\n", "\n", - "_This tutorial is part of the series of pyribs tutorials! See [here](https://docs.pyribs.org/en/stable/tutorials.html) for the list of all tutorials and the order in which they should be read._\n", + "_This tutorial is part of the series of pyribs tutorials! See [here](https://docs.pyribs.org/en/latest/tutorials.html) for the list of all tutorials and the order in which they should be read._\n", "\n", "In robotic manipulation, [inverse kinematics](https://en.wikipedia.org/wiki/Inverse_kinematics) involves figuring out how to configure the joints of an arm such that the end effector is at a certain position. For instance, in order to pick up a cup, a robot must move its gripper to the cup's position, and in order to catch a ball, a robot must move its hand to where it predicts the ball will be.\n", "\n", @@ -127,9 +127,9 @@ "\n", "We will use CMA-ME, with the following pyribs components, to search for arm configurations:\n", "\n", - "- [`CVTArchive`](https://docs.pyribs.org/en/stable/api/ribs.archives.CVTArchive.html): This archive uses a [Centroidal Voronoi Tesselation (CVT)](https://en.wikipedia.org/wiki/Centroidal_Voronoi_tessellation) to divide the measure space into evenly sized cells. It is typically used for high-dimensional measure spaces where the curse of dimensionality prevents one from using [`GridArchive`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html), but it works perfectly fine for lower dimensions too.\n", - "- [`EvolutionStartegyEmitter`](https://docs.pyribs.org/en/stable/api/ribs.emitters.EvolutionStrategyEmitter.html): This emitter is used to create the improvement emitter, which originated in the work of [Fontaine et al., 2020](https://arxiv.org/abs/1912.02400). It uses CMA-ES to search for solutions that improve the archive.\n", - "- [`Scheduler`](https://docs.pyribs.org/en/stable/api/ribs.schedulers.Scheduler.html): Binds all the components together and controls how the archive and emitters interact.\n", + "- [`CVTArchive`](https://docs.pyribs.org/en/latest/api/ribs.archives.CVTArchive.html): This archive uses a [Centroidal Voronoi Tesselation (CVT)](https://en.wikipedia.org/wiki/Centroidal_Voronoi_tessellation) to divide the measure space into evenly sized cells. It is typically used for high-dimensional measure spaces where the curse of dimensionality prevents one from using [`GridArchive`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html), but it works perfectly fine for lower dimensions too.\n", + "- [`EvolutionStartegyEmitter`](https://docs.pyribs.org/en/latest/api/ribs.emitters.EvolutionStrategyEmitter.html): This emitter is used to create the improvement emitter, which originated in the work of [Fontaine et al., 2020](https://arxiv.org/abs/1912.02400). It uses CMA-ES to search for solutions that improve the archive.\n", + "- [`Scheduler`](https://docs.pyribs.org/en/latest/api/ribs.schedulers.Scheduler.html): Binds all the components together and controls how the archive and emitters interact.\n", "\n", "First, let's create the archive. This line may take a minute or two to run because it initializes the archive, and initializing `CVTArchive` involves using [k-means clustering (Lloyd's algorithm)](https://scikit-learn.org/stable/modules/clustering.html#k-means) to generate the CVT." ] @@ -304,7 +304,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Using [`cvt_archive_heatmap`](https://docs.pyribs.org/en/stable/api/ribs.visualize.cvt_archive_heatmap.html) from [`ribs.visualize`](https://docs.pyribs.org/en/stable/api/ribs.visualize.html), we can also plot a heatmap showing all the positions for which we found an arm configuration, as well as the objective value of each configuration (higher is better). As we can see, CMA-ME found a solution for most of the possible positions (the arm can reach anywhere within a circle of radius 12 around its base), but there are still a few gaps." + "Using [`cvt_archive_heatmap`](https://docs.pyribs.org/en/latest/api/ribs.visualize.cvt_archive_heatmap.html) from [`ribs.visualize`](https://docs.pyribs.org/en/latest/api/ribs.visualize.html), we can also plot a heatmap showing all the positions for which we found an arm configuration, as well as the objective value of each configuration (higher is better). As we can see, CMA-ME found a solution for most of the possible positions (the arm can reach anywhere within a circle of radius 12 around its base), but there are still a few gaps." ] }, { @@ -395,7 +395,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We can retrieve `n` random solutions from the archive with [`sample_elites(n)`](https://docs.pyribs.org/en/stable/api/ribs.archives.CVTArchive.html#ribs.archives.CVTArchive.sample_elites). As our objective was to find configurations where the joint angles had small standard deviation from each other, it makes sense that all of our arms look \"smooth\" when visualized, as the joint angles are close to each other." + "We can retrieve `n` random solutions from the archive with [`sample_elites(n)`](https://docs.pyribs.org/en/latest/api/ribs.archives.CVTArchive.html#ribs.archives.CVTArchive.sample_elites). As our objective was to find configurations where the joint angles had small standard deviation from each other, it makes sense that all of our arms look \"smooth\" when visualized, as the joint angles are close to each other." ] }, { @@ -427,9 +427,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We can also retrieve solutions that reached specific positions with [`retrieve_single`](https://docs.pyribs.org/en/stable/api/ribs.archives.CVTArchive.html#ribs.archives.CVTArchive.retrieve_single). This method will check whether the archive cell with the specified measures contains a solution, and if so, it will return that solution. It looks like there is an arm configuration that can reach position (0,0), so let's see what that looks like.\n", + "We can also retrieve solutions that reached specific positions with [`retrieve_single`](https://docs.pyribs.org/en/latest/api/ribs.archives.CVTArchive.html#ribs.archives.CVTArchive.retrieve_single). This method will check whether the archive cell with the specified measures contains a solution, and if so, it will return that solution. It looks like there is an arm configuration that can reach position (0,0), so let's see what that looks like.\n", "\n", - "If you want to query a batch of solutions, you can similarly use [`retrieve`](https://docs.pyribs.org/en/stable/api/ribs.archives.CVTArchive.html#ribs.archives.CVTArchive.retrieve)." + "If you want to query a batch of solutions, you can similarly use [`retrieve`](https://docs.pyribs.org/en/latest/api/ribs.archives.CVTArchive.html#ribs.archives.CVTArchive.retrieve)." ] }, { @@ -472,7 +472,7 @@ "\n", "- Increasing the number of joints (increase `dof` under the Quality Diversity Algorithm Setup section)\n", "- Using different link lengths (set `link_lengths` in the same section as `dof`)\n", - "- Using different types of emitters, like the `EvolutionStrategyEmitter` with random direction ranking or the [`IsoLineEmitter`](https://docs.pyribs.org/en/stable/api/ribs.emitters.IsoLineEmitter.html)" + "- Using different types of emitters, like the `EvolutionStrategyEmitter` with random direction ranking or the [`IsoLineEmitter`](https://docs.pyribs.org/en/latest/api/ribs.emitters.IsoLineEmitter.html)" ] }, { diff --git a/tutorials/cma_mae.ipynb b/tutorials/cma_mae.ipynb index 47433cf49..e11fa3466 100644 --- a/tutorials/cma_mae.ipynb +++ b/tutorials/cma_mae.ipynb @@ -9,9 +9,9 @@ "source": [ "# Upgrading CMA-ME to CMA-MAE on the Sphere Benchmark\n", "\n", - "_This tutorial is part of the series of pyribs tutorials! See [here](https://docs.pyribs.org/en/stable/tutorials.html) for the list of all tutorials and the order in which they should be read._\n", + "_This tutorial is part of the series of pyribs tutorials! See [here](https://docs.pyribs.org/en/latest/tutorials.html) for the list of all tutorials and the order in which they should be read._\n", "\n", - "In the [previous tutorial](https://docs.pyribs.org/en/stable/tutorials/lunar_lander.html), we showed how to implement the CMA-ME algorithm in pyribs to tackle the lunar lander problem. CMA-ME enabled us to search for a diverse collection of high-performing lunar lander agents, including ones which landed like a space shuttle:" + "In the [previous tutorial](https://docs.pyribs.org/en/latest/tutorials/lunar_lander.html), we showed how to implement the CMA-ME algorithm in pyribs to tackle the lunar lander problem. CMA-ME enabled us to search for a diverse collection of high-performing lunar lander agents, including ones which landed like a space shuttle:" ] }, { @@ -231,7 +231,7 @@ "source": [ "### GridArchive\n", "\n", - "First, we will create the [`GridArchive`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html). The archive is 100x100 and stores 100-dimensional solutions. The ranges of the archive are determined by the maximum outputs of the measure function. Since the $clip(\\theta_i)$ function has a maximum output of 5.12, and each component of the output is the sum of $\\frac{n}{2}$ clipped components, the bounds of the measure space are $\\pm 5.12 * \\frac{n}{2}$.\n", + "First, we will create the [`GridArchive`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html). The archive is 100x100 and stores 100-dimensional solutions. The ranges of the archive are determined by the maximum outputs of the measure function. Since the $clip(\\theta_i)$ function has a maximum output of 5.12, and each component of the output is the sum of $\\frac{n}{2}$ clipped components, the bounds of the measure space are $\\pm 5.12 * \\frac{n}{2}$.\n", "\n", "Next, the key difference from CMA-ME is that this archive takes in the `learning_rate` parameter ($\\alpha$) which controls how quickly the threshold in each cell is updated. We set this to 0.01 from the CMA-MAE paper. The second difference is that this archive takes in a `threshold_min` parameter ($min_f$) which is the starting threshold for each cell. This threshold is typically the minimum objective in the problem, hence we choose 0.0.\n", "\n", @@ -292,7 +292,7 @@ "source": [ "### EvolutionStrategyEmitter\n", "\n", - "Next, we set up 15 instances of the [`EvolutionStrategyEmitter`](https://docs.pyribs.org/en/stable/api/ribs.emitters.EvolutionStrategyEmitter.html). The key difference from CMA-ME's `EvolutionStrategyEmitter` is that these emitters use _improvement ranking_ rather than _two-stage improvement ranking_. Two-stage improvement ranking (see the [Lunar Lander tutorial](https://docs.pyribs.org/en/stable/tutorials/lunar_lander.html)) considers the \"status\" and \"value\" of each solution, such that solutions which introduced new cells in the archive come first, followed by solutions which improved existing cells and solutions which were not added. In contrast, improvement ranking considers only the improvement values $\\Delta = f(\\theta') - t_e$, i.e., the difference between the objective values of incumbent solutions and the threshold of their corresponding cells in the archive.\n", + "Next, we set up 15 instances of the [`EvolutionStrategyEmitter`](https://docs.pyribs.org/en/latest/api/ribs.emitters.EvolutionStrategyEmitter.html). The key difference from CMA-ME's `EvolutionStrategyEmitter` is that these emitters use _improvement ranking_ rather than _two-stage improvement ranking_. Two-stage improvement ranking (see the [Lunar Lander tutorial](https://docs.pyribs.org/en/latest/tutorials/lunar_lander.html)) considers the \"status\" and \"value\" of each solution, such that solutions which introduced new cells in the archive come first, followed by solutions which improved existing cells and solutions which were not added. In contrast, improvement ranking considers only the improvement values $\\Delta = f(\\theta') - t_e$, i.e., the difference between the objective values of incumbent solutions and the threshold of their corresponding cells in the archive.\n", "\n", "There are further differences which set these emitters apart from the ones which are used in CMA-ME. First, the CMA-MAE emitters below use the \"mu\" `selection_rule`, which affects which solutions the emitter uses as parents for updating CMA-ES. The default \"filter\" selection (used in CMA-ME) uses all solutions which added a new cell or improved the archive, while \"mu\" selects the top half of all the generated solutions as parents. Second, the CMA-MAE emitters use a \"basic\" restart rule, which restarts the emitter according to the convergence rules of CMA-ES. The default \"no_improvement\" restart rule restarts the emitter when none of its generated solutions are inserted into the archive." ] @@ -330,7 +330,7 @@ "source": [ "### Scheduler\n", "\n", - "Finally, the [`Scheduler`](https://docs.pyribs.org/en/stable/api/ribs.schedulers.Scheduler.html) controls how the emitters interact with the archive and result archive. On every iteration, the scheduler calls the emitters to generate solutions. After the user evaluates these generated solutions, the scheduler inserts the solutions into both the archive and result archive. It then passes feedback from the archive (but not the result archive) to the emitters. In this manner, the emitters only interact with the archive, but the result archive stores all the best solutions found by CMA-MAE." + "Finally, the [`Scheduler`](https://docs.pyribs.org/en/latest/api/ribs.schedulers.Scheduler.html) controls how the emitters interact with the archive and result archive. On every iteration, the scheduler calls the emitters to generate solutions. After the user evaluates these generated solutions, the scheduler inserts the solutions into both the archive and result archive. It then passes feedback from the archive (but not the result archive) to the emitters. In this manner, the emitters only interact with the archive, but the result archive stores all the best solutions found by CMA-MAE." ] }, { @@ -388,7 +388,7 @@ "Supporting this behavior introduces two additional considerations:\n", "\n", "1. When we perform batch addition via the `add()` method on an archive with CMA-MAE, we apply the CMA-MAE batch addition rule in Appendix H of [Fontaine 2022](https://arxiv.org/abs/2205.10752). However, when using CMA-ME settings, we update the threshold by taking the maximum objective value to maintain consistency with the original algorithm.\n", - "2. When using the CMA-ME settings, using a regular `ImprovementRanker` (i.e. `ranker=\"imp\"`) is not advisable, as CMA-ME is only designed to work with two-stage improvement ranking (it is okay to use non-two stage versions of [other rankers](https://docs.pyribs.org/en/stable/api/ribs.emitters.rankers.html), however)." + "2. When using the CMA-ME settings, using a regular `ImprovementRanker` (i.e. `ranker=\"imp\"`) is not advisable, as CMA-ME is only designed to work with two-stage improvement ranking (it is okay to use non-two stage versions of [other rankers](https://docs.pyribs.org/en/latest/api/ribs.emitters.rankers.html), however)." ] }, { @@ -467,7 +467,7 @@ "source": [ "## Visualization\n", "\n", - "Now we visualize the result archive with [`grid_archive_heatmap`](https://docs.pyribs.org/en/stable/api/ribs.visualize.grid_archive_heatmap.html) from the [`ribs.visualize`](https://docs.pyribs.org/en/stable/api/ribs.visualize.html) module (as mentioned previously, the archive used in the main algorithm does not always hold the best solutions, while the result archive does)." + "Now we visualize the result archive with [`grid_archive_heatmap`](https://docs.pyribs.org/en/latest/api/ribs.visualize.grid_archive_heatmap.html) from the [`ribs.visualize`](https://docs.pyribs.org/en/latest/api/ribs.visualize.html) module (as mentioned previously, the archive used in the main algorithm does not always hold the best solutions, while the result archive does)." ] }, { @@ -829,7 +829,7 @@ "\n", "In this tutorial, we introduced CMA-MAE. We showed how it is implemented in pyribs, and we compared this implementation to that of CMA-ME. Then, we demonstrated CMA-MAE on the sphere linear projection benchmark, and we concluded by seeing how different settings of the learning rate $\\alpha$ affected is performance. Overall, CMA-MAE has strong theoretical and empirical properties which enable it to excel at QD problems like this one. For these reasons, we recommend CMA-MAE over CMA-ME for most QD optimization problems.\n", "\n", - "_For implementations of a wide variety of algorithms on the sphere benchmark, take a look at our [sphere example](https://docs.pyribs.org/en/stable/examples/sphere.html)._" + "_For implementations of a wide variety of algorithms on the sphere benchmark, take a look at our [sphere example](https://docs.pyribs.org/en/latest/examples/sphere.html)._" ] }, { diff --git a/tutorials/fooling_mnist.ipynb b/tutorials/fooling_mnist.ipynb index b53dc8aed..e22d7201d 100644 --- a/tutorials/fooling_mnist.ipynb +++ b/tutorials/fooling_mnist.ipynb @@ -6,7 +6,7 @@ "source": [ "# Generating Images to Fool an MNIST Classifier\n", "\n", - "_This tutorial is part of the series of pyribs tutorials! See [here](https://docs.pyribs.org/en/stable/tutorials.html) for the list of all tutorials and the order in which they should be read._\n", + "_This tutorial is part of the series of pyribs tutorials! See [here](https://docs.pyribs.org/en/latest/tutorials.html) for the list of all tutorials and the order in which they should be read._\n", "\n", "Despite their high performance on classification tasks such as MNIST, neural networks like the [LeNet-5](https://en.wikipedia.org/wiki/LeNet) have a weakness: they are easy to fool. Namely, given images like the ones below, a classifier may confidently believe that it is seeing certain digits, even though the images look like random noise to humans. Naturally, this phenomenon raises some concerns, especially when the network in question is used in a safety-critical system like a self-driving car. Given such unrecognizable input, one would hope that the network at least has low confidence in its prediction.\n", "\n", @@ -165,7 +165,7 @@ "\n", "Our classifier outputs a log probability vector with its belief that it is seeing each digit. Thus, our objective for each digit is to maximize the probability that the classifier assigns to the image associated with it. For instance, for digit 5, we want to generate an image that makes the classifier believe with high probability that it is seeing a 5.\n", "\n", - "In pyribs, we implement MAP-Elites with a [`GridArchive`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html) and a [`GaussianEmitter`](https://docs.pyribs.org/en/stable/api/ribs.emitters.GaussianEmitter.html). Below, we start by constructing the `GridArchive`. The archive has 10 cells and a range of (0,10). Since `GridArchive` was originally designed for continuous spaces, it does not directly support discrete spaces, but by using these settings, we have a cell for each digit from 0 to 9." + "In pyribs, we implement MAP-Elites with a [`GridArchive`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html) and a [`GaussianEmitter`](https://docs.pyribs.org/en/latest/api/ribs.emitters.GaussianEmitter.html). Below, we start by constructing the `GridArchive`. The archive has 10 cells and a range of (0,10). Since `GridArchive` was originally designed for continuous spaces, it does not directly support discrete spaces, but by using these settings, we have a cell for each digit from 0 to 9." ] }, { diff --git a/tutorials/lsi_mnist.ipynb b/tutorials/lsi_mnist.ipynb index f166fe944..a645e404e 100644 --- a/tutorials/lsi_mnist.ipynb +++ b/tutorials/lsi_mnist.ipynb @@ -6,7 +6,7 @@ "source": [ "# Illuminating the Latent Space of an MNIST GAN\n", "\n", - "_This tutorial is part of the series of pyribs tutorials! See [here](https://docs.pyribs.org/en/stable/tutorials.html) for the list of all tutorials and the order in which they should be read._\n", + "_This tutorial is part of the series of pyribs tutorials! See [here](https://docs.pyribs.org/en/latest/tutorials.html) for the list of all tutorials and the order in which they should be read._\n", "\n", "One of the most popular applications of [Generative Adversarial Networks](https://en.wikipedia.org/wiki/Generative_adversarial_network) is generating fake images. In particular, websites like [this person does not exist](https://thispersondoesnotexist.com) serve a GAN that generates fake images of people ([this x does not exist](https://thisxdoesnotexist.com) provides a comprehensive list of such websites). Such websites are entertaining, especially when one is asked to figure out [which face is real](https://www.whichfaceisreal.com/index.php).\n", "\n", @@ -223,7 +223,7 @@ "source": [ "## LSI with CMA-ME on MNIST GAN\n", "\n", - "After loading the GAN and the classifier, we can begin exploring the latent space of the GAN with the pyribs implementation of CMA-ME. Thus, we import and initialize the [`GridArchive`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html), [`EvolutionStrategyEmitter`](https://docs.pyribs.org/en/stable/api/ribs.emitters.EvolutionStrategyEmitter.html), and [`Scheduler`](https://docs.pyribs.org/en/stable/api/ribs.schedulers.Scheduler.html) from pyribs.\n", + "After loading the GAN and the classifier, we can begin exploring the latent space of the GAN with the pyribs implementation of CMA-ME. Thus, we import and initialize the [`GridArchive`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html), [`EvolutionStrategyEmitter`](https://docs.pyribs.org/en/latest/api/ribs.emitters.EvolutionStrategyEmitter.html), and [`Scheduler`](https://docs.pyribs.org/en/latest/api/ribs.schedulers.Scheduler.html) from pyribs.\n", "\n", "For the `GridArchive`, we choose a 2D measure space with \"boldness\" and \"lightness\" as the measures. We approximate \"boldness\" of a digit by counting the number of white pixels in the image, and we approximate \"lightness\" by averaging the values of the white pixels in the image. We define a \"white\" pixel as a pixel with value at least 0.5 (pixels are bounded to the range $[0,1]$). Since there are 784 pixels in an image, boldness is bounded to the range $[0, 784]$. Meanwhile, lightness is bounded to the range $[0.5, 1]$, as that is the range of a white pixel." ] diff --git a/tutorials/lunar_lander.ipynb b/tutorials/lunar_lander.ipynb index 5ebc65483..f816d2292 100644 --- a/tutorials/lunar_lander.ipynb +++ b/tutorials/lunar_lander.ipynb @@ -8,7 +8,7 @@ "source": [ "# Using CMA-ME to Land a Lunar Lander Like a Space Shuttle\n", "\n", - "_This tutorial is part of the series of pyribs tutorials! See [here](https://docs.pyribs.org/en/stable/tutorials.html) for the list of all tutorials and the order in which they should be read._\n", + "_This tutorial is part of the series of pyribs tutorials! See [here](https://docs.pyribs.org/en/latest/tutorials.html) for the list of all tutorials and the order in which they should be read._\n", "\n", "In the [Lunar Lander](https://gymnasium.farama.org/environments/box2d/lunar_lander/) environment, an agent controls a spaceship to touch down gently within a goal zone near the bottom of the screen. Typically, agents in Lunar Lander take a direct approach, hovering straight down:" ] @@ -129,7 +129,7 @@ "\n", "Though these trajectories look different, they all achieve good performance (200+), leading to an important insight: there are characteristics of landing a lunar lander that are not necessarily important for performance, but nonetheless determine the behavior of the lander. In quality diversity (QD) terms, we call these measures. In this tutorial, we will search for policies that yield different trajectories using the pyribs implementation of the QD algorithm [CMA-ME](https://arxiv.org/abs/1912.02400).\n", "\n", - "**_By the way: Recent work introduced [CMA-MAE](https://arxiv.org/abs/2205.10752), an algorithm which builds on CMA-ME and achieves higher performance in a variety of domains. Once you finish this tutorial, be sure to check out the [next tutorial](https://docs.pyribs.org/en/stable/tutorials/cma_mae.html) to learn about CMA-MAE._**" + "**_By the way: Recent work introduced [CMA-MAE](https://arxiv.org/abs/2205.10752), an algorithm which builds on CMA-ME and achieves higher performance in a variety of domains. Once you finish this tutorial, be sure to check out the [next tutorial](https://docs.pyribs.org/en/latest/tutorials/cma_mae.html) to learn about CMA-MAE._**" ] }, { @@ -140,7 +140,7 @@ "source": [ "## Setup\n", "\n", - "First, let's install pyribs and Gymnasium. [Gymnasium](https://gymnasium.farama.org) is the successor to [OpenAI Gym](https://www.gymlibrary.dev), which was deprecated in late 2022. We use the visualize extra of pyribs (`ribs[visualize]` instead of just `ribs`) so that we obtain access to the [`ribs.visualize`](https://docs.pyribs.org/en/stable/api/ribs.visualize.html) module." + "First, let's install pyribs and Gymnasium. [Gymnasium](https://gymnasium.farama.org) is the successor to [OpenAI Gym](https://www.gymlibrary.dev), which was deprecated in late 2022. We use the visualize extra of pyribs (`ribs[visualize]` instead of just `ribs`) so that we obtain access to the [`ribs.visualize`](https://docs.pyribs.org/en/latest/api/ribs.visualize.html) module." ] }, { @@ -346,7 +346,7 @@ "source": [ "### GridArchive\n", "\n", - "First, the [`GridArchive`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html) stores solutions (i.e. models for our policy) in a rectangular grid. Each dimension of the `GridArchive` corresponds to a dimension in measure space that is segmented into equally sized cells. As we have two measure functions for our lunar lander, we have two dimensions in the `GridArchive`. The first dimension is the impact $x$-position, which ranges from -1 to 1, and the second is the impact $y$-velocity, which ranges from -3 (smashing into the ground) to 0 (gently touching down). We divide both dimensions into 50 cells.\n", + "First, the [`GridArchive`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html) stores solutions (i.e. models for our policy) in a rectangular grid. Each dimension of the `GridArchive` corresponds to a dimension in measure space that is segmented into equally sized cells. As we have two measure functions for our lunar lander, we have two dimensions in the `GridArchive`. The first dimension is the impact $x$-position, which ranges from -1 to 1, and the second is the impact $y$-velocity, which ranges from -3 (smashing into the ground) to 0 (gently touching down). We divide both dimensions into 50 cells.\n", "\n", "We additionally specify the dimensionality of solutions which will be stored in the archive. While each model is a 2D matrix, pyribs archives only allow 1D arrays for efficiency; hence, we create an `initial_model` below and retrieve the size of its flattened, 1D form with `initial_model.size`." ] @@ -393,7 +393,7 @@ "source": [ "### EvolutionStrategyEmitter\n", "\n", - "Next, the [`EvolutionStrategyEmitter`](https://docs.pyribs.org/en/stable/api/ribs.emitters.EvolutionStrategyEmitter.html) with two-stage improvement ranking (\"2imp\") uses CMA-ES to search for policies that add new entries to the archive or improve existing ones. Since we do not have any prior knowledge of what the model will be, we set the initial model to be the zero vector, and we set the initial step size for CMA-ES to be 1.0, so that initial solutions are sampled from a standard isotropic Gaussian. Furthermore, we use 5 emitters so that the algorithm simultaneously searches several areas of the measure space." + "Next, the [`EvolutionStrategyEmitter`](https://docs.pyribs.org/en/latest/api/ribs.emitters.EvolutionStrategyEmitter.html) with two-stage improvement ranking (\"2imp\") uses CMA-ES to search for policies that add new entries to the archive or improve existing ones. Since we do not have any prior knowledge of what the model will be, we set the initial model to be the zero vector, and we set the initial step size for CMA-ES to be 1.0, so that initial solutions are sampled from a standard isotropic Gaussian. Furthermore, we use 5 emitters so that the algorithm simultaneously searches several areas of the measure space." ] }, { @@ -432,9 +432,9 @@ "> 1. `status`: Whether the solution creates a new cell in the archive, improves an existing cell, or is not inserted.\n", "> 2. `value`: Consider the objective $f$ of the solution and the objective $f'$ of the solution currently in the cell where the solution will be inserted. When the solution creates a new cell, $f'$ is undefined because the cell was previously empty, so the value is defined as $f$. Otherwise, when the solution improves an existing cell or is not inserted at all, the value is $f - f'$.\n", ">\n", - "> During ranking, this two-stage improvement ranker will first sort by `status`, prioritizing new solutions, followed by solutions which improve existing cells, followed by solutions which are not inserted. Within each group, solutions are further ranked by their corresponding `value`. See the archive [`add`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.add) method for more information on statuses and values.\n", + "> During ranking, this two-stage improvement ranker will first sort by `status`, prioritizing new solutions, followed by solutions which improve existing cells, followed by solutions which are not inserted. Within each group, solutions are further ranked by their corresponding `value`. See the archive [`add`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.add) method for more information on statuses and values.\n", ">\n", - "> Additional rankers are available in the [ribs.emitters.rankers](https://docs.pyribs.org/en/stable/api/ribs.emitters.rankers.html) module. These rankers include those corresponding to different emitters described in the CMA-ME paper." + "> Additional rankers are available in the [ribs.emitters.rankers](https://docs.pyribs.org/en/latest/api/ribs.emitters.rankers.html) module. These rankers include those corresponding to different emitters described in the CMA-ME paper." ] }, { @@ -445,7 +445,7 @@ "source": [ "### Scheduler\n", "\n", - "Finally, the [`Scheduler`](https://docs.pyribs.org/en/stable/api/ribs.schedulers.Scheduler.html) controls how the emitters interact with the archive. On every iteration, the scheduler calls the emitters to generate solutions. After the user evaluates these generated solutions, the scheduler inserts the solutions into the archive and passes the feedback to the emitters (this feedback consists of the `status` and `value` for each solution that we described in the note above)." + "Finally, the [`Scheduler`](https://docs.pyribs.org/en/latest/api/ribs.schedulers.Scheduler.html) controls how the emitters interact with the archive. On every iteration, the scheduler calls the emitters to generate solutions. After the user evaluates these generated solutions, the scheduler inserts the solutions into the archive and passes the feedback to the emitters (this feedback consists of the `status` and `value` for each solution that we described in the note above)." ] }, { @@ -469,7 +469,7 @@ "source": [ "## QD Search\n", "\n", - "With the pyribs components defined, we start searching with CMA-ME. Since we use 5 emitters each with a batch size of 30 and we run 300 iterations, we run 5 x 30 x 300 = 45,000 lunar lander simulations. We also keep track of some logging info via `archive.stats`, which is an [`ArchiveStats`](https://docs.pyribs.org/en/stable/api/ribs.archives.ArchiveStats.html) object.\n", + "With the pyribs components defined, we start searching with CMA-ME. Since we use 5 emitters each with a batch size of 30 and we run 300 iterations, we run 5 x 30 x 300 = 45,000 lunar lander simulations. We also keep track of some logging info via `archive.stats`, which is an [`ArchiveStats`](https://docs.pyribs.org/en/latest/api/ribs.archives.ArchiveStats.html) object.\n", "\n", "Since it takes a relatively long time to evaluate a lunar lander solution, we parallelize the evaluation of multiple solutions with [Dask](https://distributed.dask.org/en/stable/quickstart.html). On Colab, with one worker (i.e., one CPU), the loop should take **~30 minutes** to run. With two workers, it should take **~20 minutes** to run. Feel free to increase the number of workers based on the number of CPUs your system has available to speed up the loop further." ] @@ -630,7 +630,7 @@ "source": [ "## Visualizing the Archive\n", "\n", - "Using [`grid_archive_heatmap`](https://docs.pyribs.org/en/stable/api/ribs.visualize.grid_archive_heatmap.html) from the [`ribs.visualize`](https://docs.pyribs.org/en/stable/api/ribs.visualize.html) module, we can view a heatmap of the archive. The heatmap shows the measures for which CMA-ME found a solution. The color of each cell shows the objective value of the solution." + "Using [`grid_archive_heatmap`](https://docs.pyribs.org/en/latest/api/ribs.visualize.grid_archive_heatmap.html) from the [`ribs.visualize`](https://docs.pyribs.org/en/latest/api/ribs.visualize.html) module, we can view a heatmap of the archive. The heatmap shows the measures for which CMA-ME found a solution. The color of each cell shows the objective value of the solution." ] }, { @@ -747,7 +747,7 @@ }, "source": [ "\n", - "We can retrieve policies with measures that are close to a query with the [`retrieve_single`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.retrieve_single) method. This method will look up the cell corresponding to the queried measures. Then, the method will check if there is an elite in that cell, and return the elite if it exists (the method does not check neighboring cells for elites). The returned elite may not have the exact measures requested because the elite only has to be in the same cell as the queried measures.\n", + "We can retrieve policies with measures that are close to a query with the [`retrieve_single`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.retrieve_single) method. This method will look up the cell corresponding to the queried measures. Then, the method will check if there is an elite in that cell, and return the elite if it exists (the method does not check neighboring cells for elites). The returned elite may not have the exact measures requested because the elite only has to be in the same cell as the queried measures.\n", "\n", "Below, we first retrieve a policy that impacted the ground on the left (approximately -0.4) with low velocity (approximately -0.10) by querying for `[-0.4, -0.10]`." ] @@ -808,7 +808,7 @@ "\n", "**Note: Batch and Single Methods**\n", "\n", - "> `retrieve_single` returns an [Elite](https://docs.pyribs.org/en/stable/api/ribs.archives.Elite.html) object given a single `measures` array. Meanwhile, the [`retrieve`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.retrieve) method takes in a _batch_ of measures (named `measures_batch`) and returns an [EliteBatch](https://docs.pyribs.org/en/stable/api/ribs.archives.EliteBatch.html) object. Several archive methods in pyribs follow a similar pattern of having a batch and single version, e.g., [`add`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.add) and [`add_single`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.add_single)." + "> `retrieve_single` returns an [Elite](https://docs.pyribs.org/en/latest/api/ribs.archives.Elite.html) object given a single `measures` array. Meanwhile, the [`retrieve`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.retrieve) method takes in a _batch_ of measures (named `measures_batch`) and returns an [EliteBatch](https://docs.pyribs.org/en/latest/api/ribs.archives.EliteBatch.html) object. Several archive methods in pyribs follow a similar pattern of having a batch and single version, e.g., [`add`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.add) and [`add_single`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.add_single)." ] }, { @@ -914,7 +914,7 @@ "id": "MZLDoutPFKhu" }, "source": [ - "As the archive has ~2500 solutions, we cannot view them all, but we can filter for high-performing solutions. We first retrieve the archive's elites with the [`as_pandas`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.as_pandas) method. Then, we choose solutions that scored above 200 because 200 is the [threshold for the problem to be considered solved](https://gymnasium.farama.org/environments/box2d/lunar_lander/). Note that many high-performing solutions do not land on the landing pad." + "As the archive has ~2500 solutions, we cannot view them all, but we can filter for high-performing solutions. We first retrieve the archive's elites with the [`as_pandas`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.as_pandas) method. Then, we choose solutions that scored above 200 because 200 is the [threshold for the problem to be considered solved](https://gymnasium.farama.org/environments/box2d/lunar_lander/). Note that many high-performing solutions do not land on the landing pad." ] }, { @@ -935,7 +935,7 @@ "id": "0_RYE1rTFKhu" }, "source": [ - "Below we visualize several of these high-performing solutions. The `iterelites` method is available because `as_pandas` returns an [`ArchiveDataFrame`](https://docs.pyribs.org/en/stable/api/ribs.archives.ArchiveDataFrame.html), a subclass of the Pandas DataFrame specialized for pyribs. `iterelites` iterates over the entries in the DataFrame and returns them as [`Elite`](https://docs.pyribs.org/en/stable/api/ribs.archives.Elite.html) objects." + "Below we visualize several of these high-performing solutions. The `iterelites` method is available because `as_pandas` returns an [`ArchiveDataFrame`](https://docs.pyribs.org/en/latest/api/ribs.archives.ArchiveDataFrame.html), a subclass of the Pandas DataFrame specialized for pyribs. `iterelites` iterates over the entries in the DataFrame and returns them as [`Elite`](https://docs.pyribs.org/en/latest/api/ribs.archives.Elite.html) objects." ] }, { @@ -1034,7 +1034,7 @@ "id": "P3cQJ2ctOBG5" }, "source": [ - "And finally, the [`best_elite`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.best_elite) property is the [`Elite`](https://docs.pyribs.org/en/stable/api/ribs.archives.Elite.html) which has the highest performance in the archive." + "And finally, the [`best_elite`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html#ribs.archives.GridArchive.best_elite) property is the [`Elite`](https://docs.pyribs.org/en/latest/api/ribs.archives.Elite.html) which has the highest performance in the archive." ] }, { @@ -1098,9 +1098,9 @@ "- Try increasing the number of iterations. Will CMA-ME create a better archive if it gets to evaluate more solutions?\n", "- Use other gym environments. What measures could you use in an environment like `BipedalWalker-v2`?\n", "\n", - "Finally, to learn about an algorithm which performs even better on QD problems, check out the [CMA-MAE tutorial](https://docs.pyribs.org/en/stable/tutorials/cma_mae.html).\n", + "Finally, to learn about an algorithm which performs even better on QD problems, check out the [CMA-MAE tutorial](https://docs.pyribs.org/en/latest/tutorials/cma_mae.html).\n", "\n", - "And for a version of this tutorial that uses [Dask](https://dask.org) to parallelize evaluations and offers features like a command-line interface, refer to the [Lunar Lander example](https://docs.pyribs.org/en/stable/examples/lunar_lander.html)." + "And for a version of this tutorial that uses [Dask](https://dask.org) to parallelize evaluations and offers features like a command-line interface, refer to the [Lunar Lander example](https://docs.pyribs.org/en/latest/examples/lunar_lander.html)." ] }, { diff --git a/tutorials/tom_cruise_dqd.ipynb b/tutorials/tom_cruise_dqd.ipynb index c2f427948..36d37ceb4 100644 --- a/tutorials/tom_cruise_dqd.ipynb +++ b/tutorials/tom_cruise_dqd.ipynb @@ -10,7 +10,7 @@ "\n", "_Lights! Camera! Action!_ 📸\n", "\n", - "_This tutorial is part of the series of pyribs tutorials! See [here](https://docs.pyribs.org/en/stable/tutorials.html) for the list of all tutorials and the order in which they should be read._\n", + "_This tutorial is part of the series of pyribs tutorials! See [here](https://docs.pyribs.org/en/latest/tutorials.html) for the list of all tutorials and the order in which they should be read._\n", "\n", "In recent years, there has been an explosion in AI's ability to understand and manipulate images and text. In particular, with the rise of [Generative Adversarial Networks (GANs)](https://en.wikipedia.org/wiki/Generative_adversarial_network) like [StyleGAN](https://en.wikipedia.org/wiki/StyleGAN), AI can now generate fake yet highly realistic images. Incredibly, even something as complex as a human face can now be reproduced with ease, as demonstrated by websites like [this person does not exist](https://thispersondoesnotexist.com) and [which face is real](https://www.whichfaceisreal.com/index.php).\n", "\n", @@ -533,7 +533,7 @@ "source": [ "## CMA-MAEGA with pyribs\n", "\n", - "To search for images of Tom Cruise, we will use Covariance Matrix Adaptation MAP-Annealing via a Gradient Arborescence (CMA-MAEGA) (please refer to [Fontaine 2021](https://arxiv.org/abs/2106.03894) and [Fontaine 2022](https://arxiv.org/abs/2205.10752) if you are not yet familiar with CMA-MAEGA). Similar to CMA-ME (see the [lunar lander tutorial](https://docs.pyribs.org/en/stable/tutorials/lunar_lander.html)) and CMA-MAE (see the [CMA-MAE tutorial](https://docs.pyribs.org/en/stable/tutorials/cma_mae.html)), CMA-MAEGA requires a `GridArchive` and `Scheduler`, but while CMA-ME and CMA-MAE use an `EvolutionStrategyEmitter`, CMA-MAEGA uses a `GradientArborescenceEmitter`." + "To search for images of Tom Cruise, we will use Covariance Matrix Adaptation MAP-Annealing via a Gradient Arborescence (CMA-MAEGA) (please refer to [Fontaine 2021](https://arxiv.org/abs/2106.03894) and [Fontaine 2022](https://arxiv.org/abs/2205.10752) if you are not yet familiar with CMA-MAEGA). Similar to CMA-ME (see the [lunar lander tutorial](https://docs.pyribs.org/en/latest/tutorials/lunar_lander.html)) and CMA-MAE (see the [CMA-MAE tutorial](https://docs.pyribs.org/en/latest/tutorials/cma_mae.html)), CMA-MAEGA requires a `GridArchive` and `Scheduler`, but while CMA-ME and CMA-MAE use an `EvolutionStrategyEmitter`, CMA-MAEGA uses a `GradientArborescenceEmitter`." ] }, { @@ -544,7 +544,7 @@ "source": [ "### GridArchive\n", "\n", - "First, we create the [`GridArchive`](https://docs.pyribs.org/en/stable/api/ribs.archives.GridArchive.html). The archive is 200x200 and stores StyleGAN latent vectors which are 7,168-dimensional.\n", + "First, we create the [`GridArchive`](https://docs.pyribs.org/en/latest/api/ribs.archives.GridArchive.html). The archive is 200x200 and stores StyleGAN latent vectors which are 7,168-dimensional.\n", "\n", "The `ranges` (i.e. bounds of the measure space) effectively control the solution space from which our pipeline generates images. Widening the ranges will result in generating more exotic images, while narrowing the ranges will focus the search into a tight window.\n", "\n", @@ -658,7 +658,7 @@ "source": [ "### Scheduler\n", "\n", - "Finally, the [`Scheduler`](https://docs.pyribs.org/en/stable/api/ribs.schedulers.Scheduler.html) controls how the emitters interact with the archive and result archive. On every iteration, the scheduler calls the emitters to generate solutions. After the user evaluates these generated solutions, the scheduler inserts the solutions into both the archive and result archive. It then passes feedback from the archive (but not the result archive) to the emitters. In this manner, the emitters only interact with the archive, but the result archive stores all the best solutions found by CMA-MAEGA." + "Finally, the [`Scheduler`](https://docs.pyribs.org/en/latest/api/ribs.schedulers.Scheduler.html) controls how the emitters interact with the archive and result archive. On every iteration, the scheduler calls the emitters to generate solutions. After the user evaluates these generated solutions, the scheduler inserts the solutions into both the archive and result archive. It then passes feedback from the archive (but not the result archive) to the emitters. In this manner, the emitters only interact with the archive, but the result archive stores all the best solutions found by CMA-MAEGA." ] }, { @@ -682,7 +682,7 @@ "source": [ "### Summary: Differences from CMA-MEGA\n", "\n", - "Similar to how CMA-MAE builds on CMA-ME, CMA-MAEGA builds on Covariance Matrix Adaptation MAP-Elites via a Gradient Arborescence (CMA-MEGA) introduced in [Fontaine et al., 2021](https://arxiv.org/abs/2106.03894). Implementation-wise, these are the differences between CMA-MAEGA and CMA-MEGA in pyribs. They mirror the differences between CMA-MAE and CMA-ME in the [CMA-MAE tutorial](https://docs.pyribs.org/en/stable/tutorials/cma_mae.html).\n", + "Similar to how CMA-MAE builds on CMA-ME, CMA-MAEGA builds on Covariance Matrix Adaptation MAP-Elites via a Gradient Arborescence (CMA-MEGA) introduced in [Fontaine et al., 2021](https://arxiv.org/abs/2106.03894). Implementation-wise, these are the differences between CMA-MAEGA and CMA-MEGA in pyribs. They mirror the differences between CMA-MAE and CMA-ME in the [CMA-MAE tutorial](https://docs.pyribs.org/en/latest/tutorials/cma_mae.html).\n", "\n", "* The archive (we used `GridArchive` but you can also use another archive) takes in a `learning_rate` and `threshold_min` parameter. The `learning_rate` is between 0 and 1 (inclusive), and the `threshold_min` typically corresponds to the minimum objective of the problem.\n", "* A second result archive is created to store best solutions, as introducing thresholds means that the first archive is not guaranteed to store the best solutions. This archive is identical to the first, but it does not have `learning_rate` or `threshold_min` parameters.\n", @@ -857,7 +857,7 @@ "source": [ "## Visualizing the Archive\n", "\n", - "To get a sense of where the solutions lie in the archive, we can plot a heatmap with [`grid_archive_heatmap`](https://docs.pyribs.org/en/stable/api/ribs.visualize.grid_archive_heatmap.html). We see that our solutions cover an almost circular region in the center of the archive." + "To get a sense of where the solutions lie in the archive, we can plot a heatmap with [`grid_archive_heatmap`](https://docs.pyribs.org/en/latest/api/ribs.visualize.grid_archive_heatmap.html). We see that our solutions cover an almost circular region in the center of the archive." ] }, {