Merge pull request #20 from sgbaird/dev
Add new toggleable options
- Attach existing data
- Categorical variables
- Custom threshold
- Single vs. batch optimization
sgbaird authored Feb 24, 2024
2 parents dd31dd4 + bd0e42d commit e53c6eb
Showing 187 changed files with 14,494 additions and 1,567 deletions.
14 changes: 7 additions & 7 deletions .pre-commit-config.yaml
@@ -2,7 +2,7 @@ exclude: '^docs/conf.py'

repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
- rev: v4.4.0
+ rev: v4.5.0
hooks:
- id: trailing-whitespace
- id: check-added-large-files
@@ -27,7 +27,7 @@ repos:

# If you want to avoid flake8 errors due to unused vars or imports:
- repo: https://github.com/PyCQA/autoflake
- rev: v2.2.0
+ rev: v2.2.1
hooks:
- id: autoflake
args: [
@@ -37,19 +37,19 @@ repos:
]

- repo: https://github.com/PyCQA/isort
- rev: 5.12.0
+ rev: 5.13.2
hooks:
- id: isort

- repo: https://github.com/psf/black
- rev: 23.7.0
+ rev: 24.1.1
hooks:
- id: black
exclude: ^tests/generated_scripts/
language_version: python3

- repo: https://github.com/psf/black
- rev: 23.7.0
+ rev: 24.1.1
hooks:
- id: black
exclude: ^(?!tests/generated_scripts/).* # any string that does not start with tests/generated_scripts/
@@ -58,15 +58,15 @@ repos:

# If you like to embrace black styles even in the docs:
- repo: https://github.com/asottile/blacken-docs
- rev: 1.15.0
+ rev: 1.16.0
hooks:
- id: blacken-docs
additional_dependencies: [black]
exclude: ^reports/copilot-chat/


- repo: https://github.com/PyCQA/flake8
- rev: 6.1.0
+ rev: 7.0.0
hooks:
- id: flake8
exclude: ^tests/generated_scripts/
15 changes: 12 additions & 3 deletions CONTRIBUTING.md
@@ -106,14 +106,23 @@ python3 -m http.server --directory 'docs/_build/html'
For a high-level roadmap of Honegumi's development, see https://github.com/sgbaird/honegumi/discussions/2. Honegumi uses Python, JavaScript, Jinja2, pytest, and GitHub Actions to automate the generation, testing, and deployment of templates, with a focus on Bayesian optimization packages. As of 2023-08-21, only a single package ([Meta's Ax Platform](https://ax.dev)) is supported, and only for a small set of features. However, the underlying plumbing and logic are thorough and scalable. I focused first on getting all the pieces together before scaling up to many features (which would have slowed down the development cycle).

Here are some ways you can help with the project:
- 1. Use the tool and let me know what you think 😉
+ 1. Use the tool and let us know what you think 😉
2. [Provide feedback](https://github.com/sgbaird/honegumi/discussions/2) on the overall organization, logic, and workflow of the project
3. Extend the Ax features to additional options (i.e., additional rows and options within rows) via direct edits to [ax/main.py.jinja](https://github.com/sgbaird/honegumi/blob/main/src/honegumi/ax/main.py.jinja)
- 4. Extend Honegumi to additional platforms such as BoFire or Atlas
- 5. Spread the word about the tool
+ 4. Improve the `honegumi.html` and `honegumi.ipynb` templates (may also need to update `generate_scripts.py`). See below for more information.
+ 5. Extend Honegumi to additional platforms such as BoFire or Atlas
+ 6. Spread the word about the tool

For those unfamiliar with Jinja2, see the Google Colab tutorial: [_A Gentle Introduction to Jinja2_](https://colab.research.google.com/github/sgbaird/honegumi/blob/main/notebooks/1.0-sgb-gentle-introduction-jinja.ipynb). The main template file for Meta's Adaptive Experimentation (Ax) Platform is [`ax/main.py.jinja`](https://github.com/sgbaird/honegumi/blob/main/src/honegumi/ax/main.py.jinja). The main file that interacts with this template is at [`scripts/generate_scripts.py`](https://github.com/sgbaird/honegumi/blob/main/scripts/generate_scripts.py). The generated scripts are [available on GitHub](https://github.com/sgbaird/honegumi/tree/main/docs/generated_scripts/ax). Each script is tested [via `pytest`](https://github.com/sgbaird/honegumi/tree/main/tests) and [GitHub Actions](https://github.com/sgbaird/honegumi/actions/workflows/ci.yml) to ensure it can run error-free. Finally, the results are passed to [core/honegumi.html.jinja](https://github.com/sgbaird/honegumi/blob/main/src/honegumi/core/honegumi.html.jinja) and [core/honegumi.ipynb.jinja](https://github.com/sgbaird/honegumi/blob/main/src/honegumi/core/honegumi.ipynb.jinja) to create the scripts and notebooks, respectively.
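To illustrate the templating pattern described above, here is a minimal sketch of how a Jinja2 template can toggle code blocks per selected option. This is not the actual contents of `ax/main.py.jinja`; the option name `use_custom_gen` is purely illustrative:

```python
from jinja2 import Template  # third-party: pip install jinja2

# A tiny template that includes or omits a custom generation strategy,
# mimicking (in spirit) how generate_scripts.py renders main.py.jinja.
# `use_custom_gen` is a hypothetical option name for illustration only.
template = Template(
    "{% if use_custom_gen %}"
    "gs = GenerationStrategy(steps=[...])\n"
    "ax_client = AxClient(generation_strategy=gs)"
    "{% else %}"
    "ax_client = AxClient()"
    "{% endif %}"
)

with_gs = template.render(use_custom_gen=True)
without_gs = template.render(use_custom_gen=False)
```

Rendering the same template once per combination of options, then writing each result to a script and notebook, is the essence of the generation pipeline.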

NOTE: If you are committing some of the generated scripts or notebooks on Windows, you will [likely need to run this command](https://stackoverflow.com/questions/22575662/filename-too-long-in-git-for-windows) in a terminal (e.g., git bash) as an administrator to avoid an `lstat(...) Filename too long` error:

```bash
git config --system core.longpaths true
```

If working in GitHub Desktop, you will likely need to follow [these instructions](https://stackoverflow.com/a/74289583/13697228).

## Project Organization

```
19 changes: 10 additions & 9 deletions README.md
@@ -19,8 +19,7 @@

> Honegumi (骨組み, pronounced "ho neh goo mee"), which means "skeletal framework" in Japanese, is a package for
> interactively creating API tutorials with a focus on optimization packages such as Meta's Ax
- > Platform. We are looking for contributors!
- > Start by taking a look at our contribution guide.
+ > Platform. We are [looking for contributors](https://github.com/sgbaird/honegumi/blob/main/CONTRIBUTING.md)!
<!-- > Unlock the power of advanced optimization in materials science with Honegumi (骨組み,
> pronounced "ho-neh-goo-mee"), our interactive "skeleton code" generator. -->
@@ -29,12 +28,14 @@

Real-world materials science optimization tasks are complex! To cite a few examples:

- - The measurements are **noisy**
- - Some measurements are higher quality but much more costly (**multi-fidelity**)
- - Almost always, tasks have multiple properties that are important (**multi-objective**)
- - Like finding the proverbial "needle-in-a-haystack", the search spaces are enormous (**high-dimensional**)
- - Not all combinations of parameters are valid (i.e., **constraints**)
- - Often there is a mixture of numerical and categorical variables (**mixed-variable**)
+ | Topic            | Description |
+ | ---------------- | ----------- |
+ | Noise            | Repeat measurements are stochastic |
+ | Multi-fidelity   | Some measurements are higher quality but much more costly |
+ | Multi-objective  | Almost always, tasks have multiple properties that are important |
+ | High-dimensional | Like finding the proverbial "needle in a haystack", the search spaces are enormous |
+ | Constraints      | Not all combinations of parameters are valid |
+ | Mixed-variable   | Often there is a mixture of numerical and categorical variables |

However, applications of state-of-the-art algorithms to these materials science tasks have been limited. Meta's Adaptive Experimentation (Ax) platform is one of the few optimization platforms capable of handling these challenges without oversimplification. While Ax and its backbone, BoTorch, have gained traction in chemistry and materials science, advanced implementations are still challenging, even for veteran materials informatics practitioners. In addition to combining multiple algorithms, there are other logistical issues, such as using existing data, embedding physical descriptors, and modifying search spaces. To address these challenges, we present Honegumi (骨組み or "ho-neh-goo-mee"): An interactive "skeleton code" generator for materials-relevant optimization. Similar to [PyTorch's installation docs](https://pytorch.org/get-started/locally/), users interactively select advanced topics to generate robust templates that are unit-tested with invalid configurations crossed out. Honegumi is the first Bayesian optimization template generator of its kind, and we envision that this tool will reduce the barrier to entry for applying advanced Bayesian optimization to real-world materials science tasks.

@@ -43,7 +44,7 @@ However, applications of state-of-the-art algorithms to these materials science
You don't need to install anything. Just navigate to https://honegumi.readthedocs.io/, select the desired options, and click the "Open in Colab" badge.

If you're interested in collaborating, see [the contribution
- guidelines](https://github.com/sgbaird/honegumi/blob/main/CONTRIBUTING.md) or get in touch.
+ guidelines](https://github.com/sgbaird/honegumi/blob/main/CONTRIBUTING.md) and [the high-level roadmap of Honegumi's development](https://github.com/sgbaird/honegumi/discussions/2).

## License

3 changes: 2 additions & 1 deletion docs/conf.py
@@ -75,6 +75,7 @@
"sphinx.ext.mathjax",
"sphinx.ext.napoleon",
"sphinx_rtd_theme",
# "sphinx_rtd_dark_mode", # Honegumi table looks strange with dark mode due to custom html
]

# Add any paths that contain templates here, relative to this directory.
@@ -179,7 +180,7 @@
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
- html_theme_options = {"sidebar_width": "300px", "page_width": "1200px"}
+ # html_theme_options = {"sidebar_width": "300px", "page_width": "1200px"}

# Add any paths that contain custom themes here, relative to this directory.
# html_theme_path = []
116 additions & 0 deletions docs/generated_notebooks/ax/objective-multi+model-FULLYBAYESIAN+custom_gen-True+existing_data-False+sum_constraint-False+order_constraint-False+linear_constraint-False+composition_constraint-False+categorical-False+custom_threshold-False+synchrony-batch.ipynb
@@ -0,0 +1,116 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "1e89e727",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/sgbaird/honegumi/blob/main/docs\\generated_notebooks\\ax\\objective-multi+model-FULLYBAYESIAN+custom_gen-True+existing_data-False+sum_constraint-False+order_constraint-False+linear_constraint-False+composition_constraint-False+categorical-False+custom_threshold-False+synchrony-batch.ipynb\"><img alt=\"Open In Colab\" src=\"https://colab.research.google.com/assets/colab-badge.svg\"></a>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2e4a348b",
"metadata": {},
"outputs": [],
"source": [
"%pip install ax-platform"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c69bdeb0",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from ax.service.ax_client import AxClient, ObjectiveProperties\n",
"\n",
"from ax.modelbridge.factory import Models\n",
"from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy\n",
"\n",
"\n",
"obj1_name = \"branin\"\n",
"obj2_name = \"branin_swapped\"\n",
"\n",
"\n",
"def branin_moo(x1, x2):\n",
" y = float(\n",
" (x2 - 5.1 / (4 * np.pi**2) * x1**2 + 5.0 / np.pi * x1 - 6.0) ** 2\n",
" + 10 * (1 - 1.0 / (8 * np.pi)) * np.cos(x1)\n",
" + 10\n",
" )\n",
"\n",
" # second objective has x1 and x2 swapped\n",
" y2 = float(\n",
" (x1 - 5.1 / (4 * np.pi**2) * x2**2 + 5.0 / np.pi * x2 - 6.0) ** 2\n",
" + 10 * (1 - 1.0 / (8 * np.pi)) * np.cos(x2)\n",
" + 10\n",
" )\n",
"\n",
" return {obj1_name: y, obj2_name: y2}\n",
"\n",
"\n",
"gs = GenerationStrategy(\n",
" steps=[\n",
" GenerationStep(\n",
" model=Models.SOBOL,\n",
" num_trials=4, # https://github.com/facebook/Ax/issues/922\n",
" min_trials_observed=3,\n",
" max_parallelism=5,\n",
" model_kwargs={\"seed\": 999},\n",
" model_gen_kwargs={},\n",
" ),\n",
" GenerationStep(\n",
" model=Models.FULLYBAYESIANMOO,\n",
" num_trials=-1,\n",
" max_parallelism=3,\n",
" model_kwargs={\"num_samples\": 256, \"warmup_steps\": 512},\n",
" ),\n",
" ]\n",
")\n",
"\n",
"ax_client = AxClient(generation_strategy=gs)\n",
"\n",
"ax_client.create_experiment(\n",
" parameters=[\n",
" {\"name\": \"x1\", \"type\": \"range\", \"bounds\": [-5.0, 10.0]},\n",
" {\"name\": \"x2\", \"type\": \"range\", \"bounds\": [0.0, 10.0]},\n",
" ],\n",
" objectives={\n",
" obj1_name: ObjectiveProperties(minimize=True),\n",
" obj2_name: ObjectiveProperties(minimize=True),\n",
" },\n",
")\n",
"\n",
"\n",
"batch_size = 2\n",
"\n",
"\n",
"for _ in range(19):\n",
"\n",
" parameterizations, optimization_complete = ax_client.get_next_trials(batch_size)\n",
" for trial_index, parameterization in list(parameterizations.items()):\n",
" # extract parameters\n",
" x1 = parameterization[\"x1\"]\n",
" x2 = parameterization[\"x2\"]\n",
"\n",
" results = branin_moo(x1, x2)\n",
" ax_client.complete_trial(trial_index=trial_index, raw_data=results)\n",
"\n",
"pareto_results = ax_client.get_pareto_optimal_parameters()"
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_filter": "-all",
"main_language": "python",
"notebook_metadata_filter": "-all"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
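The `branin_moo` helper in the generated notebook above evaluates the standard Branin function twice, once with `x1` and `x2` swapped. As a quick sanity check of the formula (no Ax install needed), a pure-stdlib sketch can confirm the well-known global minimum value of about 0.397887, attained for example at (π, 2.275):

```python
import math

def branin(x1: float, x2: float) -> float:
    """Branin function, matching the expression used in the notebook above."""
    return (
        (x2 - 5.1 / (4 * math.pi**2) * x1**2 + 5.0 / math.pi * x1 - 6.0) ** 2
        + 10 * (1 - 1.0 / (8 * math.pi)) * math.cos(x1)
        + 10
    )

# Branin has three global minima, all with value ~0.397887;
# (pi, 2.275) is one of them.
value = branin(math.pi, 2.275)
```

Because the second objective simply swaps the arguments, the two objectives conflict, which is what makes this a meaningful multi-objective (Pareto) test problem.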
113 additions & 0 deletions docs/generated_notebooks/ax/objective-multi+model-FULLYBAYESIAN+custom_gen-True+existing_data-False+sum_constraint-False+order_constraint-False+linear_constraint-False+composition_constraint-False+categorical-False+custom_threshold-False+synchrony-single.ipynb
@@ -0,0 +1,113 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "6085c430",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/sgbaird/honegumi/blob/main/docs\\generated_notebooks\\ax\\objective-multi+model-FULLYBAYESIAN+custom_gen-True+existing_data-False+sum_constraint-False+order_constraint-False+linear_constraint-False+composition_constraint-False+categorical-False+custom_threshold-False+synchrony-single.ipynb\"><img alt=\"Open In Colab\" src=\"https://colab.research.google.com/assets/colab-badge.svg\"></a>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b34b6676",
"metadata": {},
"outputs": [],
"source": [
"%pip install ax-platform"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0f1868d4",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from ax.service.ax_client import AxClient, ObjectiveProperties\n",
"\n",
"from ax.modelbridge.factory import Models\n",
"from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy\n",
"\n",
"\n",
"obj1_name = \"branin\"\n",
"obj2_name = \"branin_swapped\"\n",
"\n",
"\n",
"def branin_moo(x1, x2):\n",
" y = float(\n",
" (x2 - 5.1 / (4 * np.pi**2) * x1**2 + 5.0 / np.pi * x1 - 6.0) ** 2\n",
" + 10 * (1 - 1.0 / (8 * np.pi)) * np.cos(x1)\n",
" + 10\n",
" )\n",
"\n",
" # second objective has x1 and x2 swapped\n",
" y2 = float(\n",
" (x1 - 5.1 / (4 * np.pi**2) * x2**2 + 5.0 / np.pi * x2 - 6.0) ** 2\n",
" + 10 * (1 - 1.0 / (8 * np.pi)) * np.cos(x2)\n",
" + 10\n",
" )\n",
"\n",
" return {obj1_name: y, obj2_name: y2}\n",
"\n",
"\n",
"gs = GenerationStrategy(\n",
" steps=[\n",
" GenerationStep(\n",
" model=Models.SOBOL,\n",
" num_trials=4, # https://github.com/facebook/Ax/issues/922\n",
" min_trials_observed=3,\n",
" max_parallelism=5,\n",
" model_kwargs={\"seed\": 999},\n",
" model_gen_kwargs={},\n",
" ),\n",
" GenerationStep(\n",
" model=Models.FULLYBAYESIANMOO,\n",
" num_trials=-1,\n",
" max_parallelism=3,\n",
" model_kwargs={\"num_samples\": 256, \"warmup_steps\": 512},\n",
" ),\n",
" ]\n",
")\n",
"\n",
"ax_client = AxClient(generation_strategy=gs)\n",
"\n",
"ax_client.create_experiment(\n",
" parameters=[\n",
" {\"name\": \"x1\", \"type\": \"range\", \"bounds\": [-5.0, 10.0]},\n",
" {\"name\": \"x2\", \"type\": \"range\", \"bounds\": [0.0, 10.0]},\n",
" ],\n",
" objectives={\n",
" obj1_name: ObjectiveProperties(minimize=True),\n",
" obj2_name: ObjectiveProperties(minimize=True),\n",
" },\n",
")\n",
"\n",
"\n",
"for _ in range(19):\n",
"\n",
" parameterization, trial_index = ax_client.get_next_trial()\n",
"\n",
" # extract parameters\n",
" x1 = parameterization[\"x1\"]\n",
" x2 = parameterization[\"x2\"]\n",
"\n",
" results = branin_moo(x1, x2)\n",
" ax_client.complete_trial(trial_index=trial_index, raw_data=results)\n",
"\n",
"pareto_results = ax_client.get_pareto_optimal_parameters()"
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_filter": "-all",
"main_language": "python",
"notebook_metadata_filter": "-all"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
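The loop in this single-trial variant follows a plain ask/tell pattern: request one parameterization, evaluate it, and report the result back, whereas the batch variant above requests several trials per iteration via `get_next_trials`. A dependency-free sketch of the same control flow, with seeded random search standing in for Ax's Sobol and fully Bayesian steps (illustrative only, not how Ax picks points):

```python
import math
import random

def branin(x1, x2):
    # Same Branin expression as in the notebooks above.
    return (
        (x2 - 5.1 / (4 * math.pi**2) * x1**2 + 5.0 / math.pi * x1 - 6.0) ** 2
        + 10 * (1 - 1.0 / (8 * math.pi)) * math.cos(x1)
        + 10
    )

random.seed(999)  # fixed seed, echoing the seeded Sobol step
best_y = math.inf
best_params = None
for _ in range(19):  # same trial budget as the template loop
    # "ask": draw a candidate from the bounds given to create_experiment
    params = {"x1": random.uniform(-5.0, 10.0), "x2": random.uniform(0.0, 10.0)}
    # "tell": evaluate and record, mirroring complete_trial
    y = branin(params["x1"], params["x2"])
    if y < best_y:
        best_y, best_params = y, params
```

In the real templates, Ax replaces the random draw with model-based candidate generation, and `get_pareto_optimal_parameters()` replaces the simple best-so-far bookkeeping for the multi-objective case.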
