Speed up 2D cvt_archive_heatmap by order of magnitude #355

Merged (5 commits) on Sep 7, 2023

@btjanaka btjanaka commented Sep 7, 2023

Description

Currently, cvt_archive_heatmap plots each polygon individually via ax.fill. We can speed this up by using a PolyCollection to add all the polygons at once. This is similar to using a PatchCollection, as shown here: https://matplotlib.org/stable/gallery/shapes_and_collections/patch_collection.html.
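
To illustrate the difference between the two approaches, here is a minimal sketch that is not the code from this PR. The triangle data, variable names, and styling values are made up for illustration; only `ax.fill` and matplotlib's `PolyCollection` come from the description above.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import PolyCollection

rng = np.random.default_rng(42)

# Hypothetical stand-in for Voronoi cells: 1,000 small triangles,
# stored as an array of shape (n_polygons, n_vertices, 2).
centers = rng.uniform(-1, 1, (1000, 1, 2))
offsets = np.array([[0.0, 0.02], [-0.02, -0.02], [0.02, -0.02]])
polygons = centers + offsets

fig, ax = plt.subplots()

# Slow pattern: one call (and at least one artist) per polygon.
# for verts in polygons:
#     ax.fill(*verts.T, ec="black", lw=0.5)

# Faster pattern: a single collection that holds every polygon.
collection = PolyCollection(list(polygons), edgecolors="black", linewidths=0.5)
ax.add_collection(collection)

# add_collection does not necessarily rescale the view, so set limits explicitly.
ax.set_xlim(-1, 1)
ax.set_ylim(-1, 1)
```

The collection is a single artist, so the per-polygon Python overhead of repeated `ax.fill` calls is avoided.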

Benchmark for plotting a CVTArchive with 10,000 cells:

  • Before: 14.9 sec
  • After: 0.6 sec

I used the following code to benchmark the implementation:

"""Driver for cvt heatmap experiments."""

import time

import fire
import matplotlib.pyplot as plt
import numpy as np

from ribs.archives import CVTArchive
from ribs.visualize import cvt_archive_heatmap


def main(n_cells=10000):
    """Creates the archive and plots it."""
    np.random.seed(42)

    archive = CVTArchive(
        solution_dim=3,
        cells=n_cells,
        ranges=[(-1, 1), (-1, 1)],
        custom_centroids=np.random.uniform(-1, 1, (n_cells, 2)),
    )

    archive.add(
        np.random.uniform(-1, 1, (20000, 3)),
        np.random.standard_normal(20000),
        np.random.uniform(-1, 1, (20000, 2)),
    )

    plt.figure(figsize=(8, 6))

    start_time = time.time()
    cvt_archive_heatmap(archive)
    print("Plot time", time.time() - start_time)

    plt.savefig("cvt.png")


if __name__ == "__main__":
    fire.Fire(main)

TODO

  • Speed up 2D polygon plotting by using matplotlib PolyCollection — note that I initially used a PatchCollection with individual Polygon patches, but PolyCollection is much faster because we do not have to construct the individual Polygon patches in Python.
  • Compute facecolors in a batch instead of individually
  • Fix test errors — it seems the images changed slightly due to the new implementation, so we now allow a slight tolerance for cvt heatmap images
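
As a rough illustration of the batched facecolor item above, the sketch below maps a whole array of objective values to RGBA colors in one call instead of looping over cells. The objective values and the choice of the "magma" colormap are assumptions for the example, not taken from the PR.

```python
import numpy as np
from matplotlib import colormaps
from matplotlib.colors import Normalize

# Hypothetical objective values, one per occupied cell.
objectives = np.array([-1.0, 0.0, 0.5, 2.0])

# Normalize to [0, 1] and apply the colormap to the whole array at once.
norm = Normalize(vmin=objectives.min(), vmax=objectives.max())
cmap = colormaps["magma"]
facecolors = cmap(norm(objectives))  # array of shape (n_cells, 4), RGBA rows

# Per-cell equivalent that the batch call replaces:
# facecolors = np.array([cmap(norm(obj)) for obj in objectives])
```

An array like `facecolors` could then be passed as the `facecolors` argument of a `PolyCollection`, so colors and polygons are both handled in bulk.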

Questions

Status

  • I have read the guidelines in
    CONTRIBUTING.md
  • I have formatted my code using yapf
  • I have tested my code by running pytest
  • I have linted my code with pylint
  • I have added a one-line description of my change to the changelog in
    HISTORY.md
  • This PR is ready to go

@btjanaka btjanaka merged commit 7048c36 into master Sep 7, 2023
17 checks passed
@btjanaka btjanaka deleted the cvt-heatmap-perf branch September 7, 2023 04:46
btjanaka added a commit that referenced this pull request Sep 7, 2023
## Description

The current cvt_archive_heatmap relies on faraway points to draw the
polygons at the edges of the Voronoi diagram. Thus, if a user zooms out,
they will see that the heatmap actually has polygons going beyond it,
like below. In practice, we rarely notice this because the
cvt_archive_heatmap sets the axis limits to be the bounds of the
archive, so the outer regions are hidden.

![cvt heatmap](https://github.com/icaros-usc/pyribs/assets/38124174/3b3486e0-cf55-4667-ae74-14f3b8a98d63)

However, there may be times when this behavior is undesirable, e.g.,
drawing two cvt archives next to each other. Thus, this PR makes it
possible to clip the bounds of the Voronoi diagram, like so:

![cvt heatmap](https://github.com/icaros-usc/pyribs/assets/38124174/217ca661-87eb-43d0-b3f9-3b0ffbae9403)

We accomplish this by computing the intersection of each polygon with
the archive's bounding box. Intersections are computed via the shapely
library: https://shapely.readthedocs.io/. Furthermore, because shapely
operates on general geometries, it is possible to clip to arbitrary
shapes:

![cvt heatmap](https://github.com/icaros-usc/pyribs/assets/38124174/1b316db7-8ad1-4943-8db4-8dbc8d6949be)
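
The clipping idea can be sketched with shapely as follows. This is a minimal example under assumed inputs, not the implementation in this PR: the cell coordinates and variable names are hypothetical, and only the use of shapely intersections comes from the description above.

```python
from shapely.geometry import Polygon, box

# Archive bounds as a rectangle: box(minx, miny, maxx, maxy).
bounds = box(-1.0, -1.0, 1.0, 1.0)

# Hypothetical edge cell whose polygon extends far past the archive bounds.
cell = Polygon([(0.5, 0.5), (3.0, 0.5), (3.0, 3.0), (0.5, 3.0)])

# Keep only the part of the cell that lies inside the bounds.
clipped = cell.intersection(bounds)

# Vertices of the clipped cell, ready to hand to a plotting routine.
# Note: with a non-convex clipping shape, the intersection may instead be a
# MultiPolygon, which has no single exterior and needs separate handling.
vertices = list(clipped.exterior.coords)
```

Replacing `bounds` with any other shapely `Polygon` (optionally constructed with `holes=[...]`) would clip to an arbitrary shape in the same way.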

There are some tradeoffs to turning this feature on by default, namely:

1. It may be more “intuitive” to have infinite regions at the edge of
the archive, as that is how the archive works in practice — namely,
points outside the archive bounds are inserted into cells at the edges.
2. Computing the intersection of all the polygons with the bounding box
(or bounding shape) can be somewhat time-consuming. However, after
running the script in #355 with `clip=True`, I found that the runtime
for 10,000 cells was still ~2.3 sec (much slower than the 0.6 sec in
that PR, but still very fast).

For these reasons, we will keep this feature turned off by default via
the `clip` parameter to `cvt_archive_heatmap`.
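
To make the first tradeoff concrete, here is a small sketch of nearest-centroid assignment, which is how a CVT maps points to cells in principle. The centroids and points are made up, and the brute-force distance computation is an assumption for illustration; it is not necessarily how pyribs performs the lookup.

```python
import numpy as np

# Hypothetical centroids of a 4-cell archive with bounds [-1, 1] x [-1, 1].
centroids = np.array([[-0.5, -0.5], [0.5, -0.5], [-0.5, 0.5], [0.5, 0.5]])

# One point inside the bounds and two points well outside them.
points = np.array([[0.4, 0.6], [5.0, 5.0], [-3.0, 0.2]])

# Pairwise distances, shape (n_points, n_centroids).
dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)

# Each point goes to its nearest centroid, so out-of-bounds points still
# land in a cell at the edge of the archive.
cell_indices = dists.argmin(axis=1)
```

Because every point has a nearest centroid, the edge cells effectively extend to infinity, which is what the unclipped heatmap shows.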

Note that this PR introduces a dependency on shapely, and yet the
feature we use shapely for will be turned off by default since
`clip=False`. However, I believe this is acceptable because shapely is a
well-supported library that is also lightweight (~2MB), so it will not
be a burden to users.

## TODO

- [x] Add shapely to deps
- [x] Add shapely to pinned reqs
- [x] Add shapely to intersphinx
- [x] Implement clipping
- [x] Add flag to turn clipping on and off
- [x] Add test for heatmap with and without clipping
- [x] Enable clipping to an arbitrary polygon
- [x] Add test for clipping to arbitrary polygon, with and without holes

## Questions

## Status

- [x] I have read the guidelines in
      [CONTRIBUTING.md](https://github.com/icaros-usc/pyribs/blob/master/CONTRIBUTING.md)
- [x] I have formatted my code using `yapf`
- [x] I have tested my code by running `pytest`
- [x] I have linted my code with `pylint`
- [x] I have added a one-line description of my change to the changelog
      in `HISTORY.md`
- [x] This PR is ready to go