Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Point In Face #1056

Open
wants to merge 62 commits into
base: main
Choose a base branch
from
Open

Point In Face #1056

wants to merge 62 commits into from

Conversation

aaronzedwick
Copy link
Member

@aaronzedwick aaronzedwick commented Nov 5, 2024

Closes #905

Overview

Expected Usage

from uxarray.grid.geometry import point_in_polygon

# Defined polygon
polygon = [ [-10,  10, -10, 10], [10, 10, -10, -10]]

# Point to check
point = [10, 10]

point_in_polygon(polygon, point, inclusive=True)

PR Checklist

General

  • An issue is linked created and linked
  • Add appropriate labels
  • Filled out Overview and Expected Usage (if applicable) sections

Testing

  • Adequate tests are created if there is new functionality
  • Tests cover all possible logical paths in your function
  • Tests are not too basic (such as simply calling a function and nothing else)

Documentation

  • Docstrings have been added to all new functions
  • Docstrings have updated with any function changes
  • Internal functions have a preceding underscore (_) and have been added to docs/internal_api/index.rst
  • User functions have been added to docs/user_api/index.rst

@aaronzedwick
Copy link
Member Author

@ philipc2 do you think this should be an internal function or exposed to the user?

@aaronzedwick aaronzedwick changed the title DRAFT: Point In Polygon Point In Polygon Nov 27, 2024
@aaronzedwick aaronzedwick marked this pull request as ready for review November 27, 2024 15:13
Copy link
Member

@philipc2 philipc2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please include an ASV benchmark. I'd suggest doing a parameterized benchmark for the 120 and 480 km MPAS grids.

Copy link
Member

@philipc2 philipc2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the optimized functions from #1072 and try to write the function entirely in Numba. This may require us to pass in both the cartesian and spherical versions of point & polygon. Let me know if you have any questions!

uxarray/grid/geometry.py Outdated Show resolved Hide resolved
@aaronzedwick aaronzedwick added the run-benchmark Run ASV benchmark workflow label Jan 14, 2025
Copy link

github-actions bot commented Jan 14, 2025

ASV Benchmarking

Benchmark Comparison Results

Benchmarks that have improved:

Change Before [62ca314] After [ba507d6] Ratio Benchmark (Parameter)
- 448M 403M 0.9 face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/geoflow-small/grid.nc'))
- 638M 394M 0.62 face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/quad-hexagon/grid.nc'))
- 482M 386M 0.8 mpas_ocean.Integrate.peakmem_integrate('480km')

Benchmarks that have stayed the same:

Change Before [62ca314] After [ba507d6] Ratio Benchmark (Parameter)
400M 400M 1.00 face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc'))
432M 430M 1.00 face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/scrip/outCSne8/outCSne8.nc'))
18.5±0.1ms 18.7±0.1ms 1.01 face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc'))
7.45±0.09ms 7.50±0.1ms 1.01 face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/scrip/outCSne8/outCSne8.nc'))
43.6±1ms 42.9±0.2ms 0.98 face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/geoflow-small/grid.nc'))
3.96±0.1ms 3.86±0.09ms 0.97 face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/quad-hexagon/grid.nc'))
2.88±0.01s 2.89±0.01s 1.00 import.Imports.timeraw_import_uxarray
684±5μs 703±8μs 1.03 mpas_ocean.CheckNorm.time_check_norm('120km')
454±1μs 451±4μs 0.99 mpas_ocean.CheckNorm.time_check_norm('480km')
638±4ms 658±10ms 1.03 mpas_ocean.ConnectivityConstruction.time_face_face_connectivity('120km')
41.8±0.4ms 42.9±0.6ms 1.02 mpas_ocean.ConnectivityConstruction.time_face_face_connectivity('480km')
1.74±0.01ms 1.76±0.03ms 1.01 mpas_ocean.ConnectivityConstruction.time_n_nodes_per_face('120km')
610±20μs 596±10μs 0.98 mpas_ocean.ConnectivityConstruction.time_n_nodes_per_face('480km')
5.90±0.05ms 5.96±0.04ms 1.01 mpas_ocean.ConstructFaceLatLon.time_cartesian_averaging('120km')
3.53±0.08ms 3.57±0.06ms 1.01 mpas_ocean.ConstructFaceLatLon.time_cartesian_averaging('480km')
3.54±0.01s 3.58±0.01s 1.01 mpas_ocean.ConstructFaceLatLon.time_welzl('120km')
223±3ms 225±1ms 1.01 mpas_ocean.ConstructFaceLatLon.time_welzl('480km')
1.30±0μs 1.20±0.01μs 0.92 mpas_ocean.ConstructTreeStructures.time_ball_tree('120km')
314±1ns 301±5ns 0.96 mpas_ocean.ConstructTreeStructures.time_ball_tree('480km')
795±9ns 801±6ns 1.01 mpas_ocean.ConstructTreeStructures.time_kd_tree('120km')
296±3ns 280±3ns 0.94 mpas_ocean.ConstructTreeStructures.time_kd_tree('480km')
428±5ms 433±4ms 1.01 mpas_ocean.CrossSection.time_const_lat('120km', 1)
220±1ms 219±2ms 1.00 mpas_ocean.CrossSection.time_const_lat('120km', 2)
114±1ms 112±0.7ms 0.99 mpas_ocean.CrossSection.time_const_lat('120km', 4)
356±1ms 359±0.9ms 1.01 mpas_ocean.CrossSection.time_const_lat('480km', 1)
179±3ms 183±0.7ms 1.02 mpas_ocean.CrossSection.time_const_lat('480km', 2)
92.5±0.5ms 94.3±1ms 1.02 mpas_ocean.CrossSection.time_const_lat('480km', 4)
123±2ms 123±1ms 1.00 mpas_ocean.DualMesh.time_dual_mesh_construction('120km')
8.68±0.05ms 8.33±0.1ms 0.96 mpas_ocean.DualMesh.time_dual_mesh_construction('480km')
1.10±0.01s 1.08±0.01s 0.98 mpas_ocean.GeoDataFrame.time_to_geodataframe('120km', False)
54.0±0.4ms 54.2±0.6ms 1.00 mpas_ocean.GeoDataFrame.time_to_geodataframe('120km', True)
88.1±0.7ms 85.5±0.1ms 0.97 mpas_ocean.GeoDataFrame.time_to_geodataframe('480km', False)
5.46±0.2ms 5.13±0.08ms 0.94 mpas_ocean.GeoDataFrame.time_to_geodataframe('480km', True)
339M 339M 1.00 mpas_ocean.Gradient.peakmem_gradient('120km')
316M 316M 1.00 mpas_ocean.Gradient.peakmem_gradient('480km')
2.74±0.01ms 2.73±0.03ms 1.00 mpas_ocean.Gradient.time_gradient('120km')
314±1μs 319±4μs 1.02 mpas_ocean.Gradient.time_gradient('480km')
247±10μs 239±8μs 0.97 mpas_ocean.HoleEdgeIndices.time_construct_hole_edge_indices('120km')
119±0.9μs 117±0.4μs 0.98 mpas_ocean.HoleEdgeIndices.time_construct_hole_edge_indices('480km')
402M 402M 1.00 mpas_ocean.Integrate.peakmem_integrate('120km')
172±1ms 176±1ms 1.02 mpas_ocean.Integrate.time_integrate('120km')
11.6±0.06ms 11.5±0.03ms 0.99 mpas_ocean.Integrate.time_integrate('480km')
345±4ms 348±2ms 1.01 mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'exclude')
347±2ms 350±2ms 1.01 mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'include')
347±2ms 351±3ms 1.01 mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'split')
22.8±0.3ms 23.3±0.1ms 1.02 mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'exclude')
23.0±0.1ms 23.3±0.1ms 1.01 mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'include')
22.8±0.1ms 23.5±0.2ms 1.03 mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'split')
failed failed n/a mpas_ocean.PointInPolygon.time_whole_grid('120km')
failed failed n/a mpas_ocean.PointInPolygon.time_whole_grid('480km')
55.0±0.1ms 55.3±0.1ms 1.01 mpas_ocean.RemapDownsample.time_inverse_distance_weighted_remapping
44.7±0.2ms 44.9±0.1ms 1.00 mpas_ocean.RemapDownsample.time_nearest_neighbor_remapping
357±1ms 356±0.9ms 1.00 mpas_ocean.RemapUpsample.time_inverse_distance_weighted_remapping
260±0.7ms 261±0.6ms 1.00 mpas_ocean.RemapUpsample.time_nearest_neighbor_remapping
314M 312M 0.99 quad_hexagon.QuadHexagon.peakmem_open_dataset
312M 311M 1.00 quad_hexagon.QuadHexagon.peakmem_open_grid
6.31±0.06ms 6.28±0.02ms 0.99 quad_hexagon.QuadHexagon.time_open_dataset
5.36±0.02ms 5.38±0.2ms 1.00 quad_hexagon.QuadHexagon.time_open_grid

@aaronzedwick aaronzedwick added run-benchmark Run ASV benchmark workflow and removed run-benchmark Run ASV benchmark workflow labels Jan 14, 2025
Comment on lines 2443 to 2451
# Get the face's edges for the whole subset
face_edge_cartesian = _get_cartesian_face_edge_nodes(
subset.face_node_connectivity.values,
subset.n_face,
subset.n_max_face_nodes,
subset.node_x.values,
subset.node_y.values,
subset.node_z.values,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very expensive to re-compute each time we call this function. Consider adding an internal property or storing this in Grid._ds like we do with our other methods.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do that, but it is a new subset each time the operation is run. So won't it be recomputed anyway? It should only be a face or a few at max for each subset.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more efficient approach would be to store it in the original Grid, where it can be computed once, and then once we subset, it should return a Grid with the face_edge_nodes_xyx appropriately indexed.

Though, if it is only a few faces at most, the perfromance may be negligible, but we need to consider the cases where we may be running millions of point in polygon queries for higher-resolution grids.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, okay yeah that would work. I didn't realize it would keep it through the subset operation.

Copy link
Member Author

@aaronzedwick aaronzedwick Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subset actually keeps the entire array for the whole grid. I can still make it work, that way, but that is why the tests where failing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the subset actually keeps the entire array for the whole grid. I can still make it work, that way, but that is why the tests where failing.

Yeah, we should always ensure that the subset never contains the entire array for any variable.

uxarray/grid/geometry.py Outdated Show resolved Hide resolved
Comment on lines 188 to 192
class PointInPolygon(GridBenchmark):
def time_whole_grid(self, resolution):
point_xyz = np.array([self.uxgrid.face_x[0].values, self.uxgrid.face_y[0].values, self.uxgrid.face_z[0].values])

self.uxgrid.get_faces_containing_point(point_xyz=point_xyz)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the GH Actions report, it appears that a single Point in Face query is taking approximately 27.0±0.1ms

Ideally, we want to get this number significantly down.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did some digging:

Please include the following in a custom setup function:

_ = uxds.uxgrid.face_edge_connectivity

We don't want this to be included in the timing. I believe much of the 27ms is attributed to that.

This is for a 30km Grid

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, consider including a single sample point query in the setup, since there is also some overhead with the KD tree construction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the performance I get for a 30km grid after doing the things mentioned above. pretty good for 30km grids

image

Only issue, is that it doesn't find any faces.

Here is my sample code I used.

import uxarray as ux
import numpy as np

import cProfile

profiler = cProfile.Profile()

grid_path = "/Users/philipc/PycharmProjects/uxarray/unstructured-grid-viz-cookbook/meshfiles/x1.655362.grid.nc"
data_path = "/Users/philipc/PycharmProjects/uxarray/unstructured-grid-viz-cookbook/meshfiles/x1.655362.data.nc"

uxds = ux.open_dataset(grid_path, data_path)
uxds.uxgrid.normalize_cartesian_coordinates()

_ = uxds.uxgrid.face_edge_connectivity

point = np.array([0.0, 0.0, 1.0])
res = uxds.uxgrid.get_faces_containing_point(point)


profiler.enable()
res = uxds.uxgrid.get_faces_containing_point(point)
profiler.disable()

profiler.dump_stats('pface.prof')

print(res)

Can do snakeviz pface.prof to view the profiler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this for me! I am trying to implement these changes now. In terms of it not finding any faces, does the mesh have holes in it? There may not be a face at the pole possibly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The grid is a global MPAS atmosphere grid with no holes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may have been on an older version of the code. Just ran it again after pulling and it works. A single face is detected. Going to run a few more tests.

Copy link
Member Author

@aaronzedwick aaronzedwick Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it working locally for you? It does for me, but the CI is failing. Not sure why.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like it is working for the poles at least on my end. Haven't tested other points

@aaronzedwick aaronzedwick added run-benchmark Run ASV benchmark workflow and removed run-benchmark Run ASV benchmark workflow labels Jan 14, 2025
@aaronzedwick aaronzedwick added run-benchmark Run ASV benchmark workflow and removed run-benchmark Run ASV benchmark workflow labels Jan 15, 2025
@rajeeja
Copy link
Contributor

rajeeja commented Jan 15, 2025

Try to print array shape and check why is it assesing this?

works on my local also, it must be something with dependency or library versions.

@philipc2 philipc2 removed the run-benchmark Run ASV benchmark workflow label Jan 15, 2025
@aaronzedwick aaronzedwick added the run-benchmark Run ASV benchmark workflow label Jan 15, 2025
Copy link
Member

@philipc2 philipc2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the following to our API
Grid.max_face_radius
Grid.get_faces_containing_point

Lets not include Grid.face_edge_nodes_xyz in the API for now.

@@ -2415,3 +2443,26 @@ def get_faces_at_constant_longitude(self, lon: float):

faces = constant_lon_intersections_face_bounds(lon, self.face_bounds_lon.values)
return faces

def get_faces_containing_point(self, point_xyz):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should change point_xyz to point, which can be either cartesian or spherical. Many times the user will want to query a spherical point, which we can convert to x, y, z.

@aaronzedwick aaronzedwick marked this pull request as ready for review January 16, 2025 06:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-benchmark Run ASV benchmark workflow
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Point in Face
4 participants