Verify thread-safety of CRS caching approach #5

Open
Kirill888 opened this issue Jan 18, 2022 · 2 comments

@Kirill888
Member

We are caching pyproj.CRS objects here:

odc-geo/odc/geo/_crs.py

Lines 31 to 32 in 202cd8a

@cachetools.cached(_crs_cache, key=_make_crs_key)
def _make_crs(crs: Union[str, _CRS]) -> Tuple[_CRS, str, Optional[int]]:

And pyproj transformers here:

odc-geo/odc/geo/_crs.py

Lines 47 to 48 in 202cd8a

@cachetools.cached({}, key=_make_crs_transform_key)
def _make_crs_transform(from_crs, to_crs, always_xy):

  • What does that mean for multi-threaded access?
  • Should we use a lock when populating the cache?
  • Should we use a thread-local cache instead of locking?
  • What are the constraints on multi-threaded access in pyproj/PROJ itself?
    • We currently assume that sharing CRS objects across threads is fine; is that correct?
  • We should also add purging rules to those caches, or at least expose a manual purge option.
  • We should understand the cost of caching in terms of RAM, especially for the transformer cache.
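
To make the lock and purge questions concrete, here is a minimal stdlib-only sketch of a bounded cache that takes a lock around lookups and inserts and exposes a manual purge. `LockedLRU`, `make`, and `maxsize` are hypothetical names for illustration, not anything in odc-geo. `cachetools.cached` also accepts a `lock=` argument, which may well be the simpler route for the existing decorators.

```python
import threading
from collections import OrderedDict
from typing import Any, Callable, Hashable


class LockedLRU:
    """Bounded cache guarded by a lock, with a manual purge (illustrative only)."""

    def __init__(self, make: Callable[[Hashable], Any], maxsize: int = 64) -> None:
        self._make = make          # builds the value for a key on a cache miss
        self._maxsize = maxsize    # purging rule: keep at most this many entries
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key: Hashable) -> Any:
        with self._lock:
            if key in self._items:
                self._items.move_to_end(key)  # mark as most recently used
                return self._items[key]
        # Build outside the lock so a slow construction does not block other keys.
        value = self._make(key)
        with self._lock:
            # Another thread may have stored the key meanwhile; keep that value.
            value = self._items.setdefault(key, value)
            self._items.move_to_end(key)
            while len(self._items) > self._maxsize:
                self._items.popitem(last=False)  # evict least recently used
            return value

    def purge(self) -> None:
        """Manual purge option: drop every cached entry."""
        with self._lock:
            self._items.clear()

    def __len__(self) -> int:
        with self._lock:
            return len(self._items)
```

With this shape, two threads that miss on the same key can both run `make`, but only one result is stored and returned to both, so callers always share a single object per key.
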
@Kirill888
Member Author

Relevant comment from the datacube repo:
opendatacube/datacube-core#1230 (comment)

@Kirill888
Member Author

Looks like, starting from pyproj 3.1, CRS and transformer objects are made thread-safe internally by pyproj, which delegates to lazily constructed thread-local C++ objects.
pyproj4/pyproj#793
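
For readers unfamiliar with the pattern, here is a rough pure-Python sketch of "delegate to a lazily constructed thread-local object". It only illustrates the idea and is not how pyproj implements it; `ThreadLocalProxy` and `factory` are hypothetical names.

```python
import threading
from typing import Any, Callable


class ThreadLocalProxy:
    """Shareable handle that builds one private object per thread on first use."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory          # cheap to share: just the recipe
        self._local = threading.local()  # per-thread storage

    def get(self) -> Any:
        try:
            return self._local.obj
        except AttributeError:
            # First use in this thread: construct lazily and remember it.
            self._local.obj = self._factory()
            return self._local.obj
```

If the handle behaves like this, caching it in a shared dict is safe with respect to the underlying object, since each thread only ever touches its own instance. The cache dict itself can still be mutated concurrently, so the locking question above remains a separate one.
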
