Possible memory leak in raster_band_percentile #316

Open
phargogh opened this issue Apr 6, 2023 · 0 comments
Labels
question Further information is requested

phargogh commented Apr 6, 2023

While working on NCI and addressing some memory issues, I noticed a consistent, gradual increase in memory usage after adding some calls to raster_band_percentile. As the logging below shows, successive calls to raster_band_percentile increase memory usage, even after setting the GDAL max cache. I would have expected memory usage to stay roughly constant when re-running the same percentile function on the same raster.

The percent-memory logging (from psutil) is:

INFO:__main__:%RAM: 0.5465269088745117
INFO:__main__:%RAM: 4.042267799377441
INFO:__main__:%RAM: 4.501819610595703
INFO:__main__:%RAM: 5.270934104919434
Full log output:

INFO:__main__:%RAM: 0.5465269088745117
WARNING:pygeoprocessing.geoprocessing_core:couldn't make working_sort_directory: [Errno 17] File exists: '/Users/jdouglass/workspace/phargogh/pygeoprocessing/percentile-gw8bdf67'
DEBUG:pygeoprocessing.geoprocessing_core:sorting data to heap
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 12.3% complete, 796262400 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 24.5% complete, 1592524800 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 28.6% complete, 1857945600 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 32.7% complete, 2123366400 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 36.8% complete, 2388787200 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 45.0% complete, 2919628800 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 53.2% complete, 3450470400 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 61.4% complete, 3981312000 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 69.6% complete, 4512153600 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 77.7% complete, 5042995200 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 85.9% complete, 5573836800 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 94.1% complete, 6104678400 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 1.28% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 4.93% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 8.56% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 12.18% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 15.76% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 19.35% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 22.95% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 26.53% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 30.13% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 33.71% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 37.31% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 40.90% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 44.50% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 48.08% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 51.66% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 55.26% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 58.88% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 62.47% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 66.08% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 69.68% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 73.27% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 76.87% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 80.48% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 84.06% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 87.66% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 91.20% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 94.78% complete
DEBUG:pygeoprocessing.geoprocessing_core:here is percentile_list: [6.0, 172.51925659179688]
INFO:__main__:%RAM: 4.042267799377441
WARNING:pygeoprocessing.geoprocessing_core:couldn't make working_sort_directory: [Errno 17] File exists: '/Users/jdouglass/workspace/phargogh/pygeoprocessing/percentile-gw8bdf67'
DEBUG:pygeoprocessing.geoprocessing_core:sorting data to heap
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 12.3% complete, 796262400 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 24.5% complete, 1592524800 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 28.6% complete, 1857945600 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 32.7% complete, 2123366400 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 36.8% complete, 2388787200 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 45.0% complete, 2919628800 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 53.2% complete, 3450470400 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 61.4% complete, 3981312000 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 69.6% complete, 4512153600 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 77.7% complete, 5042995200 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 85.9% complete, 5573836800 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 94.1% complete, 6104678400 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 1.33% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 4.97% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 8.57% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 12.17% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 15.79% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 19.35% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 22.83% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 26.28% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 29.68% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 33.08% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 36.51% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 39.93% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 43.35% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 46.80% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 50.29% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 53.80% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 57.30% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 60.79% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 64.28% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 67.81% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 71.29% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 74.78% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 78.25% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 81.72% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 85.25% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 88.85% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 92.46% complete
DEBUG:pygeoprocessing.geoprocessing_core:here is percentile_list: [6.0, 172.51925659179688]
INFO:__main__:%RAM: 4.501819610595703
WARNING:pygeoprocessing.geoprocessing_core:couldn't make working_sort_directory: [Errno 17] File exists: '/Users/jdouglass/workspace/phargogh/pygeoprocessing/percentile-gw8bdf67'
DEBUG:pygeoprocessing.geoprocessing_core:sorting data to heap
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 12.3% complete, 796262400 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 24.5% complete, 1592524800 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 28.6% complete, 1857945600 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 32.7% complete, 2123366400 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 36.8% complete, 2388787200 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 40.9% complete, 2654208000 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 49.1% complete, 3185049600 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 57.3% complete, 3715891200 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 61.4% complete, 3981312000 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 69.6% complete, 4512153600 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 77.7% complete, 5042995200 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 85.9% complete, 5573836800 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:data sort to heap 94.1% complete, 6104678400 out of 6486868800
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 0.90% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 4.50% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 8.09% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 11.38% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 14.76% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 18.15% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 21.61% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 25.04% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 28.61% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 32.19% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 35.76% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 39.27% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 42.78% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 46.34% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 49.91% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 53.49% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 57.07% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 60.65% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 64.22% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 67.80% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 71.37% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 74.96% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 78.54% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 82.12% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 85.67% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 89.19% complete
DEBUG:pygeoprocessing.geoprocessing_core:calculating percentiles 92.70% complete
DEBUG:pygeoprocessing.geoprocessing_core:here is percentile_list: [6.0, 172.51925659179688]
INFO:__main__:%RAM: 5.270934104919434

Here's a script to reproduce. The GeoTIFF is a large (129600x50053) float32 raster, but any float32 raster should reproduce this.

import logging
import os
import tempfile

import psutil
import pygeoprocessing
from osgeo import gdal

LOGGER = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG)

raster_path = '/Users/jdouglass/globus-sync/intensified_rainfed_n_app_md5_48687737e6fdf931ddb163c6c9694e44.tif'
working_dir = tempfile.mkdtemp(dir=os.getcwd(), prefix='percentile-')

gdal.SetCacheMax(2**30)  # Make sure we know the cache limit.

LOGGER.info(f'%RAM: {psutil.Process(os.getpid()).memory_percent()}')
percentiles = pygeoprocessing.raster_band_percentile(
    (raster_path, 1), working_dir, [5, 95])
LOGGER.info(f'%RAM: {psutil.Process(os.getpid()).memory_percent()}')

percentiles = pygeoprocessing.raster_band_percentile(
    (raster_path, 1), working_dir, [5, 95])
LOGGER.info(f'%RAM: {psutil.Process(os.getpid()).memory_percent()}')

percentiles = pygeoprocessing.raster_band_percentile(
    (raster_path, 1), working_dir, [5, 95])
LOGGER.info(f'%RAM: {psutil.Process(os.getpid()).memory_percent()}')
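
The three copy-pasted calls above could be folded into a small helper that also runs `gc.collect()` before each reading, so that uncollected Python garbage is not mistaken for a leak. This is only a sketch: `psutil_rss_percent`, `memory_after_each_call`, and `growth_per_call` are hypothetical names introduced here, not part of pygeoprocessing or psutil, and the probe is injectable so the helper can be exercised without a large raster.

```python
import gc
import os


def psutil_rss_percent():
    """Percent of system RAM held by this process (the metric logged above)."""
    # psutil is third-party; it is imported lazily so the helpers below
    # can be used with a different probe when psutil is not installed.
    import psutil
    return psutil.Process(os.getpid()).memory_percent()


def memory_after_each_call(func, repeats=3, probe=psutil_rss_percent):
    """Call func() `repeats` times and record memory around the calls.

    Returns a list of repeats + 1 readings: one taken before the first
    call, then one after each call. gc.collect() runs before every
    reading so collectable Python objects do not inflate the numbers.
    """
    gc.collect()
    readings = [probe()]
    for _ in range(repeats):
        func()
        gc.collect()
        readings.append(probe())
    return readings


def growth_per_call(readings):
    """Difference between each reading and the one before it."""
    return [after - before for before, after in zip(readings, readings[1:])]
```

To apply it to this report, `func` would be something like `lambda: pygeoprocessing.raster_band_percentile((raster_path, 1), working_dir, [5, 95])`. If the per-call growth stays positive after the first call even with the forced collection, that would suggest the memory is being held outside of Python's garbage-collected objects, though it would not by itself say where.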
phargogh added the "bug" (Something isn't working) label on Apr 6, 2023
phargogh added the "question" (Further information is requested) label and removed the "bug" label on Aug 1, 2023