Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: tooling for reading and reprojecting raster files #123

Open
robbibt opened this issue Jan 30, 2024 · 0 comments
Open

Feature request: tooling for reading and reprojecting raster files #123

robbibt opened this issue Jan 30, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@robbibt
Copy link
Contributor

robbibt commented Jan 30, 2024

A common use case is wanting to load a raster file from disk or S3, and reproject it into a pixel grid/GeoBox that precisely matches another dataset (e.g. to perform a validation or combine multiple datasets). Currently, this can be achieved in at least two ways:

  • Datacube's from datacube.testutils.io import rio_slurp_xarray function
  • Indirectly via a combination of a combination of rioxarray and odc-geo

It would be extremely useful to have this functionality built into odc-geo, as it would greatly simplify the process of analysing datasets from different sources (e.g. datacube, odc-stac and local/cloud-hosted raster files).

Example code demonstrating this use case:

import rioxarray
import xarray as xr
import numpy as np
import odc.geo.xr


def load_reproject(
    path,
    how,
    resolution="auto",
    tight=False,
    resampling="nearest",
    chunks={"x": 2048, "y": 2048},
    bands=None,
    masked=True,
    reproject_kwds=None,
    **kwargs
):
    """
    Load and reproject part of a raster dataset into a given GeoBox or
    custom CRS/resolution.

    Parameters
    ----------
    path : str
        Path to the raster dataset to be loaded and reprojected.
    how : GeoBox, str or int
        How to reproject the raster. Can be a GeoBox or a CRS (e.g.
        "ESPG:XXXX" string or integer).
    resolution : str or int, optional
        The resolution to reproject the raster dataset into if `how` is
        a CRS, by default "auto". Supports:
            - "same" use exactly the same resolution as the input raster
            - "fit" use center pixel to determine required scale change
            - "auto" uses the same resolution on the output if CRS units
             are the same between source and destination; otherwise "fit"
            - Else, a specific resolution in the units of the output crs
    tight : bool, optional
         By default output pixel grid is adjusted to align pixel edges
         to X/Y axis, suppling tight=True produces an unaligned geobox.
    resampling : str, optional
        Resampling method to use when reprojecting data, by default
        "nearest", supports all standard GDAL options ("average", 
        "bilinear", "min", "max", "cubic" etc).
    chunks : dict, optional
        The size of the Dask chunks to load the data with, by default
        {"x": 2048, "y": 2048}.
    bands : str or list, optional
        Bands to optionally filter to when loading data.
    masked : bool, optional
        Whether to mask the data by its nodata value, by default True.
    reproject_kwds : dict, optional
        Additional keyword arguments to pass to the `.odc.reproject()`
        method, by default None.
    **kwargs : dict
        Additional keyword arguments to be passed to the
        `rioxarray.open_rasterio` function.

    Returns
    -------
    xarray.Dataset
        The reprojected raster dataset.
    """
    # Use empty kwds if not provided
    reproject_kwds = {} if reproject_kwds is None else reproject_kwds

    # Load data with rasterio
    da = rioxarray.open_rasterio(
        filename=path,
        masked=masked,
        chunks=chunks,
        **kwargs,
    )

    # Optionally filter to bands
    if bands is not None:
        da = da.sel(band=bands)

    # Reproject into GeoBox
    da = da.odc.reproject(
        how=how,
        resolution=resolution,
        tight=tight,
        resampling=resampling,
        dst_nodata=np.NaN if masked else None,
        **reproject_kwds,
    )

    return da
@robbibt robbibt added the enhancement New feature or request label Jan 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant