Add xarray converter #128

Draft: wants to merge 2 commits into master

@simmsa simmsa commented May 1, 2024

Create a standardized method for converting xarray objects into a MATLAB structure and back into xarray objects.

@simmsa simmsa self-assigned this May 1, 2024

simmsa commented May 1, 2024

@hivanov-nrel, we should consider this a starting point for xarray conversion. The overall goal is to provide a way to use xarray objects created by MHKiT in MHKiT-MATLAB.

Overall there seem to be a few approaches:

  1. Completely convert the xarray object into a MATLAB type
  2. Partially convert the xarray object into a MATLAB type
  3. Don't convert the xarray object and write MATLAB functions to access data

I started with approach 1: convert the xarray object to a dictionary, encode that as JSON, then decode the JSON into a struct in MATLAB. This approach works, but it is suboptimal for a few reasons:

  • Type Conversion
    • Numerical types and strings are easily converted
    • Datetimes and other specialized types must be converted
      • For a complete round trip, these types must be converted to strings, then to a MATLAB type, and finally back to the specialized Python type
  • Performance
    • This wasn't tested specifically, but serializing a dataset converts it to character data and then back again. For small datasets the conversion time is probably reasonable, but the cost grows with dataset size.
  • Features
    • This completely removes any benefits of using xarray
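
As a rough illustration of the approach 1 round trip, here is a minimal Python-side sketch. The dataset, its variable names, and the use of `default=str` to handle datetimes are assumptions made for the example, not the code in this PR.

```python
import json

import numpy as np
import xarray as xr

# Small illustrative dataset; the variable and coordinate names are made up.
times = np.array(
    ["2024-05-01T00:00:00", "2024-05-01T00:00:01", "2024-05-01T00:00:02"],
    dtype="datetime64[ns]",
)
ds = xr.Dataset(
    data_vars={"velocity": (("time",), np.array([0.5, 1.25, 2.0]))},
    coords={"time": times},
    attrs={"units": "m/s"},
)

# to_dict() returns nested plain Python containers. Datetime coordinates are
# not JSON-encodable as-is, so they need a fallback encoder.
as_dict = ds.to_dict()

# default=str stringifies any value the json module cannot encode itself.
encoded = json.dumps(as_dict, default=str)

# This is the text MATLAB would decode into a struct; decoding it in Python
# shows what survives the trip.
decoded = json.loads(encoded)
time_values = decoded["coords"]["time"]["data"]
velocity_values = decoded["data_vars"]["velocity"]["data"]
```

The numeric values and attributes survive unchanged, but the time coordinate no longer carries a datetime type after decoding and would need an explicit conversion on each side.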

To implement approach 2, xarray has the to_numpy and to_dict methods that we could leverage. An idea for a MATLAB struct would be:

  • data: we can get data but what dimension do we use?
  • attrs: to_dict attrs
  • coords: to_dict coords
  • dims: to_dict dims
  • data_vars: to_dict data_vars
  • xr: python xarray object
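
A minimal sketch of how the Python side might assemble those fields is below. The helper name `to_matlab_ready` is hypothetical, and returning every data variable as a full NumPy array in `data` is an assumption, since the dimension question above is still open.

```python
import numpy as np
import xarray as xr


def to_matlab_ready(ds):
    """Hypothetical helper: split a Dataset into plain containers plus the original object."""
    # data=False keeps only metadata (dims, attrs, dtype, shape), not the values.
    meta = ds.to_dict(data=False)
    return {
        # Assumption: expose each data variable as a full NumPy array.
        "data": {str(name): var.to_numpy() for name, var in ds.data_vars.items()},
        "attrs": meta["attrs"],
        "coords": meta["coords"],
        "dims": meta["dims"],
        "data_vars": meta["data_vars"],
        # Keep the original object so xarray features remain available.
        "xr": ds,
    }


# Illustrative dataset; names and values are made up.
example = xr.Dataset(
    {"velocity": (("time", "depth"), np.arange(6.0).reshape(3, 2))},
    coords={"time": [0, 1, 2], "depth": [1.0, 2.0]},
    attrs={"units": "m/s"},
)
converted = to_matlab_ready(example)
```

Keeping the `xr` entry means nothing is lost in the conversion: the plain fields are a convenience view, and the original object can be handed back to Python untouched.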

The trick with approach 2 is that in some cases we have to alter the data to display it to the MATLAB user. For one-dimensional data this is easy, but for multidimensional data we would require user input. This might be confusing for the end user, but it forces them to better understand the underlying data.
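
One way the user-input step could look is sketched below. The helper `values_for_display` and its `selection` argument are hypothetical, and raising an error when too many dimensions remain is just one possible design.

```python
import numpy as np
import xarray as xr


def values_for_display(da, selection=None):
    """Hypothetical helper: return at most 1-D values, given indices for the other dimensions."""
    # selection maps dimension names to integer positions, e.g. {"depth": 0}.
    reduced = da.isel(selection or {})
    if reduced.ndim > 1:
        raise ValueError(
            f"Remaining dimensions {reduced.dims}: select an index for all but one"
        )
    return reduced.to_numpy()


# Illustrative 2-D array; names and values are made up.
da = xr.DataArray(
    np.arange(6.0).reshape(3, 2),
    dims=("time", "depth"),
    coords={"time": [0, 1, 2], "depth": [1.0, 2.0]},
)
surface = values_for_display(da, {"depth": 0})
```

Calling the helper without a selection on this array raises an error that names the remaining dimensions, which is where the user would be asked to choose.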

Approach 3 could work as well, but may be confusing to MATLAB users. We would probably have to wrap a subset of xarray functionality inside of MATLAB functions. This could expose all of the power of xarray with minimal type conversion.

Approach 2 is presenting itself as a good compromise and probably the preferred path forward, but I'd like to get your feedback.

simmsa commented May 7, 2024

@jmcvey3 directed us to these two functions in MHKiT-Python that perform most of the conversion that we need (Thank you!):

save_mat: https://github.com/MHKiT-Software/MHKiT-Python/blob/2ccc286a65685e5e2f5ab68a467209917f0161d9/mhkit/dolfyn/io/api.py#L240

load_mat: https://github.com/MHKiT-Software/MHKiT-Python/blob/2ccc286a65685e5e2f5ab68a467209917f0161d9/mhkit/dolfyn/io/api.py#L309

We may need to make these functions more generic and be careful about timestamp conversion. In MHKiT-Python we could break out the xarray -> dict and dict -> xarray conversions. We could then pass a dictionary into MATLAB, where it can be converted to a struct, and pass a struct back for conversion to xarray.

We should probably test xarray here and include the final functions in mhkit_python_utils. If this modified solution meets our needs we can add it to MHKiT-Python.
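
A minimal sketch of that split is below. It is not the save_mat/load_mat code. The helper names `dataset_to_dict` and `dict_to_dataset`, the `is_time` flag, and the choice of epoch seconds for timestamps are all assumptions made for illustration.

```python
import numpy as np
import xarray as xr


def dataset_to_dict(ds):
    """Hypothetical helper: flatten a Dataset into plain arrays and metadata."""

    def encode(var):
        values = var.to_numpy()
        is_time = bool(np.issubdtype(values.dtype, np.datetime64))
        if is_time:
            # Assumption: carry timestamps as float64 epoch seconds.
            # Precision finer than roughly a microsecond may be lost.
            values = values.astype("datetime64[ns]").astype("int64") / 1e9
        return {
            "dims": list(var.dims),
            "data": values,
            "is_time": is_time,
            "attrs": dict(var.attrs),
        }

    return {
        "coords": {str(k): encode(v) for k, v in ds.coords.items()},
        "data_vars": {str(k): encode(v) for k, v in ds.data_vars.items()},
        "attrs": dict(ds.attrs),
    }


def dict_to_dataset(d):
    """Hypothetical inverse of dataset_to_dict."""

    def decode(item):
        values = np.asarray(item["data"])
        if item["is_time"]:
            # Back to nanosecond-resolution datetimes.
            nanos = np.round(values * 1e9).astype("int64")
            values = nanos.astype("datetime64[ns]")
        return (tuple(item["dims"]), values, item["attrs"])

    return xr.Dataset(
        data_vars={k: decode(v) for k, v in d["data_vars"].items()},
        coords={k: decode(v) for k, v in d["coords"].items()},
        attrs=d["attrs"],
    )


# Illustrative round trip; names and values are made up.
original = xr.Dataset(
    {"velocity": (("time",), np.array([0.5, 1.25, 2.0]))},
    coords={
        "time": np.array(
            ["2024-05-01T00:00:00", "2024-05-01T00:00:01", "2024-05-01T00:00:02"],
            dtype="datetime64[ns]",
        )
    },
    attrs={"units": "m/s"},
)
as_plain = dataset_to_dict(original)
restored = dict_to_dataset(as_plain)
```

On the MATLAB side, epoch seconds can be turned into datetimes with `datetime(x, 'ConvertFrom', 'posixtime')`. Whether epoch seconds or another representation is the right interchange format is a decision still to be made.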
