-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dolfyn cleaning function updates #354
base: develop
Are you sure you want to change the base?
Conversation
@jmcvey3 is this ready for review? |
It is. I've also done some serious reorganizing of the I/O files in an attempt to make them easier to follow, and have added a few random fixes for random hiccups. The fix for 535 is in the first couple commits. |
Thanks James. I plan to tackle this Monday. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmcvey3 thanks for fixing the bug and improving the modularity of the dolfyn module.
I was adding docstrings to what I thought were new functions but realized they were primarily moved functions in our meeting yesterday. As such I will finish improving docstrings when we get to linting this module in the future.
That said could you review the proposed docstrings I already did?
mhkit/dolfyn/adp/clean.py
Outdated
""" | ||
Fetches or calculates range offset from dataset attributes. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""
Determines the range offset from the dataset attributes. The function
first checks if specific attributes are present in the dataset (`ds`)
and calculates the range offset accordingly. If the attribute `h_deploy`
is found, it is renamed to `range_offset` with a deprecation warning.
Parameters
----------
ds : xarray.Dataset
Dataset containing attributes used to calculate the range offset.
Returns
-------
float
The calculated or retrieved range offset. Returns 0 if no
relevant attributes are present.
Raises
------
DeprecationWarning
Warns that the attribute `h_deploy` is deprecated and has been
renamed to `range_offset`.
"""
mhkit/dolfyn/adp/clean.py
Outdated
|
||
|
||
def set_range_offset(ds, h_deploy): | ||
def __check_for_range_offset(ds): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we haven't done typehints here yet but we will eventually. Wouldn't hurt to start with new functions now.
def __check_for_range_offset(ds: "xarray.Dataset") -> float:
mhkit/dolfyn/adp/clean.py
Outdated
def find_surface(*args, **kwargs): | ||
warnings.warn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets include a docstring to catch the deprecation in our documentation/ display in IDEs.
"""
Deprecated function. Use `water_depth_from_amplitude` instead.
"""
mhkit/dolfyn/adp/clean.py
Outdated
def find_surface_from_P(*args, **kwargs): | ||
warnings.warn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Include docstring deprecation for documentation/ IDEs
"""
Deprecated function. Use `water_depth_from_pressure` instead.
"""
mhkit/dolfyn/adp/clean.py
Outdated
def nan_beyond_surface(*args, **kwargs): | ||
warnings.warn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Include deprecation docstring
Deprecated function. Use `remove_surface_interference` instead.
mhkit/dolfyn/io/rdi.py
Outdated
@@ -1352,14 +732,16 @@ def check_offset(self, offset, readbytes): | |||
fd.seek(4 + self._fixoffset, 1) | |||
|
|||
def remove_end(self, iens): | |||
"""Remove uncompleted measurements""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""
Removes incomplete measurements from the dataset.
This method cleans up any partially read data by truncating measurements
to the specified ensemble index (`iens`). This is typically called upon
reaching the end of the file to ensure only complete data is retained.
Parameters
----------
iens : int
The index up to which data is considered complete and should be retained.
"""
mhkit/dolfyn/io/rdi.py
Outdated
|
||
def save_profiles(self, dat, nm, en, iens): | ||
ds = defs._get(dat, nm) | ||
"""Reformat and save profile measurements""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""
Reformats and saves profile measurements to the dataset.
This method processes and saves profile data from individual measurements,
adapting to changing cell counts and cell sizes as needed. It appends
the processed profile data to the specified location in the dataset.
Parameters
----------
dat : dict
The dataset dictionary where profile measurements are stored.
nm : str
The name of the measurement variable being saved.
en : dict
A dictionary containing ensemble data with measurement arrays for each variable.
iens : int
The index of the current ensemble, indicating where to save data in the profile.
Returns
-------
dict
The updated dataset dictionary with the reformatted and saved profile data.
"""
mhkit/dolfyn/io/rdi.py
Outdated
"""Clean up recorded data, particularly variablem cell sizes | ||
and profile ranges.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""
Cleans up recorded data by adjusting variable cell sizes and profile ranges.
This method handles adjustments when cell sizes change during data collection,
performing depth-bin averaging for smaller cells if needed. It also updates
the configuration data, range coordinates, and manages any surface layer profiles.
Parameters
----------
dat : dict
The dataset dictionary containing data variables and coordinates to be cleaned up.
cfg : dict
Configuration dictionary, which is updated with cell size, range, and additional
attributes after cleanup.
Returns
-------
tuple
- dict : The updated dataset dictionary with cleaned data.
- dict : The updated configuration dictionary with new attributes.
"""
mhkit/dolfyn/io/rdi.py
Outdated
""" | ||
Reshape the array `arr` to shape (...,n,n_bin). | ||
""" | ||
"""Reshape the array `arr` to shape (...,n,n_bin).""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""
Reshapes the input array `arr` to a shape of (..., n, n_bin).
This method reshapes the last dimension of `arr` into two dimensions, where
the last two dimensions become `(n, n_bin)`.
Parameters
----------
arr : np.ndarray
The array to reshape. The last dimension of `arr` will be divided into
`(n, n_bin)` based on the value of `n_bin`.
n_bin : int or float, optional
The bin size for reshaping. If `n_bin` is an integer, it divides the
last dimension directly. If not, indices are adjusted, and excess
bins may be removed. Default is None.
Returns
-------
np.ndarray
The reshaped array with shape (..., n, n_bin).
"""
mhkit/dolfyn/io/rdi.py
Outdated
""" | ||
Remove the attributes from the data that were never loaded. | ||
""" | ||
"""Remove the attributes from the data that were never loaded.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""
This method cleans up the dataset by removing any attributes that were
defined but not loaded, updates configuration attributes, and sets the
sampling frequency (fs
) based on the data source program. Additionally,
it adjusts the axes of certain variables as defined in data_defs
.
Parameters
----------
dat : dict
The dataset dictionary to be finalized. This dictionary is modified
in place by removing unused attributes, setting configuration values
as attributes, and calculating `fs`.
Returns
-------
dict
The finalized dataset dictionary with cleaned attributes and added metadata.
"""
Yep, and sounds like a plan |
Renames and makes the functions that calculate water depth and remove surface interference more robust. The functions were designed around Nortek ADCPs but didn't take into account the calculations RDI already does. Errors or warnings will guide the user through the proper use of functions if previous calculations were already completed or data is missing.
Also removes the "ignore rankwarning" as some users may not want to do so, and updates future package deprecations.
Solves the problems Calum and I found in Issue #353. Tests were updated as appropriate.
Also added some fixes for determining beam angle and carrier frequency attributes for Sentinel V ADCPs.