I/O support for the ndx-pose NWB extension #166

edeno · 2024-04-19T15:13:21Z

Description

What is this PR

Bug fix
Addition of a new feature
Other

Why is this PR needed?
NWB is a standardized format for neuroscience data. As one of the goals of this package is to standardize output from pose estimation algorithms, it would be useful to output to NWB. In addition, importing from the ndx-pose extension would be useful as future algorithms could simply output to this format and then import into the common movement format.

What does this PR do?
Exports skeletons and PoseEstimation objects from the ndx-pose extension.

References

Fix #23

How has this PR been tested?

Added export example in examples.

Is this a breaking change?

No.

Does this PR require an update to the documentation?

Have not updated documentation yet.

Checklist:

The code has been tested locally
Tests have been added to cover all new functionality
The documentation has been updated to reflect any changes
The code has been formatted with pre-commit

To-Do

Import from nwb file to movement format
Add kinematics

Notes:

The ndx-pose package on pypi is not updated yet (@rly). The current code utilizes a pip install directly from https://github.com/rly/ndx-pose/ however the pre-commit hooks would not allow me to add a pip dependency for installing directly from the github repo.

The current export code does not include anything about the training of the models.

codecov · 2024-04-19T15:22:09Z

Codecov Report

Attention: Patch coverage is 0% with 57 lines in your changes missing coverage. Please review.

Project coverage is 91.08%. Comparing base (cf0a6b1) to head (739c4d8).

❗ Current head 739c4d8 differs from pull request most recent head 90c3287

Please upload reports for the commit 90c3287 to get more accurate results.

Files	Patch %	Lines
movement/io/nwb.py	0.00%	57 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #166      +/-   ##
==========================================
- Coverage   99.68%   91.08%   -8.60%     
==========================================
  Files          11       11              
  Lines         639      662      +23     
==========================================
- Hits          637      603      -34     
- Misses          2       59      +57

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

niksirbi · 2024-04-19T15:32:03Z

Thanks a lot for this @edeno 🤗 .
We've been wanting to integrate with .nwb for a while, this is definitely on our roadmap. We'll take a look at the PR next week and get back to you with comments.

niksirbi · 2024-04-22T13:24:16Z

Just a heads up that we've just merged a PR introducing some new linting rules for docstrings. So it's probably a good idea to update from the main branch (via rebase or a merge commit), to avoid any merge conflicts.

for more information, see https://pre-commit.ci

sonarcloud · 2024-04-22T13:29:47Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

niksirbi

Thanks a lot for this @edeno, I've had a look at the code and it looks nicely done.

I've left many comments at particularly places in the code, but most of them are small pedantic things (aimed at ensuring consistency in formatting and language throughout the project).

On a higher level, I have two substantial comments:

Tests needed

The main thing to be done is testing. The code appears to do what it claims to do, and some of that functionality is verifiable through the example you added. That said, we want to be absolutely sure that the conversions from NWB to movement and back are properly done. We could take a similar approach to how we test other file types.

Specifically, we could host a valid ndx-pose .nwb file as part of our sample data and use it for the tests. At minimum, what I'd like to see is the following: load the nwb file, convert it to a movement dataset, validate the movement dataset (we have some existing functionality for that), then convert it back to ndx-pose and check that we end up with the same thing we started with. You can check the existing IO integration tests for inspiration. We also have more fine-grained unit tests for loading and saving files.

If implementing the tests looks too daunting or you can't spare the time, I'm happy to help with writing this. It would be certainly helpful if you could point me to a valid ndx-pose file that we can publicly share with our sample data. Also, since I'm not very familiar with the NWB ecosystem, do you happen to know of an nwb/ndx-pose validator tool we can use you programmatically check that the movement-exported nwb files are valid?

User-facing interface for loading/saving NWB files

For context, what follows is my vision for what an ideal movement-nwb integration could look like. This doesn't need to be implemented in the current PR. I'm happy to merge this PR after the tests are added (and ndx-pose 0.2 is released on PyPI), and we can refactor the interface in a future PR.

One of our major aspirations is to present a unified interface for loading/saving files containing pose data, as agnostic to the file format as possible. Of course this is easier said than done. Currently, the interface for loading/saving files works like this:

from movement.io import load_poses, save_poses

ds = load_poses.from_dlc_file(file_path)  # DLC
ds = load_poses.from_sleap_file(file_path)  # SLEAP
ds = load_poses.from_lp_file(file_path) # LigthningPose

# the consistency allows as to wrap the above in a more generic function
ds = load_poses.from_file(file_path, source_software)

# the interface for saving is similar, e.g.
save_poses.to_dlc_file(ds, file_path)

I'm keen to maintain the consistency as much as possible/practical. Ideally this would mean exposing the following functions:

from movement.io import load_poses, save_poses

ds = load_poses.from_nwb_file(file_path)
save_poses.to_nwb_file(ds, file_path)

The actual under-the hood machinery can stay in the nwb.py module you've introduced, but what the user "sees" should be similar to the existing loading/saving functions.

Reading your implementation, I realised that this may not be that easy for ndx-pose .nwb files, because (as far as I can tell), there is strictly one subject per file. In that case, we have to do one of the following:

accept a list of more than one file paths in the load_poses.from_nwb_file function, this is sort of the route you've chosen.
still accept only 1 file, same as the other loading functions, and provide a separate function for merging individuals into a common dataset (after loading). This could be as easy as xr.merge, which we can showcase with an example.

The save_poses.to_nwb_file() is more straightforward to implement I think, because we can always choose to split a multi-subject dataset into multiple .nwb file and add the subject's ID to the file suffix. We do something similar for the save_poses.to_dlc_file() function when split_individuals=True.

Let me know your take on these ideas about the interface, I'm keen to have feedback on the API.

movement/io/nwb.py

pyproject.toml

movement/io/nwb.py

niksirbi · 2024-04-25T10:46:43Z

movement/io/nwb.py

+    )
+
+
+def convert_nwb_to_movement(nwb_filepaths: list[str]) -> xr.Dataset:


Similarly for this function, it's public, so a full docstring is needed.

There seems to be an assymetry between the two converter functions:
convert_nwb_to_movement reads nwb files from disk, while convert_movement_to_nwb writes to the NWB file handlers, without actually writing to disk.

To be consistent with our other loading/saving functions, I'd prefer to have functions to read from nwb file(s) on disk to movement ds, and functions for doing the exact opposite, i.e. write from movement ds to nwb file(s) on disk.

Also, in our other IO functions we accept both str and pathlib.Path as file paths.
In this particular fuction it would make sense to accept either a single path (Union[Path, str]) (which would work for single-subject datasets), or a list of such paths.

You could alro re-use our movement.io.validators.ValidFile validator for checking the paths, either directly, or via the _validate_file_path utility that can be found in movement.io.save_poses

I think my reasoning for not writing from movement dataset to NWB file on disk is because there are often times where you'd like to add other things to the NWB file before writing to disk.

examples/nwb_conversion.py

movement/io/nwb.py

niksirbi · 2024-05-07T20:12:48Z

Hey @edeno, a recently merged PR introduced some breaking changes that affect the sample data and tests. You'd have to update (again) from the main branch (via rebase or a merge commit), for the CI to work and to avoid any merge conflicts. Sorry about that.

Also, do you need any help with making progress on this PR? Don't hesitate to poke me if you do.

edeno · 2024-05-07T20:37:31Z

Hi @niksirbi, that's fine. Happy to update.

Also thank you for your review. I haven't had time to work on this recently but I was intending to pick it up again when I have some more free time.

niksirbi · 2024-05-08T07:58:31Z

Hi @niksirbi, that's fine. Happy to update.

Also thank you for your review. I haven't had time to work on this recently but I was intending to pick it up again when I have some more free time.

No worries, this is not urgent. It's more important to get it right than to do it fast. Let me know when you get back to it (or if at any point you realise you can't find the time).

niksirbi · 2024-05-21T09:33:41Z

I've written a "developer guide" for adding new Input/Output functionalities to movement, partly motivated by the ongoing work in this PR. I hope that it will help clarify the relevant code structure and make it easier to find things for contributors such as yourself @edeno. Have a read when you get back to working on this PR and let me know if anything is unclear.

for more information, see https://pre-commit.ci

Co-authored-by: Niko Sirmpilatze <[email protected]>

for more information, see https://pre-commit.ci

Co-authored-by: Niko Sirmpilatze <[email protected]>

sonarcloud · 2024-10-25T14:06:01Z

Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarCloud

Catch issues before they fail your Quality Gate with our IDE extension SonarLint

niksirbi · 2024-10-25T14:21:24Z

Hi @niksirbi, I am unfortunately somewhat limited in my time in the next month or so. Happy to have you take it over and I can fill in details where necessary.

That's completely fine, you've already done most of the work. I'll just have to go over it again and make sure it's consistent with the rest of our codebase. I'll ping you if I have concrete questions.

niksirbi · 2024-11-08T15:59:02Z

See also https://github.com/talmolab/sleap-io/pull/143/files (linking for my own reference).

luiztauffer · 2024-11-12T07:57:29Z

hey guys, this looks great and very useful! Any chance this could be merged soon? I'm interested in using movement for a couple of projects, and NWB integration is important.

I'd like to give my two cents on two points:

Many times the name attribute of an nwb object has a value different from the default, that is, not every ndx_pose.PoseEstimation will have the name PoseEstimation, and in those cases the line 276 will give an error. An easy way around that would be an optional argument:

def convert_nwb_to_movement(
    nwb_filepaths: str | list[str] | list[Path],
    key_name: str = "PoseEstimation",
) -> xr.Dataset:
   ...
            pose_estimation = nwbfile.processing["behavior"][key_name]

I agree with @edeno point that many times you want to add other things to an nwb file object before writing it to disk, so it is a common practice to separate functions that add to the object from functions that write to disk. You could have both, I suppose

niksirbi · 2024-11-12T09:34:52Z

Thank a lot for your input @luiztauffer.

This PR is on my priority list, and I expect to complete it within the next ~2 weeks.

Regarding the two points you made, I agree on both counts, and I will make sure to also include functions to read/write NWB file objects (not just from/to disk).

I'd value your feedback on this feature, when its ready. Don't hesitate to mention any other requirements that come to mind and to file bug reports if/when necessary.

luiztauffer · 2024-11-12T09:39:35Z

great, @niksirbi! Let me know if I could help in any way

edeno added 7 commits April 18, 2024 09:54

Create nwb_export.py

bd12d04

NWB requires one file per individual

c5319b9

Add script

d82fe30

Remove import error handling

d889105

Add nwb optional dependencies

72aea47

Fix linting based on pre-commit hooks

12bf83f

Add example docstring

742bf86

edeno added 2 commits April 19, 2024 13:31

Rename to fit module naming pattern

a06d485

Add import from nwb

739c4d8

edeno and others added 2 commits April 22, 2024 06:28

Merge branch 'main' into main

aef9b0c

[pre-commit.ci] auto fixes from pre-commit.com hooks

ce28f90

for more information, see https://pre-commit.ci

niksirbi reviewed Apr 25, 2024

View reviewed changes

edeno and others added 10 commits June 8, 2024 09:01

Merge remote-tracking branch 'upstream/main'

b9599a9

[pre-commit.ci] auto fixes from pre-commit.com hooks

2491cf6

for more information, see https://pre-commit.ci

Apply suggestions from code review

58bef93

Co-authored-by: Niko Sirmpilatze <[email protected]>

Update make pynwb and ndx-pose core dependencies

3ab9aa4

Cleanup of docstrings and variable names from code review

910ce90

Rename function for clarity

3f9a53b

Update with example converting back to movement

3cd991d

Add file validation and handling for single path

3aa1b11

Add preliminary tests

e56cf6d

Convert to numpy array

99a90c1

edeno and others added 21 commits October 25, 2024 07:04

NWB requires one file per individual

3ccd71c

Add script

f906cd5

Remove import error handling

d35d9c2

Add nwb optional dependencies

e5726d4

Fix linting based on pre-commit hooks

53f505b

Add example docstring

f1d480d

Rename to fit module naming pattern

4b162cf

Add import from nwb

4b887be

[pre-commit.ci] auto fixes from pre-commit.com hooks

96ee7ba

for more information, see https://pre-commit.ci

[pre-commit.ci] auto fixes from pre-commit.com hooks

2f2625d

for more information, see https://pre-commit.ci

Apply suggestions from code review

1c7c2e3

Co-authored-by: Niko Sirmpilatze <[email protected]>

Update make pynwb and ndx-pose core dependencies

4191ae8

Cleanup of docstrings and variable names from code review

4202ff6

Rename function for clarity

3188b0b

Update with example converting back to movement

4908040

Add file validation and handling for single path

da43e87

Add preliminary tests

9d34939

Convert to numpy array

56a6672

Handle lack of confidence

b37b2c6

Display xarray

f7d48ce

Refactor tests

0606add

edeno force-pushed the main branch from 90c3287 to 0606add Compare October 25, 2024 14:04

Merge branch 'main' of https://github.com/edeno/movement

dbf804a

niksirbi mentioned this pull request Nov 6, 2024

Pick a movement-native file format #341

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

I/O support for the ndx-pose NWB extension #166

I/O support for the ndx-pose NWB extension #166

edeno commented Apr 19, 2024 •

edited

Loading

codecov bot commented Apr 19, 2024 •

edited

Loading

niksirbi commented Apr 19, 2024

niksirbi commented Apr 22, 2024

sonarcloud bot commented Apr 22, 2024

niksirbi left a comment •

edited

Loading

niksirbi Apr 25, 2024

niksirbi Apr 25, 2024

niksirbi Apr 25, 2024

edeno Jun 8, 2024

niksirbi commented May 7, 2024

edeno commented May 7, 2024

niksirbi commented May 8, 2024

niksirbi commented May 21, 2024

sonarcloud bot commented Oct 25, 2024

niksirbi commented Oct 25, 2024

niksirbi commented Nov 8, 2024

luiztauffer commented Nov 12, 2024

niksirbi commented Nov 12, 2024

luiztauffer commented Nov 12, 2024

		)


		def convert_nwb_to_movement(nwb_filepaths: list[str]) -> xr.Dataset:

I/O support for the ndx-pose NWB extension #166

Are you sure you want to change the base?

I/O support for the ndx-pose NWB extension #166

Conversation

edeno commented Apr 19, 2024 • edited Loading

Description

References

How has this PR been tested?

Is this a breaking change?

Does this PR require an update to the documentation?

Checklist:

To-Do

Notes:

codecov bot commented Apr 19, 2024 • edited Loading

Codecov Report

niksirbi commented Apr 19, 2024

niksirbi commented Apr 22, 2024

sonarcloud bot commented Apr 22, 2024

Quality Gate passed

niksirbi left a comment • edited Loading

Choose a reason for hiding this comment

Tests needed

User-facing interface for loading/saving NWB files

niksirbi Apr 25, 2024

Choose a reason for hiding this comment

niksirbi Apr 25, 2024

Choose a reason for hiding this comment

niksirbi Apr 25, 2024

Choose a reason for hiding this comment

edeno Jun 8, 2024

Choose a reason for hiding this comment

niksirbi commented May 7, 2024

edeno commented May 7, 2024

niksirbi commented May 8, 2024

niksirbi commented May 21, 2024

sonarcloud bot commented Oct 25, 2024

Quality Gate failed

niksirbi commented Oct 25, 2024

niksirbi commented Nov 8, 2024

luiztauffer commented Nov 12, 2024

niksirbi commented Nov 12, 2024

luiztauffer commented Nov 12, 2024

edeno commented Apr 19, 2024 •

edited

Loading

codecov bot commented Apr 19, 2024 •

edited

Loading

niksirbi left a comment •

edited

Loading