Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Softlinks does not enforce the correct type, nor is it required to add a linked type as a softlink #559

Open
2 tasks done
ehennestad opened this issue Feb 1, 2024 · 5 comments
Labels
category: enhancement improvements of code or code behavior status: todo something needs to be done topic: matnwb-api related to improving the matnwb api

Comments

@ehennestad
Copy link
Collaborator

What happened?

This issue related to the following points:

  1. I can add a linked type directly to an NWB type/class without using the types.untyped.Softlink.
    Example (adding a device):
% Create a device
device = types.core.Device();
nwb.general_devices.set('Device', device);

imaging_plane = types.core.ImagingPlane( ...
    'description', 'a very interesting part of the brain', ...
    'device', device, ...
  1. I can add another type to the device property using a SoftLink class.

Example:

% Create an optical channel.
optical_channel_green = types.core.OpticalChannel( ...
    'description', 'green', ...
    'emission_lambda', 500.);
    
 imaging_plane = types.core.ImagingPlane( ...
    'description', 'a very interesting part of the brain', ...
    'device', types.untyped.SoftLink(optical_channel_green), ...

In both cases the nwb export works fine.

This issue might be related to this one:
#48

The concept of h5 links / softlinks might be too specific for the average user to care about so maybe it would be better to handle this internally?

I.e in the set method for a linked type, if the value is not a types.untyped.SoftLink convert it to one? This would take away the burden from the user to check whether a type is linked or "embedded" (i.e an optical channel is "embedded", not linked as far as I can see). It would also ensure that it's harder to add types that are not correct, as in point 2.

Further, if the user provides an object of type types.untyped.SoftLink, the validator should check that the target of the types.untyped.SoftLink is of the correct type.

See here for three .m scripts to reproduce:
https://www.dropbox.com/scl/fo/bdvm4tgc79n2w4sn4f2ij/h?rlkey=wccijdyeaxm1b5gonzwmcu0w2&dl=0

Steps to Reproduce

See linked files above

Error Message

No errors

Operating System

macOS

Matlab Version

MATLAB Version: 23.2.0.2409890 (R2023b) Update 3
MATLAB License Number: 885740
Operating System: macOS Version: 13.5 Build: 22G74
Java Version: Java 1.8.0_392-b08 with Amazon.com Inc. OpenJDK 64-Bit Server VM mixed mode

Code of Conduct

@ehennestad
Copy link
Collaborator Author

Also, is there a reason a softlink is only being resolved when the NWB file is being exported, and not when the SoftLink is created in the first place?

@lawrence-mbf
Copy link
Collaborator

Also, is there a reason a softlink is only being resolved when the NWB file is being exported, and not when the SoftLink is created in the first place?

I actually don't know if softlink is ever technically validated. HDF5 certainly doesn't care if you write an invalid one.

Take this script for example:
fid = H5F.create('test.h5');

sl1 = types.untyped.SoftLink('/dne1');
sl1.export(fid, '/link1', {});
sl2 = types.untyped.SoftLink('/dne2'); 
sl2.export(fid, '/link2', {});
sl3 = types.untyped.SoftLink('/dne3'); 
sl3.export(fid, '/link3', {});

H5F.close(fid);

This creates a valid hdf5 file, you just get missed target paths in HDFView.

Screenshot 2024-02-01 100600

That said, you cannot actually validate if an object is "pointable" by HDF5 path until you actually write the object to the same file as the soft link which is probably why any kind of validation is done at export level. Certainly, though, someone could add some validation on the type of object when passing softlinks in, but this is also only the case if the softlink was provided with a target object instead of a raw path (some old matnwb scripts probably still use this).

I do like the idea of auto-coercing to a SoftLink but I would argue that explicitly defining the SoftLink lets the user know that the data is actually not written to disk if wrapped in one. The user still has to embed the file somewhere or you will produce a NWB file with invalid links.

Note that I will not be doing much more work on MatNWB but I leave the repo open in case PRs come in. I would imagine that the repo itself will be archived at some point in the future.

@ehennestad
Copy link
Collaborator Author

ehennestad commented Feb 1, 2024

That said, you cannot actually validate if an object is "pointable" by HDF5 path until you actually write the object to the same file as the soft link which is probably why any kind of validation is done at export level.

I didn't look into this in detail, but I imagine you could easily modify the NWB File object to store an internal cache or map of all the objects that have been added to it.

I.e, when I do nwb.general_devices.set('Device', device) that could be flagged somehow on the nwb file object, and when I later add a softlink, it could validate against the objects that are already cached/flagged.

I can imagine it is not foolproof, but the alternative is to allow types to be added as groups instead of links or even adding the completely wrong types, and this is a violation of the nwb schemas.

I would argue that explicitly defining the SoftLink lets the user know that the data is actually not written to disk if wrapped in one. The user still has to embed the file somewhere or you will produce a NWB file with invalid links.

I don't think anyone who is not very familiar with h5 would catch this nuance.

Note that I will not be doing much more work on MatNWB but I leave the repo open in case PRs come in. I would imagine that the repo itself will be archived at some point in the future.

Oh, that's unfortunate, I hope there will be a way to keep it alive still!

@lawrence-mbf
Copy link
Collaborator

lawrence-mbf commented Feb 1, 2024

I didn't look into this in detail, but I imagine you could easily modify the NWB File object to store an internal cache or map of all the objects that have been added to it.

With the way that MatNWB has designed assignments, keeping a map at the highest level will only work if the assignment includes the NwbFile object i.e. nwb.acquisition.set('acq', obj);. But this does not work for anything more advanced object compositions like, say Dynamic Tables. You can easily construct a DynamicTable without touching the nwb object at all (which would not be caught by subsref). I suppose you could have all classes do this and have the nwb object just merge the maps at the top when the top-level subsref receives the object. Then you could check broken soft links prior to export.

...but the alternative is to allow types to be added as groups instead of links or even adding the completely wrong types, and this is a violation of the nwb schemas.

I agree, this is an oversight in design of SoftLinks. Some low-hanging fruit for development here is to just generalize the ExternalLink validation (which ironically has better and more thorough checks) so at least the wrong type is not passed in.

@ehennestad
Copy link
Collaborator Author

Pt 2 from the first post is also true for types.untyped.ObjectView

Also here there is an opportunity for hiding the implementation of ObjectView as suggested for Softlinks

I.e in the set method for a linked type, if the value is not a types.untyped.SoftLink convert it to one? This would take away the burden from the user to check whether a type is linked or "embedded" (i.e an optical channel is "embedded", not linked as far as I can see). It would also ensure that it's harder to add types that are not correct, as in point 2.

@ehennestad ehennestad added category: enhancement improvements of code or code behavior status: todo something needs to be done topic: matnwb-api related to improving the matnwb api labels Oct 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: enhancement improvements of code or code behavior status: todo something needs to be done topic: matnwb-api related to improving the matnwb api
Projects
None yet
Development

No branches or pull requests

2 participants