Error in h5checktype(). The provided H5Identifier is not a dataset identifier. #325
Comments
Hi @JHYSiu, could you provide us with a reproducible example, using a file/dataset that we can access? I couldn't tell what you were doing when the error occurred. Note that the MuData converters were written by the MuData team and not by us, so we may not be able to help, but I'd still like to see a dataset and code to see what I can do.
Thanks! Here's a massively subsetted example.
Python export code:
R import code:
R session info:
Matrix products: default
locale:
attached base packages:
other attached packages:
loaded via a namespace (and not attached):
Thanks for uploading the example. I have a feeling that the MuData package is going to need some enhancements to work with your use case. Here's how I look at the situation. In the MuData package there is an example that produces an h5mu file, miniacc.h5mu, from a TCGA MultiAssayExperiment. We can use the h5ls utility (available from the HDF Group, via Homebrew, etc., if you don't already have it) to look at the layout:
That's the top level Group list. Then drill down:
We can see that as we descend in the Group hierarchy we find Dataset instances named X. For your example, which was generated using Python and not MuData::writeH5MU, we see
I am not sure that the MuData readH5MU is suited for this structure. You should contact the MuData authors for clarification. In their DESCRIPTION I see
implying that round trips must start with a MultiAssayExperiment (MAE). Here's where to post your question: https://github.com/ilia-kats/MuData/issues
Finally, I do not think it will be difficult to dig data out of your .h5mu file using rhdf5 and/or reticulate with h5py imported. You might post a query to support.bioconductor.org, where someone may have already tackled this.
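The h5ls inspection described above can also be done from Python. What follows is only a minimal sketch, not part of MuData or rhdf5: the helper names `describe_layout` and `describe_h5mu` are made up for illustration, and h5py is assumed to be installed. It labels every node as a Group or a Dataset, which is the distinction the error message complains about. One thing worth checking (an assumption, not confirmed for this file) is whether `X` is a Group rather than a Dataset; anndata writes sparse matrices as a group holding `data`, `indices` and `indptr`.

```python
from collections.abc import Mapping


def describe_layout(node, prefix=""):
    """Return (path, kind) pairs for every node below `node`.

    Works on anything dict-like. h5py.File and h5py.Group are mappings,
    while h5py.Dataset is not, so that is the test used here.
    """
    rows = []
    for name, child in node.items():
        path = f"{prefix}/{name}"
        if isinstance(child, Mapping):
            rows.append((path, "Group"))
            # Recurse so nested groups (e.g. a sparse X) are listed too.
            rows.extend(describe_layout(child, path))
        else:
            rows.append((path, "Dataset"))
    return rows


def describe_h5mu(filename):
    """Hypothetical convenience wrapper: open the file read-only and walk it."""
    import h5py  # imported lazily; only this wrapper needs h5py

    with h5py.File(filename, "r") as f:
        return describe_layout(f)
```

Calling `describe_h5mu("FILENAME.h5mu")` and printing the rows for paths ending in `/X` should show at a glance whether the two files differ in how `X` is stored.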
Thanks for looking into this @vjcitn! Helpful for me too.
For what it's worth, the anndataR package (under development on GitHub) provides some useful building blocks for a 'native' R parser, and the anndata / mudata structures are very similar. Here's some code that starts down the path (the RNA experiment, obs and assays only...; this uses the 'devel' version of Bioconductor and hence rhdf5, which includes an important extension for managing HDF5 'enum' types).
It's interesting / unfortunate that constructing a MAE loads most of the data (even if the 'large' matrix data were to be left on-disk via TENxMatrix) into memory...
Excellent thank you everyone. I will have a play around and see if I can get it to work! I'll let you know if I come up with anything robust. :)
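For the h5py route mentioned earlier in the thread, a reader for a single assay matrix might look like the sketch below. It rests on several assumptions: the file follows the anndata on-disk layout, where a sparse `X` is a group with `data`, `indices` and `indptr` datasets plus `encoding-type` and `shape` attributes, and modalities live under the top-level `mod` group. The function names and the modality argument are hypothetical, and the dense expansion is only sensible for small, subsetted examples.

```python
from collections.abc import Mapping


def csr_to_dense(data, indices, indptr, shape):
    """Expand CSR triplets into a dense row-major list of lists."""
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        # Entries of this row sit at positions indptr[row] .. indptr[row + 1] - 1.
        for k in range(indptr[row], indptr[row + 1]):
            dense[row][indices[k]] = data[k]
    return dense


def read_matrix(node):
    """Read X whether it is a plain dataset or a CSR-encoded group."""
    if not isinstance(node, Mapping):
        return node[:]  # dense case: X is a Dataset
    encoding = node.attrs.get("encoding-type")
    if encoding != "csr_matrix":
        raise NotImplementedError(f"unhandled encoding: {encoding!r}")
    shape = tuple(int(n) for n in node.attrs["shape"])
    return csr_to_dense(
        node["data"][:], node["indices"][:], node["indptr"][:], shape
    )


def read_modality_x(filename, modality):
    """Hypothetical wrapper: read /mod/<modality>/X from an .h5mu file."""
    import h5py  # imported lazily; only this wrapper needs h5py

    with h5py.File(filename, "r") as f:
        return read_matrix(f["mod"][modality]["X"])
```

Branching on Group versus Dataset is the point of the sketch: a reader that assumes `X` is always a Dataset would fail on a sparse group, whereas this one handles both cases and raises a clear error for encodings it does not cover.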
Hi
Sorry, I'm unfamiliar with converting between MuData and R format.
I am unable to load my h5mu data into R. I keep getting "Error in h5checktype(). The provided H5Identifier is not a dataset identifier." I'm not sure how to fix this. Thanks!
From the output of rhdf5::H5Fopen, the file still appears to match the format of the dummy example data.
I exported my h5mu with just mdata.write("FILENAME.h5mu")