Matching against several annotation resources #88

Open
jorainer opened this issue Aug 29, 2022 · 2 comments

@jorainer
Member

Nir (@AharoniLab) had the excellent idea that it might be good to allow matching against several reference databases in one go. The idea would be to allow calls like:

res <- matchSpectra(sps_exp, references, param = ...)

where references would be a set of reference databases against which the function sequentially matches the experimental spectra sps_exp. A simple solution would be to pass a list of Spectra objects as references, but we thought it would be better to introduce a further abstraction that also simplifies things for the user: instead of having to download a database or open a connection to it, the user would pass annotation source objects. These contain the information on how to access the database and perform all the necessary steps themselves (i.e. connect to the database or download the file, ...).

A further advantage is that this would allow matching against reference databases that are not fully open, because the user never gets a Spectra object with all the reference data. Example: WeizMass. Instead of needing a Spectra object with the full WeizMass data, a WeizMass annotation source object is passed to matchSpectra and this object takes care of connecting to the database. As a result, only the matching data from WeizMass are returned, not the full WeizMass library.

Thus, by introducing annotation source objects that don't contain or provide any data themselves, we could also enable matching against databases for which no full data access is possible, and it could also simplify usage (see example below).

Implementation notes

What I would propose is the following:

  • Define a (virtual) S4 class CompoundAnnotationSource.
  • Define a WeizMassSource S4 class extending CompoundAnnotationSource.
  • This WeizMassSource class should implement a matchSpectra method. This method should then compare the experimental spectra against WeizMass using the provided param and return the results. Connecting to the database, getting a Spectra object to enable comparison etc. will all be performed within that method, so the full WeizMass data will never be exposed to the user.
  • In addition, define a CompDbSource S4 class extending CompoundAnnotationSource. The CompDbSource constructor function takes the name of the CompDb SQLite database as input parameter. In the longer run CompDb databases (e.g. for MassBank or MoNA) should be retrieved from Bioconductor's AnnotationHub.
  • To make life easier for the user we could add a MassBank function (MassBankSource in the example below) that retrieves the CompDb for MassBank (for a certain release) from AnnotationHub and returns a CompDbSource with that data.
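The class layout proposed above could be sketched roughly as follows, using only the base methods package. This is a toy illustration and not the actual implementation: the generic matchToSource, the helper .load_reference, the file name and the fixed reference values are all hypothetical stand-ins (the proposal itself would add a matchSpectra method and query a real database), and plain numeric m/z values are used instead of Spectra objects to keep the example self-contained.

```r
library(methods)

# Virtual parent class: a source describes where reference data lives,
# it does not hold the data itself.
setClass("CompoundAnnotationSource", representation("VIRTUAL"))

# Concrete source that only stores the path to a database file.
setClass("CompDbSource",
         contains = "CompoundAnnotationSource",
         slots = c(dbfile = "character"))

CompDbSource <- function(dbfile = character()) {
    new("CompDbSource", dbfile = dbfile)
}

# Hypothetical stand-in for the database access step. A real source would
# open source@dbfile here; this toy version returns fixed values.
.load_reference <- function(source) {
    data.frame(name = c("A", "B", "C"),
               mz = c(100.05, 200.10, 300.15))
}

# Hypothetical generic, used instead of matchSpectra to avoid dependencies.
setGeneric("matchToSource", function(query, target, ...)
    standardGeneric("matchToSource"))

setMethod(
    "matchToSource",
    signature(query = "numeric", target = "CompDbSource"),
    function(query, target, tolerance = 0.01, ...) {
        # Reference data is loaded inside the method ...
        ref <- .load_reference(target)
        hits <- lapply(seq_along(query), function(i) {
            idx <- which(abs(ref$mz - query[i]) <= tolerance)
            data.frame(query_idx = rep(i, length(idx)),
                       target_name = ref$name[idx])
        })
        # ... and only the matching entries are returned to the caller.
        do.call(rbind, hits)
    })

src <- CompDbSource("massbank.sqlite")   # hypothetical file name
res <- matchToSource(c(100.05, 250, 300.151), src)
```

The point of the design is visible in the method body: the caller only ever holds src, which carries a file name, while the reference table exists only inside the method call.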

A call to match against WeizMass could then e.g. look like:

res <- matchSpectra(sps_exp, WeizMassSource(version = 2), param = CompareSpectraParam(ppm = 20))

Or against WeizMass and MassBank:

res <- matchSpectra(
    sps_exp, 
    list(WeizMassSource(version = 2),
         MassBankSource("2022-03")),
    param = CompareSpectraParam(ppm = 20))
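Matching against a list of sources could be handled by an additional method for list that simply loops over the sources. The following is a minimal sketch under strong assumptions: ToySource, matchToSource and the in-memory .toy_db lookup are hypothetical names invented for this example, and numeric m/z values stand in for the experimental spectra.

```r
library(methods)

setClass("CompoundAnnotationSource", representation("VIRTUAL"))

# Hypothetical source that is identified by a name only.
setClass("ToySource",
         contains = "CompoundAnnotationSource",
         slots = c(name = "character"))

# Hypothetical stand-in for the remote databases behind each source.
.toy_db <- list(weizmass = c(100.05, 150.20),
                massbank = c(100.05, 300.15))

setGeneric("matchToSource", function(query, target, ...)
    standardGeneric("matchToSource"))

# Single source: report for each query value whether a reference matches.
setMethod(
    "matchToSource",
    signature(query = "numeric", target = "ToySource"),
    function(query, target, tolerance = 0.01, ...) {
        ref <- .toy_db[[target@name]]
        vapply(query, function(q) any(abs(ref - q) <= tolerance), logical(1))
    })

# List of sources: match against each source in turn, one result per source.
setMethod(
    "matchToSource",
    signature(query = "numeric", target = "list"),
    function(query, target, ...) {
        res <- lapply(target, function(src) matchToSource(query, src, ...))
        names(res) <- vapply(target, function(src) src@name, character(1))
        res
    })

res <- matchToSource(
    c(100.05, 300.15),
    list(new("ToySource", name = "weizmass"),
         new("ToySource", name = "massbank")))
```

Here res is a named list with one element per source. Whether the real function should return such a list or merge the results into a single object is an open design question that this sketch does not settle.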

Input, comments etc highly welcome!

@jorainer
Member Author

jorainer commented Sep 5, 2022

A first version and initial classes are pushed to b13ddc5 (annotation_source branch).

@jorainer
Member Author

Note: I've now made a first PR (#89) that adds the basic concept of annotation sources and examples for integrating MassBank. Development to integrate WeizMass continues in the weizmass_source branch.
