Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

calculateMass: clarify or enhance behaviour for charged formulas #59

Open
meowcat opened this issue Jan 30, 2023 · 8 comments
Open

calculateMass: clarify or enhance behaviour for charged formulas #59

meowcat opened this issue Jan 30, 2023 · 8 comments

Comments

@meowcat
Copy link
Contributor

meowcat commented Jan 30, 2023

Currently calculateMass("[C6H12]+") == calculateMass("C6H12")
Strictly, this should be calculateMass("[C6H12]+") == calculateMass("C6H12") - .emass where .emass == 0.0005485799
Questions:

  • is calculateMass("[C6H12]+") out-of-scope for this function, and should this throw a warning or error?
  • otherwise, should it give the correct result?
  • or, should there be an argument calculateMass(x, charge)?

Note: the rcdk equivalent is get.formula(formula, charge)@mass . Notably, get.formula("[C6H6]+") also does not account for the charge (but creates the neutral formula). However here the user can specify the charge manually.

@jorainer
Copy link
Member

@michaelwitting

@michaelwitting
Copy link
Collaborator

Not out of scope. We can add changes for this. In my opion charged formulas should be supplied exactly in the format you mentioned, e.g. [C6H12]+. Multiple charges would be 2+, 3+ or 2-.
@jorainer would that be okay as input?

@jorainer
Copy link
Member

IMHO it would be best if calculateMass would identify the charge automatically and return the correct mass.

maybe best to have one function

  • charge that takes formulas (character) and returns their charges (integer) - (function name might not be ideal though).
  • maybe another function that strips charges from the formulas (stripCharge)? - and along that line, maybe also one that adds charges to formulas (so that we can ensure that they have the correct format)

happy for a PR :)

@michaelwitting
Copy link
Collaborator

I can fiddle around a bit, but what do we want as input?

  • single string containing everything, e.g. [C6H12]+
  • two parameters for calculateMass, e.g. calculateMass("C6H12", charge = "1+")

I would prefer the first option

@jorainer
Copy link
Member

I would also say the first option (i.e. "[C6H12]+") - ideally also supporting a character vector of several formulas and not just a single one.

@michaelwitting
Copy link
Collaborator

The way I think it through now, calculateMass() would in future only accept character and no preparsed lists. Except we put the parsing of charges in the countElements() function

@jorainer
Copy link
Member

we also have isotope detection/parsing in there (calculateMass? countElements?), right? maybe would make most sense to do that in the same function?

@jorainer
Copy link
Member

don't know if the previous comment made sense... what I mean: put that into the same function that does already the parsing to avoid needing to parse the input twice (once e.g. counting elements and once getting the charge). but this will also depend a bit how complicated it would be - if it's too difficult or blows up the code we could do it in two separate steps (i.e. a dedicated function that retrieves the charge from a character). maybe it would even be good to just focus first on the latter, to have a clean (separate) implementation to derive the charge from a character so that we can test and then later improve?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants