Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testing out what partial charges could look over the CLI #1068

Open
wants to merge 21 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
b716571
Testing out what partial charges could look over the CLI
IAlibay Jan 7, 2025
a5075eb
update that one thing
IAlibay Jan 7, 2025
020ba71
fix signature
IAlibay Jan 7, 2025
45b8025
fix up a few things
IAlibay Jan 7, 2025
c5f0371
Merge branch 'main' into cli-partial-charges
jthorton Jan 16, 2025
283d9e9
add process pool, expose charge cli command
jthorton Jan 16, 2025
8383718
update flag name, fix tests to use nagl for CI speed
jthorton Jan 17, 2025
aa53762
Update openfecli/parameters/plan_network_options.py
jthorton Jan 20, 2025
c5e5900
Update openfecli/parameters/plan_network_options.py
jthorton Jan 20, 2025
e192f95
Update openfecli/parameters/plan_network_options.py
jthorton Jan 20, 2025
43efb18
Update openfecli/parameters/plan_network_options.py
jthorton Jan 20, 2025
8602a39
move ncores cli flag, fix tests, add tests for charge generation
jthorton Jan 20, 2025
c350c19
fix edge sort dependent tests, fix mypy
jthorton Jan 20, 2025
b3ef278
fix mypy, pre-charge ligands for reproducible tests
jthorton Jan 21, 2025
c2fc143
Update openfecli/commands/generate_partial_charges.py
jthorton Jan 23, 2025
d16bc3e
Update openfecli/commands/generate_partial_charges.py
jthorton Jan 23, 2025
99a891f
set threads to 1, undo tutorial file changes
jthorton Jan 23, 2025
73fa615
add news entry
jthorton Jan 23, 2025
37add05
update openff-toolkit pin
jthorton Jan 23, 2025
c46e7c1
add charges to cofactor sdf, fix docs, expose overwrite charges to pl…
jthorton Jan 24, 2025
6ab8e9f
fix test
jthorton Jan 24, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions docs/environment.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ dependencies:
- nbsphinx
- nbsphinx-link
- myst-parser
- threadpoolctl
- pip:
- sphinx-design
- sphinx-toolbox
Expand Down
4 changes: 3 additions & 1 deletion environment.yml
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ dependencies:
- pyyaml
- coverage
- cinnabar ~=0.4.0
- openff-toolkit>=0.13.0
- openff-toolkit>=0.16.2
- openff-nagl-base >=0.3.3
- openff-units==0.2.0
- pint<0.22
Expand All @@ -42,6 +42,8 @@ dependencies:
- autodoc-pydantic<2.0
- pydata-sphinx-theme
- sphinx-click
# Control blas/openmp threads
- threadpoolctl
- pip:
- sphinx-toolbox
- openff-nagl-models>=0.1.2
Expand Down
23 changes: 23 additions & 0 deletions news/cli_charge_molecules.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
**Added:**

* Added a new CLI command (``charge-molecules``) to bulk assign partial charges to molecules `PR#1068 <https://github.com/OpenFreeEnergy/openfe/pull/1068>`_

**Changed:**

* The ``plan-rhfe-network`` and ``plan-rbfe-network`` CLI commands will now assign partial charges before planning the network if charges are not present, the charge assignment method can be controlled via the yaml settings file `PR#1068 <https://github.com/OpenFreeEnergy/openfe/pull/1068>`_

**Deprecated:**

* <news item>

**Removed:**

* <news item>

**Fixed:**

* <news item>

**Security:**

* <news item>
119 changes: 109 additions & 10 deletions openfe/protocols/openmm_utils/charge_generation.py
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably another issue/PR - do we want to move this module somewhere else within the repo? It feels like it's more than just "openmm_utils".

Long term I think I want to move it to "openfe-utils" or similar, that way upstream protocols can use them without depending on OpenFE.

Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
import sys
import warnings
import numpy as np
from gufe import SmallMoleculeComponent
from openff.units import unit
from openff.toolkit import Molecule as OFFMol
from openff.toolkit.utils.base_wrapper import ToolkitWrapper
Expand All @@ -17,6 +18,7 @@
RDKitToolkitWrapper
)
from openff.toolkit.utils.toolkit_registry import ToolkitRegistry
from threadpoolctl import threadpool_limits

try:
import openeye
Expand Down Expand Up @@ -285,7 +287,7 @@
toolkit_backend: Literal['ambertools', 'openeye', 'rdkit'],
generate_n_conformers: Optional[int],
nagl_model: Optional[str],
) -> None:
) -> OFFMol:
"""
Assign partial charges to an OpenFF Molecule based on a selected method.

Expand All @@ -297,7 +299,7 @@
Whether or not to overwrite any existing non-zero partial charges.
Note that zeroed charges will always be overwritten.
method : Literal['am1bcc', 'am1bccelf10', 'nagl', 'espaloma']
Partial charge assignement method.
Partial charge assignment method.
Supported methods include; am1bcc, am1bccelf10, nagl, and espaloma.
toolkit_backend : Literal['ambertools', 'openeye', 'rdkit']
OpenFF toolkit backend employed for charge generation.
Expand All @@ -318,17 +320,21 @@
Raises
------
ValueError
If the ``toolkit_backend`` is not suported by the selected ``method``.
If the ``toolkit_backend`` is not supported by the selected ``method``.
If ``generate_n_conformers`` is ``None``, but the input ``offmol``
has no associated conformers.
If the number of conformers passed or generated exceeds the number
of conformers selected by the partial charge ``method``.

Returns
-------
The Molecule with partial charges assigned.
"""

# If you have non-zero charges and not overwriting, just return
if (offmol.partial_charges is not None and np.any(offmol.partial_charges)):
if not overwrite:
return
return offmol
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably going to silently break a bunch of things (particlularly protocols).

Would it be worth doing an inplace style flag where we optionally just overwrite the input molecule and otherwise return a molecule?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that as we still assign the charges to the input molecule and not a copy this should still be fine. I tried to make a small example of this here, which should hopefully mean we don't break anything!

from openff.toolkit import Molecule

def charge_mol(mol) -> Molecule:
    mol.assign_partial_charges("am1bcc")
    return mol

m = Molecule.from_smiles("CCO")
charge_mol(m)

print(m.partial_charges)

[-0.13610011111111114 0.12639988888888887 -0.5998001111111111 0.04236688888888887 0.04236688888888887 0.04236688888888887 0.04319988888888887 0.04319988888888887 0.3959998888888889] elementary_charge

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry - I think I hallucinated that you had removed the offmol_copy reassignment step below.

So just to confirm, this now does both an inplace asssignment and returns the offmol itself?

I guess, do we want to have the option to not do the inplace assignment? (i.e. could there be cases where this would be useful?)

Copy link
Collaborator

@jthorton jthorton Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it still does in place but I need the return for the multiprocessing to work, we can add an option for not in place but maybe that should be in another PR?


# Dictionary for each available charge method
# The idea of this pattern is to allow for maximum flexibility by
Expand Down Expand Up @@ -408,12 +414,105 @@
generate_n_conformers=generate_n_conformers,
) # type: ignore

# Call selected method to assign partial charges
CHARGE_METHODS[method.lower()]['charge_func'](
offmol=offmol_copy,
toolkit_registry=toolkits,
**CHARGE_METHODS[method.lower()]['charge_extra_kwargs'],
) # type: ignore
# limit the number of threads used by SQM
# <https://github.com/openforcefield/openff-toolkit/issues/1831>
with threadpool_limits(limits=1):
# Call selected method to assign partial charges
CHARGE_METHODS[method.lower()]['charge_func'](
offmol=offmol_copy,
toolkit_registry=toolkits,
**CHARGE_METHODS[method.lower()]['charge_extra_kwargs'],
) # type: ignore

# Copy partial charges back
offmol.partial_charges = offmol_copy.partial_charges
return offmol


def bulk_assign_partial_charges(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Have you tested if it scales as intended?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we get a good speed up in local testing, charging 16 malt1 ligands (~42 atoms) with am1bcc (ambertools) takes:

  • 4:02 2 cores
  • 2:20 4 cores
  • 1: 48 6 cores

molecules: list[SmallMoleculeComponent],
overwrite: bool,
method: Literal['am1bcc', 'am1bccelf10', 'nagl', 'espaloma'],
toolkit_backend: Literal['ambertools', 'openeye', 'rdkit'],
generate_n_conformers: Optional[int],
nagl_model: Optional[str],
processors: int = 1,
) -> list[SmallMoleculeComponent]:
"""
Assign partial charges to a list of SmallMoleculeComponents using multiprocessing.

Parameters
----------
molecules : list[gufe.SmallMoleculeComponent]
The list of molecules who should have partial charges assigned.
overwrite : bool
Whether or not to overwrite any existing non-zero partial charges.
Note that zeroed charges will always be overwritten.
method : Literal['am1bcc', 'am1bccelf10', 'nagl', 'espaloma']
Partial charge assignment method.
Supported methods include; am1bcc, am1bccelf10, nagl, and espaloma.
toolkit_backend : Literal['ambertools', 'openeye', 'rdkit']
OpenFF toolkit backend employed for charge generation.
Supported options:
* ``ambertools``: selects both the AmberTools and RDKit Toolkit Wrapper
* ``openeye``: selects the OpenEye toolkit Wrapper
* ``rdkit``: selects the RDKit toolkit Wrapper
Note that the ``rdkit`` backend cannot be used for `am1bcc` or
``am1bccelf10`` partial charge methods.
generate_n_conformers : Optional[int]
Number of conformers to generate for partial charge generation.
If ``None`` (default), the input conformer will be used.
Values greater than 1 can only be used alongside ``am1bccelf10``.
nagl_model : Optional[str]
The NAGL model to use for charge assignment if method is ``nagl``.
If ``None``, the latest am1bcc NAGL charge model is used.
processors: int, default 1
The number of processors which should be used to generate the charges.

Raises
------
ValueError
If the ``toolkit_backend`` is not supported by the selected ``method``.
If ``generate_n_conformers`` is ``None``, but the input ``offmol``
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might actually be the case where we want to not inplace modify charges - what if things fails half way through a charge assignment loop, do we end up with have half our ligands charged and half our ligands uncharged?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If charging fails for one or more molecules the whole loop will break, so we might want to more gracefully handle failing though I am not sure what the end result would be one file with charged ligands and another with fails.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth testing - could be in a follow-up (for the sake of the CLI if this fails the whole thing fails, but the API is a bit different).

has no associated conformers.
If the number of conformers passed or generated exceeds the number
of conformers selected by the partial charge ``method``.

Returns
-------
A list of SmallMoleculeComponents with the charges assigned.
"""
import tqdm

charge_keywords = {
"overwrite": overwrite,
"method": method,
"toolkit_backend": toolkit_backend,
"generate_n_conformers": generate_n_conformers,
"nagl_model": nagl_model
}
charged_ligands = []

if processors > 1:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a reminder to add a test for this.

from concurrent.futures import ProcessPoolExecutor, as_completed

Check warning on line 497 in openfe/protocols/openmm_utils/charge_generation.py

View check run for this annotation

Codecov / codecov/patch

openfe/protocols/openmm_utils/charge_generation.py#L497

Added line #L497 was not covered by tests

with ProcessPoolExecutor(max_workers=processors) as pool:

Check warning on line 499 in openfe/protocols/openmm_utils/charge_generation.py

View check run for this annotation

Codecov / codecov/patch

openfe/protocols/openmm_utils/charge_generation.py#L499

Added line #L499 was not covered by tests

work_list = [

Check warning on line 501 in openfe/protocols/openmm_utils/charge_generation.py

View check run for this annotation

Codecov / codecov/patch

openfe/protocols/openmm_utils/charge_generation.py#L501

Added line #L501 was not covered by tests
pool.submit(
assign_offmol_partial_charges,
m.to_openff(),
**charge_keywords, # type: ignore
)
for m in molecules
]

for work in tqdm.tqdm(as_completed(work_list), desc="Generating charges", ncols=80, total=len(molecules)):
charged_ligands.append(SmallMoleculeComponent.from_openff(work.result()))

Check warning on line 511 in openfe/protocols/openmm_utils/charge_generation.py

View check run for this annotation

Codecov / codecov/patch

openfe/protocols/openmm_utils/charge_generation.py#L510-L511

Added lines #L510 - L511 were not covered by tests

else:
for m in tqdm.tqdm(molecules, desc="Generating charges", ncols=80, total=len(molecules)):
mol_with_charge = assign_offmol_partial_charges(m.to_openff(), **charge_keywords) # type: ignore
charged_ligands.append(SmallMoleculeComponent.from_openff(mol_with_charge))

return charged_ligands
Binary file modified openfe/tests/data/cdk8.zip
Binary file not shown.
11 changes: 11 additions & 0 deletions openfe/tests/data/eg5/eg5_cofactor.sdf
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you adding charges here just because of the failed openeye test or because of other reasons too?

If it's just NAGL openeye issue, then I would ask that we actually keep the falling test and we can xfail it or mark it for skip. The reason is that this failure is actually a real issue that would impact our users - so keeping a test that reminds us it is a problem is very useful.

Alternatively we add it as a separate test/issue, that's ok too - I just don't want to lose visibility on this failure (this type of problem has bitten us enough times in the past).

Original file line number Diff line number Diff line change
Expand Up @@ -84,4 +84,15 @@
26 27 1 0 0 0
M CHG 3 3 -1 4 -1 7 -1
M END

> <ofe-name>
3L9H

> <atom.dprop.PartialCharge>
1.3988487435897436 -0.91321825641025633 -0.91321825641025633 -0.91321825641025633 1.5677487435897437 -0.67705125641025643 -0.67705125641025643 -0.77085125641025642 -0.59125125641025633
0.1833487435897436 0.088048743589743572 -0.42865125641025642 0.064048743589743579 -0.59785125641025638 0.080048743589743579 -0.6198512564102564 0.27474874358974355 -0.28105125641025647
0.35164874358974357 -0.76515125641025639 -0.11125125641025642 0.54814874358974364 -0.89255125641025634 -0.77805125641025641 0.56384874358974357 -0.71505125641025635 0.38374874358974354
0.06664874358974357 0.06664874358974357 0.034648743589743576 0.16864874358974358 0.38094874358974357 0.11064874358974358 0.40294874358974359 0.042648743589743576 0.15174874358974358
0.35724874358974357 0.35724874358974357 0.0010487435897435762

$$$$
86 changes: 86 additions & 0 deletions openfe/tests/data/openmm_rfe/dummy_charge_ligand_23.sdf
Original file line number Diff line number Diff line change
@@ -0,0 +1,86 @@
ligand_23
RDKit 3D

36 38 0 0 0 0 0 0 0 0999 V2000
-1.9600 21.5500 -27.3300 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.2900 22.4100 -28.2000 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.9600 22.9500 -29.2900 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.3000 22.6200 -29.5400 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.9700 21.7600 -28.6500 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.2900 21.2300 -27.5600 C 0 0 0 0 0 0 0 0 0 0 0 0
-5.6400 21.3500 -28.9200 Cl 0 0 0 0 0 0 0 0 0 0 0 0
-4.0400 23.1500 -30.7100 C 0 0 0 0 0 0 0 0 0 0 0 0
-4.2300 22.4200 -31.6700 O 0 0 0 0 0 0 0 0 0 0 0 0
-4.5100 24.4100 -30.6500 N 0 0 0 0 0 0 0 0 0 0 0 0
-5.2900 25.1100 -31.5800 C 0 0 0 0 0 0 0 0 0 0 0 0
-5.9600 24.5500 -32.6800 C 0 0 0 0 0 0 0 0 0 0 0 0
-6.7800 25.3600 -33.4600 C 0 0 0 0 0 0 0 0 0 0 0 0
-6.9200 26.6500 -33.1800 N 0 0 0 0 0 0 0 0 0 0 0 0
-6.3000 27.2300 -32.1600 C 0 0 0 0 0 0 0 0 0 0 0 0
-5.5100 26.4600 -31.3100 C 0 0 0 0 0 0 0 0 0 0 0 0
-6.5400 28.5800 -31.8800 N 0 0 0 0 0 0 0 0 0 0 0 0
-5.7100 29.4500 -31.2300 C 0 0 0 0 0 0 0 0 0 0 0 0
-4.6200 29.1200 -30.8100 O 0 0 0 0 0 0 0 0 0 0 0 0
-6.2300 30.8500 -31.0300 C 0 0 1 0 0 0 0 0 0 0 0 0
-5.5800 31.7500 -29.9800 C 0 0 0 0 0 0 0 0 0 0 0 0
-5.3600 32.0200 -31.4700 C 0 0 1 0 0 0 0 0 0 0 0 0
-4.1100 31.7200 -32.0100 F 0 0 0 0 0 0 0 0 0 0 0 0
-1.1100 24.0500 -30.3300 Cl 0 0 0 0 0 0 0 0 0 0 0 0
-7.3100 30.9400 -31.1600 H 0 0 0 0 0 0 0 0 0 0 0 0
-5.8600 32.9000 -31.8800 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.4400 21.1300 -26.4800 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.2600 22.6600 -28.0200 H 0 0 0 0 0 0 0 0 0 0 0 0
-3.8100 20.5600 -26.8900 H 0 0 0 0 0 0 0 0 0 0 0 0
-4.3200 24.9800 -29.8400 H 0 0 0 0 0 0 0 0 0 0 0 0
-5.8400 23.5000 -32.9100 H 0 0 0 0 0 0 0 0 0 0 0 0
-7.3100 24.9400 -34.3000 H 0 0 0 0 0 0 0 0 0 0 0 0
-5.0600 26.9100 -30.4300 H 0 0 0 0 0 0 0 0 0 0 0 0
-7.4000 29.0100 -32.1800 H 0 0 0 0 0 0 0 0 0 0 0 0
-6.3500 32.2000 -29.3600 H 0 0 0 0 0 0 0 0 0 0 0 0
-4.9200 31.1500 -29.3500 H 0 0 0 0 0 0 0 0 0 0 0 0
13 32 1 0
13 14 1 0
12 13 2 0
14 15 2 0
12 31 1 0
11 12 1 0
17 34 1 0
15 17 1 0
15 16 1 0
22 23 1 0
17 18 1 0
22 26 1 6
8 9 2 0
11 16 2 0
10 11 1 0
20 22 1 0
21 22 1 0
16 33 1 0
18 20 1 0
18 19 2 0
20 25 1 6
20 21 1 0
8 10 1 0
4 8 1 0
10 30 1 0
3 24 1 0
21 35 1 0
21 36 1 0
3 4 2 0
4 5 1 0
2 3 1 0
5 7 1 0
5 6 2 0
2 28 1 0
1 2 2 0
1 6 1 0
6 29 1 0
1 27 1 0
M END
> <atom.dprop.PartialCharge> (1)
-0.097000000000000003 -0.1295 0.050899999999999994 -0.1426 0.050899999999999994 -0.1295 -0.058400000000000007 0.68969999999999998 -0.53810000000000002 -0.46510000000000001 0.1236
-0.32829999999999998 0.44219999999999998 -0.72699999999999998 0.54220000000000002 -0.27029999999999998 -0.5464 0.69610000000000005 -0.56910000000000005 -0.23069999999999999
-0.13339999999999999 0.12759999999999999 -0.20530000000000001 -0.058400000000000007 0.1237 0.1027 0.14799999999999999 0.1565 0.1565 0.32550000000000001 0.14199999999999999
0.022099999999999995 0.20300000000000001 0.33750000000000002 0.094200000000000006 0.094200000000000006

$$$$
Loading
Loading