Large DFT forces and strange geometries #105
Comments
Where did you get these from? Anything with a force >1 was stripped out when we generated the HDF5 file for the dataset.
These came from the latest SPICE v2.0.1 HDF5 file on Zenodo. Positions and forces were extracted to xyz.
Can you provide the group names and conformation indices so I can look them up?
I mean the name of the top level group within the HDF5 file, so I can look them up in the file.
That makes sense. I assumed your file had the same units as the original dataset. The cutoff we applied to forces is 1 hartree/bohr, which is 51.4 eV/Å. Anything less than that is expected to still be present. I looked through a few of the molecules you listed and didn't see any detached hydrogens like that. But I did see some mangled looking molecules, like a distorted ring in one of them. You might choose to apply a lower cutoff to forces to get rid of things like this. Strictly speaking they're still correct: the DFT calculation was run correctly for the given conformations. But you might decide you don't want to train on conformations that are that unrealistic.
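For anyone who wants to apply a lower cutoff themselves, here is a minimal sketch of the unit conversion and the filtering step. It is not the script used to build the dataset. The conversion constants are CODATA 2018 values, the function names are made up for illustration, and the dataset name `dft_total_gradient` (assumed to be stored in hartree/bohr) should be checked against your copy of the HDF5 file. The sketch compares the largest per-atom gradient norm against the cutoff, which is an assumption. The exact criterion used when the file was generated is not stated in this thread.

```python
import math

# CODATA 2018 conversion factors (not taken from this thread).
HARTREE_TO_EV = 27.211386245988
BOHR_TO_ANGSTROM = 0.529177210903
# 1 hartree/bohr expressed in eV/Å, roughly 51.4.
HARTREE_PER_BOHR_TO_EV_PER_ANGSTROM = HARTREE_TO_EV / BOHR_TO_ANGSTROM


def max_force_norms(gradients):
    """Largest per-atom gradient norm for each conformation.

    `gradients` is indexed as [conformation][atom][xyz]. The result is in
    the same unit as the input. Forces are the negative gradient, so the
    norms are identical.
    """
    return [
        max(math.sqrt(gx * gx + gy * gy + gz * gz) for gx, gy, gz in conformation)
        for conformation in gradients
    ]


def conformations_above(gradients, cutoff):
    """Return (conformation index, max norm) pairs that exceed `cutoff`."""
    return [
        (index, norm)
        for index, norm in enumerate(max_force_norms(gradients))
        if norm > cutoff
    ]


def scan_spice_file(path, cutoff_ev_per_angstrom):
    """Hypothetical helper: list (group name, conformation index, max force in eV/Å).

    Assumes each top level group holds a 'dft_total_gradient' dataset in
    hartree/bohr. Verify the dataset name and unit in your own file.
    """
    import h5py  # third-party; imported here so the helpers above need only the stdlib

    cutoff_au = cutoff_ev_per_angstrom / HARTREE_PER_BOHR_TO_EV_PER_ANGSTROM
    hits = []
    with h5py.File(path, "r") as spice:
        for name, group in spice.items():
            gradients = group["dft_total_gradient"][...]
            for index, norm in conformations_above(gradients, cutoff_au):
                hits.append((name, index, norm * HARTREE_PER_BOHR_TO_EV_PER_ANGSTROM))
    return hits
```

Calling `scan_spice_file` with a path to the downloaded file and a cutoff such as 10 eV/Å would return the top level group names and conformation indices, which is the information needed to look a structure up in the file.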
I've also noticed these issues in the amino acids subset. It looks to me like an atom rearrangement error. Could it be the result of this collection of bugs in the barostat used to generate the structures? openmm/openmm#4364
There wasn't any barostat used in generating these conformations. We just extracted an isolated ligand and amino acid from a PDB file, added harmonic restraints to prevent them from moving too much, and energy minimized. Here is the script used to generate them if you want to see the details. Any unrealistic conformations probably come from flaws in the force fields used for the energy minimization (OpenFF for the ligand, Amber ff14 for the amino acid).
I see, thanks for the clarification, and sorry about the confusion. The conformer generation wasn't outlined in the paper, so I was assuming it was the same process as for e.g. the dipeptides or PubChem molecules.
It's described in the SPICE 2 paper: https://pubs.acs.org/doi/abs/10.1021/acs.jctc.4c00794. I should update the README to add that reference.
Oh thanks for pointing me toward that, not sure how I missed it.
Perhaps because we forgot to add it to the README. :)
Hi,
Whilst inspecting some of the new subsets that were added in version 2, I came across some configurations where the hydrogens appear to have been ripped off their heavy atoms, and the forces from DFT are extremely high. I have attached some examples that appear when filtering the amino acid-ligand subset by max force. My understanding was that some configurations with high forces were present in the original dataset due to the Psi4 bug, but I was not expecting them to be present in the more recently computed values.
spice_2_amino_acid_ligand_high_dft_forces.tar.gz