Skip to content
/ PES-Learn Public

Automated Generation of Machine Learning Models of Molecular Potential Energy Surfaces

License

Notifications You must be signed in to change notification settings

CCQC/PES-Learn

Repository files navigation

PES-Learn

Continuous Integration License

PES-Learn is a Python library designed to fit system-specific Born-Oppenheimer potential energy surfaces using modern machine learning models. PES-Learn assists in generating datasets, and features Gaussian process and neural network model optimization routines. The goal is to provide high-performance models for a given dataset without requiring user expertise in machine learning.

This project is young and under active development. It is recommended to take a look at the Tutorials, the FAQ page, and the list of keyword options before using PES-Learn for research purposes. More documentation will be added periodically. Questions and comments are encouraged; please consider submitting an issue.

Features

  • Ease of Use

    • PES-Learn can be run by writing an input file and running the code (much like most electronic structure theory packages)
    • PES-Learn also features a Python API for more advanced workflows
    • Once ML models are finished training, PES-Learn automatically writes a Python file containing a function for evaluating the energies at new geometries.
  • Data Generation

    • PES-Learn supports input file generation and output file parsing for arbitrary electronic structure theory packages such as Psi4, Molpro, Gaussian, NWChem, etc.
    • Data is generated with user-defined internal coordinate displacements with support for:
      • Redundant geometry removal
      • Configuration space filtering
  • Automated Data Transformation

    • Rotation, translation, and permutation invariant molecular geometry representations
  • Automated Machine Learning Model Generation

    • Neural network models are built using PyTorch
    • Gaussian process models are built using GPy
  • Hyperparameter Optimization

Installation Instructions

PES-Learn has been tested and developed on Linux, Mac, and Windows 10 through the Windows Subsystem for Linux. To install using pip:
Clone the repository:
git clone https://github.com/adabbott/PES-Learn.git
Change into top-level directory:
cd PES-Learn
Install PES-Learn and all dependencies:
python setup.py install
To avoid having to re-install the package whenever a change is made to the code, run pip install -e .

Install using an Anaconda environment

The above procedure works just fine, however for performance and stability, we recommend installing all PES-Learn dependencies in a clean Anaconda environment. After installing Anaconda for Python3, create and activate an environment:
conda create -n peslearn python=3.6
conda activate peslearn
The required dependencies can be installed in one line:
conda install -c conda-forge -c pytorch -c omnia gpy pytorch scikit-learn pandas hyperopt cclib
Then install the PES-Learn package:
git clone https://github.com/adabbott/PES-Learn.git
python setup.py install
pip install -e .

To update PES-Learn in the future, run git pull while in the top-level directory PES-Learn.

To run the test suite, you need pytest: pip install pytest-cov To run tests, in the top-level directory called PES-Learn, run: py.test -v tests/

Citing PES-Learn

PES-Learn: An Open-Source Software Package for the Automated Generation of Machine Learning Models of Molecular Potential Energy Surfaces

Bibtex:

@article{Abbott2019,
  title={PES-Learn: An open-source software package for the automated generation of machine learning models of molecular potential energy surfaces},
  author={Abbott, Adam S and Turney, Justin M and Zhang, Boyi and Smith, Daniel GA and Altarawy, Doaa and Schaefer III, Henry F},
  journal={Journal of chemical theory and computation},
  volume={15},
  number={8},
  pages={4386--4398},
  year={2019},
  publisher={ACS Publications}
}

Funding

This project is a collaboration with the Molecular Sciences Software Institute. The author gratefully acknowledges MolSSI for funding the development of this software.

About

Automated Generation of Machine Learning Models of Molecular Potential Energy Surfaces

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published