Add stress, precision support, compute #43

Merged
merged 48 commits into from
Jun 5, 2024
ff885fc
refactor tests
Linux-cpp-lisp Dec 7, 2022
fbe345c
formatting
Linux-cpp-lisp Dec 7, 2022
1d03bdb
Stress in pair_allegro (normal/openmp)
Linux-cpp-lisp Dec 11, 2022
9c283a4
base version generalized for precision, untested for higher
anjohan Jan 23, 2023
ca4e369
attempt update Kokkos version with precision, untested
anjohan Jan 23, 2023
537c197
align default to `nequip`
Linux-cpp-lisp Jan 30, 2023
de0d6c9
Merge branch 'precision' into stress
Linux-cpp-lisp Feb 7, 2023
cddfb02
use global precision setting for virial
Linux-cpp-lisp Feb 9, 2023
4779b81
fix off-by-one type mapper bug
anjohan Feb 13, 2023
92c1756
Merge branch 'precision' of github.com:mir-group/pair_allegro into pr…
anjohan Feb 13, 2023
eda5f08
Merge branch 'precision' into stress
anjohan Feb 13, 2023
4644746
add stress code to kokkos pair style
anjohan Feb 13, 2023
8b08573
update neigh stuff to newest version of lammps, update docs
anjohan Feb 13, 2023
c7d4599
update workflow + README typo
anjohan Feb 13, 2023
e7e98a2
more README updates
anjohan Feb 13, 2023
6fda649
more README updates
anjohan Feb 13, 2023
ced2ce9
re-add tag request
anjohan Feb 13, 2023
c15014c
prefix for tests
Linux-cpp-lisp Feb 16, 2023
72f3f6c
change default kokkos precision, add ghost neigh flag
anjohan Feb 24, 2023
0f3afc1
Merge branch 'stress' of github.com:mir-group/pair_allegro into stress
anjohan Feb 24, 2023
9d7dc82
fix typo
Linux-cpp-lisp Feb 27, 2023
fede9bc
pairwise cutoffs seem to work
anjohan Apr 10, 2023
0d53720
fix empty domains (I think), limit output
anjohan Apr 14, 2023
e917edb
add padding
anjohan Apr 15, 2023
d21c7da
LAMMPS_ENV_PREFIX for all calls in the tests
Linux-cpp-lisp Jun 14, 2023
a67e814
don't use (sometimes broken) ghost nlist
Linux-cpp-lisp Jun 15, 2023
5a911bb
precision
Linux-cpp-lisp Jun 15, 2023
24d9751
README updates
Linux-cpp-lisp Jun 15, 2023
a21ce3d
cleanup torch=1.10 hacks
Linux-cpp-lisp Jun 15, 2023
3ca4b2c
fix without Kokkos
Linux-cpp-lisp Jun 15, 2023
97163be
test numerics
Linux-cpp-lisp Jun 15, 2023
0e38b3d
numerics
Linux-cpp-lisp Jun 16, 2023
4f15c4a
allow arbitrary order in hybrid/overlay for #20 and #24
anjohan Jun 27, 2023
f46669b
merge main
anjohan Oct 3, 2023
8db0e44
compute compiles, untested
anjohan Oct 3, 2023
042968f
add allreduce
anjohan Oct 4, 2023
3e0e985
add quantity request
anjohan Oct 7, 2023
c592c2a
linting happened + add checks
anjohan Feb 13, 2024
4348628
cleaning
anjohan Mar 29, 2024
156c685
remove forced C++ version
anjohan May 29, 2024
9903604
replace nequip with allegro in comments
anjohan May 29, 2024
a5c1833
update readme
anjohan May 29, 2024
645d432
add per-atom quantity extraction
anjohan May 29, 2024
e406ed3
merge
anjohan May 29, 2024
a2ac2dd
Update README.md
anjohan May 30, 2024
319c90d
update README ~according to review
anjohan May 30, 2024
1709eae
readd C++ standard update in the case of no Kokkos
anjohan May 30, 2024
89e3ce1
attempt fix empty domains also in non-Kokkos pair and compute (#45), …
anjohan Jun 4, 2024
2 changes: 1 addition & 1 deletion .github/workflows/tests.yml
@@ -47,7 +47,7 @@ jobs:
run: |
mkdir lammps_dir/
cd lammps_dir/
git clone -b stable_29Sep2021_update2 --depth 1 "https://github.com/lammps/lammps"
git clone --depth 1 "https://github.com/lammps/lammps"
cd ..
./patch_lammps.sh lammps_dir/lammps/
cd lammps_dir/lammps/
2 changes: 0 additions & 2 deletions .gitignore
@@ -32,8 +32,6 @@
*.out
*.app

.vscode



# ---------- Python .gigignores-----------
7 changes: 7 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,7 @@
{
"editor.formatOnSave": false,
"[python]": {
"editor.formatOnSave": true
},
"python.formatting.provider": "black"
}
69 changes: 51 additions & 18 deletions README.md
@@ -2,17 +2,20 @@

This pair style allows you to use Allegro models from the [`allegro`](https://github.com/mir-group/allegro) package in LAMMPS simulations. Allegro is designed to enable parallelism, and so `pair_allegro` **supports MPI in LAMMPS**. It also supports OpenMP (better performance) or Kokkos (best performance) for accelerating the pair style.

For more details on Allegro itself, background, and the LAMMPS pair style please see the [`allegro`](https://github.com/mir-group/allegro) package and our pre-print:
For more details on Allegro itself, background, and the LAMMPS pair style please see the [`allegro`](https://github.com/mir-group/allegro) package and our paper:
> *Learning Local Equivariant Representations for Large-Scale Atomistic Dynamics* <br/>
> Albert Musaelian, Simon Batzner, Anders Johansson, Lixin Sun, Cameron J. Owen, Mordechai Kornbluth, Boris Kozinsky <br/>
> https://arxiv.org/abs/2204.05249 <br/>
> https://doi.org/10.48550/arXiv.2204.05249
> https://www.nature.com/articles/s41467-023-36329-y <br/>
and
> *Scaling the leading accuracy of deep equivariant models to biomolecular simulations of realistic size* <br/>
> Albert Musaelian, Anders Johansson, Simon Batzner, Boris Kozinsky <br/>
> https://doi.org/10.1145/3581784.3627041 <br/>

`pair_allegro` authors: **Anders Johansson**, Albert Musaelian.

## Pre-requisites

* PyTorch or LibTorch >= 1.11.0
* PyTorch or LibTorch >= 1.11.0; please note that at present we have only thoroughly tested 1.11 on NVIDIA GPUs (see [#311 for NequIP](https://github.com/mir-group/nequip/discussions/311#discussioncomment-5129513)) and 1.13 on AMD GPUs, but newer 2.x versions *may* also work. With newer versions, setting the environment variable `PYTORCH_JIT_USE_NNC_NOT_NVFUSER=1` sometimes helps.

## Usage in LAMMPS

@@ -23,36 +26,56 @@ pair_coeff * * deployed.pth <Allegro type name for LAMMPS type 1> <Allegro type
where `deployed.pth` is the filename of your trained, **deployed** model.

The names after the model path `deployed.pth` indicate, in order, the names of the Allegro model's atom types to use for LAMMPS atom types 1, 2, and so on. The number of names given must be equal to the number of atom types in the LAMMPS configuration (not the Allegro model!).
The given names must be consistent with the names specified in the Allegro training YAML in `chemical_symbol_to_type` or `type_names`.
The given names must be consistent with the names specified in the Allegro training YAML in `chemical_symbol_to_type` or `type_names`. Typically, this will be the chemical symbol for each LAMMPS type.
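The mapping rule above can be sketched in a few lines of C++. This is a minimal illustration, not the code used by `pair_allegro`: the helper `build_type_mapper` and its arguments are hypothetical names, and it assumes the model's type names are available as a list of strings.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of pair_allegro): map each 1-based LAMMPS atom
// type to the 0-based index of the matching name in the model's type list.
std::vector<int> build_type_mapper(const std::vector<std::string> &model_type_names,
                                   const std::vector<std::string> &pair_coeff_names,
                                   int ntypes_lammps)
{
  // One name is required per LAMMPS atom type, regardless of how many
  // types the model itself knows about.
  if (static_cast<int>(pair_coeff_names.size()) != ntypes_lammps)
    throw std::invalid_argument("need exactly one type name per LAMMPS atom type");

  // Index 0 is left unused so that mapper[lammps_type] works with 1-based types.
  std::vector<int> mapper(ntypes_lammps + 1, -1);
  for (int t = 1; t <= ntypes_lammps; ++t) {
    const std::string &name = pair_coeff_names[t - 1];
    auto it = std::find(model_type_names.begin(), model_type_names.end(), name);
    if (it == model_type_names.end())
      throw std::invalid_argument("type name '" + name + "' is not known to the model");
    mapper[t] = static_cast<int>(it - model_type_names.begin());
  }
  return mapper;
}
```

For a model with types `H C O` and the line `pair_coeff * * deployed.pth O H`, this sketch yields `mapper[1] == 2` and `mapper[2] == 0`. Giving only one name for two LAMMPS types raises an error.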

To run with Kokkos, please see the [LAMMPS Kokkos documentation](https://docs.lammps.org/Speed_kokkos.html#running-on-gpus). Example:
```bash
mpirun -np 8 lmp -sf kk -k on g 4 -pk kokkos newton on neigh full -in in.script
```
to run on 2 nodes with 4 GPUs each.
to run on 2 nodes with 4 GPUs *each*.

### Compute
We provide an experimental "compute" that allows you to extract custom quantities from Allegro models, such as [polarization](https://arxiv.org/abs/2403.17207). You can extract either global or per-atom properties with syntax along the lines of
```
compute polarization all allegro polarization 3
compute polarizability all allegro polarizability 9
compute borncharges all allegro/atom born_charge 9 1
```

The compute attempts to extract the quantity named after `allegro[/atom]` from the dictionary that the Allegro model returns. The number that follows is the number of elements after flattening the output. In the examples above, polarization is a 3-element global vector, while polarizability and Born charges are global and per-atom 3x3 matrices, respectively.

For per-atom quantities, the second number is a 1/0 flag indicating whether the properties should be reverse-communicated "Newton-style" like forces, which will depend on your property and the specifics of your implementation.
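The length argument and the flattened layout can be checked with a small sketch. The function names `flattened_length` and `flat_index` are hypothetical, and the sketch assumes row-major flattening, which is what PyTorch's `reshape` and `ravel` produce.

```cpp
#include <cstddef>
#include <vector>

// Number of elements after flattening a tensor with the given shape,
// e.g. {3} -> 3 for polarization and {3, 3} -> 9 for polarizability.
std::size_t flattened_length(const std::vector<std::size_t> &shape)
{
  std::size_t n = 1;
  for (std::size_t d : shape) n *= d;
  return n;
}

// Row-major offset of matrix element (row, col) belonging to atom `atom`
// when every atom carries `nperatom` flattened values.
std::size_t flat_index(std::size_t atom, std::size_t row, std::size_t col,
                       std::size_t ncols, std::size_t nperatom)
{
  return atom * nperatom + row * ncols + col;
}
```

With per-atom 3x3 Born charges (`nperatom = 9`), element (1, 2) of atom 2 sits at offset `2 * 9 + 1 * 3 + 2 = 23` in the flattened output.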


*Note: For extracting multiple quantities, simply use multiple commands. The properties will be extracted from the same dictionary, without any recomputation.*

*Note: The group flag should generally be `all`.*

*Note: Global quantities are assumed extensive and summed across MPI ranks. Keep ghost atoms in mind when trying to think of whether this works for your property; for example, it does not work for Allegro's global energy if there are non-zero energy shifts, as these are also applied to ghost atoms.*
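The ghost-atom caveat can be made concrete with a toy calculation. This is a sketch under stated assumptions, not Allegro's implementation: `RankDomain`, `rank_total`, and `summed_over_ranks` are hypothetical names, the plain loop stands in for the MPI sum, and the model is assumed to add a constant per-atom shift to every atom it sees, including ghosts.

```cpp
#include <numeric>
#include <vector>

// Toy description of one MPI rank's subdomain (hypothetical type).
struct RankDomain {
  std::vector<double> local_energies;  // learned energies of owned atoms
  int nghost;                          // ghost atoms visible on this rank
};

// Assumed per-rank "global" output: owned atoms contribute energy + shift,
// ghost atoms contribute only the shift.
double rank_total(const RankDomain &d, double shift)
{
  double sum = std::accumulate(d.local_energies.begin(), d.local_energies.end(), 0.0);
  return sum + shift * (static_cast<double>(d.local_energies.size()) + d.nghost);
}

// Stand-in for the sum across MPI ranks: add up the per-rank totals.
double summed_over_ranks(const std::vector<RankDomain> &ranks, double shift)
{
  double total = 0.0;
  for (const RankDomain &d : ranks) total += rank_total(d, shift);
  return total;
}
```

With two ranks owning energies `{1.0, 2.0}` and `{4.0}` and seeing 3 and 2 ghost atoms, a zero shift gives the correct total of 7.0. A shift of 0.5 should give 8.5 for the three real atoms, but the summed result is 11.0, because the five ghost atoms each add another 0.5.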

## Building LAMMPS with this pair style

### Download LAMMPS
```bash
git clone -b stable_29Sep2021_update2 --depth 1 git@github.com:lammps/lammps
git clone --depth=1 https://github.com/lammps/lammps
```
or your preferred method.
(`--depth 1` prevents the entire history of the LAMMPS repository from being downloaded.)
(`--depth=1` prevents the entire history of the LAMMPS repository from being downloaded.)

### Download this repository
```bash
git clone git@github.com:mir-group/pair_allegro
git clone https://github.com/mir-group/pair_allegro
```
or by downloading a ZIP of the source.

### Patch LAMMPS
#### Automatically
From the `pair_allegro` directory, run:
```bash
./patch_lammps.sh /path/to/lammps/
```

### Libraries

#### Libtorch
If you have PyTorch installed and are **NOT** using Kokkos:
```bash
@@ -61,7 +84,7 @@ mkdir build
cd build
cmake ../cmake -DCMAKE_PREFIX_PATH=`python -c 'import torch;print(torch.utils.cmake_prefix_path)'`
```
If you don't have PyTorch installed **OR** are using Kokkos, you need to download LibTorch from the [PyTorch download page](https://pytorch.org/get-started/locally/). **Ensure you download the cxx11 ABI version.** Unzip the downloaded file, then configure LAMMPS:
If you don't have PyTorch installed **OR** are using Kokkos, you need to download LibTorch from the [PyTorch download page](https://pytorch.org/get-started/locally/). **Ensure you download the cxx11 ABI version if using Kokkos.** Unzip the downloaded file, then configure LAMMPS:
```bash
cd lammps
mkdir build
@@ -86,15 +109,15 @@ CMake will look for CUDA and cuDNN. You may have to explicitly provide the path
Note that the CUDA that comes with PyTorch when installed with `conda` (the `cudatoolkit` package) is usually insufficient (see [here](https://github.com/pytorch/extension-cpp/issues/26), for example) and you may have to install full CUDA separately. A minor version mismatch between the available full CUDA version and the version of `cudatoolkit` is usually *not* a problem, as long as the system CUDA is equal to or newer than it. (For example, PyTorch's requested `cudatoolkit==11.3` with a system CUDA of 11.4 works, but a system CUDA 11.1 will likely fail.) cuDNN is also required by PyTorch.
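The rule of thumb in the paragraph above can be written as a tiny check. This is only a sketch: `CudaVersion` and `system_cuda_likely_ok` are hypothetical names, and requiring the same major version is an assumption read into the phrase "minor version mismatch", not a documented guarantee.

```cpp
// Hypothetical version pair, e.g. {11, 3} for CUDA 11.3.
struct CudaVersion {
  int major_version;
  int minor_version;
};

// Heuristic only: same major version (assumption) and an installed minor
// version that is equal to or newer than the one PyTorch requested.
bool system_cuda_likely_ok(CudaVersion requested, CudaVersion installed)
{
  return installed.major_version == requested.major_version &&
         installed.minor_version >= requested.minor_version;
}
```

Using the README's own example, a requested 11.3 with an installed 11.4 passes this check, while an installed 11.1 does not.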

#### With OpenMP (optional, better performance)
`pair_allegro` supports the use of OpenMP to accelerate certain parts of the pair style.
`pair_allegro` supports the use of OpenMP to accelerate certain parts of the pair style, by setting `OMP_NUM_THREADS` and using the [LAMMPS OpenMP package](https://docs.lammps.org/Speed_omp.html).

#### With Kokkos (GPU, optional, best performance)
`pair_allegro` supports the use of Kokkos to accelerate certain parts of the pair style on the GPU to avoid host-GPU transfers.
#### With Kokkos (GPU, optional, best performance, most reliable)
`pair_allegro` supports the use of Kokkos to accelerate the pair style on the GPU and avoid host-GPU transfers.
`pair_allegro` supports two setups for Kokkos: pair_style and model both on CPU, or both on GPU. Please ensure you build LAMMPS with the appropriate Kokkos backends enabled for your use case. For example, to use CUDA GPUs, add:
```
-DPKG_KOKKOS=ON -DKokkos_ENABLE_CUDA=ON
```
to your `cmake` command.
to your `cmake` command. See the [LAMMPS documentation](https://docs.lammps.org/Speed_kokkos.html) for more build options and how to correctly run LAMMPS with Kokkos.

### Building LAMMPS
```bash
@@ -106,14 +129,24 @@ This gives `lammps/build/lmp`, which can be run as usual with `/path/to/lmp -in

1. Q: My simulation is immediately or bizarrely unstable

A: Please ensure that your mapping from LAMMPS atom types to NequIP atom types, specified in the `pair_coeff` line, is correct.
A: Please ensure that your mapping from LAMMPS atom types to Allegro atom types, specified in the `pair_coeff` line, is correct, and that the units are consistent between your training data and your LAMMPS simulation.
2. Q: I get the following error:
```
instance of 'c10::Error'
what(): PytorchStreamReader failed locating file constants.pkl: file not found
```

A: Make sure you remembered to deploy (compile) your model using `nequip-deploy`, and that the path to the model given with `pair_coeff` points to a deployed model `.pth` file, **not** a file containing only weights like `best_model.pth`.
3. Q: The output pressures and stresses seem wrong / my NPT simulation is broken
3. Q: I get the following error:
```
instance of 'c10::Error'
what(): isTuple()INTERNAL ASSERT FAILED
```

A: We've seen this error occur when you try to load a TorchScript model deployed from PyTorch>1.11 in LAMMPS built against 1.11. Try redeploying your model (retraining not necessary) in a PyTorch 1.11 install.
4. Q: I get the following error:
```
Exception: Argument passed to at() was not in the map
```

A: NPT/stress support in LAMMPS for `pair_allegro` is in-progress and not yet available.
A: We now require models to have been trained with stress support, which is achieved by replacing `ForceOutput` with `StressForceOutput` in the training configuration. Note that you do not need to train on stress (though it may improve your potential, assuming your stress data is correct and converged). If you desperately wish to keep using a model without stress output, there are two options: 1) Remove lines that look like [these](https://github.com/mir-group/pair_allegro/blob/99036043e74376ac52993b5323f193dee3f4f401/pair_allegro_kokkos.cpp#L332-L343) in your version of `pair_allegro[_kokkos].cpp` 2) Redeploy the model with an updated config file, as described [here](https://github.com/mir-group/nequip/issues/69#issuecomment-1129273665).
193 changes: 193 additions & 0 deletions compute_allegro.cpp
@@ -0,0 +1,193 @@
/* ----------------------------------------------------------------------
LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
https://lammps.sandia.gov/, Sandia National Laboratories
Steve Plimpton, sjplimp@sandia.gov

Copyright (2003) Sandia Corporation. Under the terms of Contract
DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains
certain rights in this software. This software is distributed under
the GNU General Public License.

See the README file in the top-level LAMMPS directory.
------------------------------------------------------------------------- */

/* ----------------------------------------------------------------------
Contributing author: Anders Johansson (Harvard)
------------------------------------------------------------------------- */

#include "compute_allegro.h"
#include "atom.h"
#include "comm.h"
#include "error.h"
#include "force.h"
#include "memory.h"
#include "pair_allegro.h"
#include "update.h"

#include <cassert>
#include <cmath>
#include <cstring>
#include <iostream>
#include <numeric>
#include <sstream>
#include <string>
#include <torch/script.h>
#include <torch/torch.h>

using namespace LAMMPS_NS;

template<int peratom>
ComputeAllegro<peratom>::ComputeAllegro(LAMMPS *lmp, int narg, char **arg) : Compute(lmp, narg, arg)
{

if constexpr (!peratom) {
// compute 1 all allegro quantity length
if (narg != 5) error->all(FLERR, "Incorrect args for compute allegro");
} else {
// compute 1 all allegro/atom quantity length newton(1/0)
if (narg != 6) error->all(FLERR, "Incorrect args for compute allegro/atom");
}

if (strcmp(arg[1], "all") != 0)
error->all(FLERR, "compute allegro can only operate on group 'all'");

quantity = arg[3];
if constexpr (peratom) {
peratom_flag = 1;
nperatom = std::atoi(arg[4]);
newton = std::atoi(arg[5]);
if (newton) comm_reverse = nperatom;
size_peratom_cols = nperatom==1 ? 0 : nperatom;
nmax = -12;
if (comm->me == 0)
error->message(FLERR, "compute allegro/atom will evaluate the quantity {} of length {} with newton {}", quantity,
nperatom, newton);    // report nperatom: size_peratom_cols is 0 for scalar per-atom quantities
} else {
vector_flag = 1;
size_vector = std::atoi(arg[4]);
if (size_vector <= 0) error->all(FLERR, "Incorrect vector length!");
memory->create(vector, size_vector, "ComputeAllegro:vector");
if (comm->me == 0)
error->message(FLERR, "compute allegro will evaluate the quantity {} of length {}", quantity,
size_vector);
}

if (force->pair == nullptr) {
error->all(FLERR, "no pair style; compute allegro must be defined after pair style");
}

((PairAllegro<lowhigh> *) force->pair)->add_custom_output(quantity);
}

template<int peratom>
void ComputeAllegro<peratom>::init()
{
;
}

template<int peratom>
ComputeAllegro<peratom>::~ComputeAllegro()
{
if (copymode) return;
if constexpr (peratom) {
// array_atom owns the per-atom storage; vector_atom only aliases it when nperatom == 1
memory->destroy(array_atom);
} else {
memory->destroy(vector);
}
}

template<int peratom>
void ComputeAllegro<peratom>::compute_vector()
{
invoked_vector = update->ntimestep;

// empty domain, pair style won't store tensor
// note: assumes nlocal == inum
if (atom->nlocal == 0) {
for (int i = 0; i < size_vector; i++) {
vector[i] = 0.0;
}
} else {
const torch::Tensor &quantity_tensor =
((PairAllegro<lowhigh> *) force->pair)->custom_output.at(quantity).cpu().ravel();

auto quantity = quantity_tensor.data_ptr<double>();

if (quantity_tensor.size(0) != size_vector) {
error->one(FLERR, "size {} of quantity tensor {} does not match expected {} on rank {}",
quantity_tensor.size(0), this->quantity, size_vector, comm->me);
}

for (int i = 0; i < size_vector; i++) { vector[i] = quantity[i]; }
}

// even if empty domain
MPI_Allreduce(MPI_IN_PLACE, vector, size_vector, MPI_DOUBLE, MPI_SUM, world);
}

template<int peratom>
void ComputeAllegro<peratom>::compute_peratom()
{
invoked_peratom = update->ntimestep;

if (atom->nmax > nmax) {
nmax = atom->nmax;
memory->destroy(array_atom);
memory->create(array_atom, nmax, nperatom, "allegro/atom:array");
if (nperatom==1) vector_atom = &array_atom[0][0];
}

// guard against empty domain (pair style won't store tensor)
if (atom->nlocal > 0) {
const torch::Tensor &quantity_tensor =
((PairAllegro<lowhigh> *) force->pair)->custom_output.at(quantity).cpu().contiguous().reshape({-1,nperatom});

auto quantity = quantity_tensor.accessor<double,2>();
quantityptr = quantity_tensor.data_ptr<double>();

int nlocal = atom->nlocal;
for (int i = 0; i < nlocal; i++) {
for (int j = 0; j < nperatom; j++) {
array_atom[i][j] = quantity[i][j];
}
}
}

// even if empty domain
if (newton) comm->reverse_comm(this);
}

template<int peratom>
int ComputeAllegro<peratom>::pack_reverse_comm(int n, int first, double *buf)
{
int i, m, last;

m = 0;
last = first + n;
for (i = first; i < last; i++) {
for (int j = 0; j < nperatom; j++) {
buf[m++] = quantityptr[i*nperatom + j];
}
}
return m;
}

template<int peratom>
void ComputeAllegro<peratom>::unpack_reverse_comm(int n, int *list, double *buf)
{
int i, j, m;

m = 0;
for (i = 0; i < n; i++) {
j = list[i];
for (int k = 0; k < nperatom; k++) {
array_atom[j][k] += buf[m++];
}
}
}


namespace LAMMPS_NS {
template class ComputeAllegro<0>;
template class ComputeAllegro<1>;
}
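The pack/unpack pair above follows the usual reverse-communication pattern: ghost-atom rows are copied from the flat, row-major model output into a buffer, and the owning rank adds them to its own rows. A standalone sketch of that round trip, without LAMMPS or MPI, might look as follows. `pack_rows` and `unpack_add_rows` are hypothetical names, and `std::vector` stands in for the raw buffers.

```cpp
#include <vector>

// Copy `nperatom` values for each of the `n` atoms starting at index `first`
// from a flat row-major array into `buf`; returns the number of values packed.
int pack_rows(const std::vector<double> &flat, int nperatom, int first, int n,
              std::vector<double> &buf)
{
  int m = 0;
  for (int i = first; i < first + n; ++i)
    for (int j = 0; j < nperatom; ++j) buf[m++] = flat[i * nperatom + j];
  return m;
}

// Add the buffered rows onto the owned atoms named in `list`, in order.
void unpack_add_rows(std::vector<std::vector<double>> &owned, const std::vector<int> &list,
                     const std::vector<double> &buf, int nperatom)
{
  int m = 0;
  for (int j : list)
    for (int k = 0; k < nperatom; ++k) owned[j][k] += buf[m++];
}
```

For example, with two owned atoms holding `{1, 2}` and `{3, 4}` and one ghost row `{10, 20}` at index 2, packing that ghost and unpacking it onto owned atom 1 leaves atom 1 with `{13, 24}` and atom 0 unchanged.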