Bug summary
Hello everyone. I encountered an error when using a fine-tuned DPA2 model in a LAMMPS MC simulation. The error message is shown below; I don't know what caused it. I would appreciate any help.
DeePMD-kit Version
DeePMD-kit v3.0.0b4
Backend and its version
PyTorch v2.0.0.post200-gc263bd43e8e
How did you download the software?
Offline packages
Input Files, Running Commands, Error Log, etc.
ERROR on proc 2: DeePMD-kit C API Error: DeePMD-kit Error: DeePMD-kit PyTorch backend error: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/deepmd/pt/model/model/ener_model.py", line 56, in forward_lower
comm_dict: Optional[Dict[str, Tensor]]=None) -> Dict[str, Tensor]:
_5 = (self).need_sorted_nlist_for_lower()
model_ret = (self).forward_common_lower(extended_coord, extended_atype, nlist, mapping, fparam, aparam, do_atomic_virial, comm_dict, _5, )
~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_6 = (self).get_fitting_net()
model_predict = annotate(Dict[str, Tensor], {})
File "code/torch/deepmd/pt/model/model/ener_model.py", line 213, in forward_common_lower
cc_ext, _36, fp, ap, input_prec, = _35
atomic_model = self.atomic_model
atomic_ret = (atomic_model).forward_common_atomic(cc_ext, extended_atype, nlist0, mapping, fp, ap, comm_dict, )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_37 = (self).atomic_output_def()
training = self.training
File "code/torch/deepmd/pt/model/atomic_model/energy_atomic_model.py", line 50, in forward_common_atomic
ext_atom_mask = (self).make_atom_mask(extended_atype, )
_3 = torch.where(ext_atom_mask, extended_atype, 0)
ret_dict = (self).forward_atomic(extended_coord, _3, nlist, mapping, fparam, aparam, comm_dict, )
~~~~~~~~~~~~~~~~~~~~ <--- HERE
ret_dict0 = (self).apply_out_stat(ret_dict, atype, )
_4 = torch.slice(torch.slice(ext_atom_mask), 1, None, nloc)
File "code/torch/deepmd/pt/model/atomic_model/energy_atomic_model.py", line 93, in forward_atomic
pass
descriptor = self.descriptor
_16 = (descriptor).forward(extended_coord, extended_atype, nlist, mapping, comm_dict, )
~~~~~~~~~~~~~~~~~~~ <--- HERE
descriptor0, rot_mat, g2, h2, sw, = _16
fitting_net = self.fitting_net
File "code/torch/deepmd/pt/model/descriptor/dpa2.py", line 84, in forward
repformers3 = self.repformers
_17 = nlist_dict[_1(_16, (repformers3).get_nsel(), )]
_18 = (repformers1).forward(_17, extended_coord, extended_atype, g11, mapping0, comm_dict0, )
~~~~~~~~~~~~~~~~~~~~ <--- HERE
g12, g2, h2, rot_mat, sw, = _18
concat_output_tebd = self.concat_output_tebd
File "code/torch/deepmd/pt/model/descriptor/repformers.py", line 226, in forward
_32 = torch.tensor(nloc)
_33 = torch.tensor(torch.sub(nall, nloc))
ret = ops.deepmd.border_op(_25, _26, _27, _28, _29, g10, _31, _32, _33)
~~~~~~~~~~~~~~~~~~~~ <--- HERE
g1_ext, comm_dict6, mapping6 = torch.unsqueeze(ret[0], 0), comm_dict7, mapping2
_34 = (_00).forward(g1_ext, g23, h2, nlist0, nlist_mask, sw1, )
Traceback of TorchScript, original code (most recent call last):
File "/home/zhaochenhao/soft/deepmd3.0b3/lib/python3.10/site-packages/deepmd/pt/model/model/ener_model.py", line 109, in forward_lower
comm_dict: Optional[Dict[str, torch.Tensor]] = None,
):
model_ret = self.forward_common_lower(
~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
extended_coord,
extended_atype,
File "/home/zhaochenhao/soft/deepmd3.0b3/lib/python3.10/site-packages/deepmd/pt/model/model/make_model.py", line 261, in forward_common_lower
)
del extended_coord, fparam, aparam
atomic_ret = self.atomic_model.forward_common_atomic(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
cc_ext,
extended_atype,
File "/home/zhaochenhao/soft/deepmd3.0b3/lib/python3.10/site-packages/deepmd/pt/model/atomic_model/base_atomic_model.py", line 241, in forward_common_atomic
File "/home/zhaochenhao/soft/deepmd3.0b3/lib/python3.10/site-packages/deepmd/pt/model/atomic_model/dp_atomic_model.py", line 189, in forward_atomic
if self.do_grad_r() or self.do_grad_c():
extended_coord.requires_grad_(True)
descriptor, rot_mat, g2, h2, sw = self.descriptor(
~~~~~~~~~~~~~~~ <--- HERE
extended_coord,
extended_atype,
File "/home/zhaochenhao/soft/deepmd3.0b3/lib/python3.10/site-packages/deepmd/pt/model/descriptor/dpa2.py", line 652, in forward
g1 = g1_ext
# repformer
g1, g2, h2, rot_mat, sw = self.repformers(
~~~~~~~~~~~~~~~ <--- HERE
nlist_dict[
get_multiple_nlist_key(
File "/home/zhaochenhao/soft/deepmd3.0b3/lib/python3.10/site-packages/deepmd/pt/model/descriptor/repformers.py", line 480, in forward
assert "recv_num" in comm_dict
assert "communicator" in comm_dict
ret = torch.ops.deepmd.border_op(
~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
comm_dict["send_list"],
comm_dict["send_proc"],
RuntimeError: Trying to create tensor with negative dimension -1873441304: [-1873441304]
(/home/conda/feedstock_root/build_artifacts/deepmd-kit_1722057353391/work/source/lmp/pair_deepmd.cpp:586)
Last command: run 150000
MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
Steps to Reproduce
The LAMMPS input file is as follows:
label i
variable i loop 2
variable ts equal 0+300*$i
variable ta equal 0+300*$i
shell mkdir dpav1-${ta}
units metal
boundary p p p
atom_style atomic
timestep 0.001
read_data min.data
pair_style deepmd ../dpav1.pth
pair_coeff * * x x x
compute 1 all temp
compute Ek all ke/atom
compute Ep all pe/atom
compute_modify 1 dynamic yes
thermo_style custom step dt time temp ke pe etotal press lx ly lz vol
thermo 100
dump 1 all custom 5000 dpav1-${ta}/dumpthermo.atom.* id type x y z c_Ek c_Ep
velocity all create ${ts} 82765577 rot yes dist gaussian
fix r2 all npt temp ${ta} ${ta} 0.1 iso 0.0 0.0 1.0
fix mc4 all atom/swap 20 5 82765577 ${ts} types 1 2
fix mc5 all atom/swap 20 5 82765577 ${ts} types 1 3
fix mc6 all atom/swap 20 5 82765577 ${ts} types 2 3
run 100000
min_style cg
minimize 1.0e-6 1.0e-7 10000 10000
clear
next i
jump SELF i
Further Information, Files, and Links
No response
How many atoms are there? It looks like an integer overflow error. Could you provide the files to reproduce the error?
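For readers wondering why the negative dimension in the log suggests an integer problem, here is a minimal Python sketch of the arithmetic. It assumes (this is not confirmed anywhere in the thread) that the size was held in a signed 32-bit integer somewhere along the way; the helper `as_int32` is hypothetical and is not part of DeePMD-kit. It only shows that the reported value is what a 32-bit bit pattern above 2^31 - 1 looks like when read as signed. It does not identify where inside `border_op` the value comes from.

```python
import struct


def as_int32(value: int) -> int:
    """Reinterpret the low 32 bits of `value` as a signed two's-complement int.

    Hypothetical helper for illustration only; not a DeePMD-kit function.
    """
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


# Dimension reported in the RuntimeError above.
observed = -1873441304

# The same 32 bits read as an unsigned number.
unsigned = observed & 0xFFFFFFFF
print(unsigned)

# Reading those bits back as signed reproduces the value in the log.
print(as_int32(unsigned))

# A sane count (for example 108 atoms) is unaffected by the reinterpretation.
print(as_int32(108))
```

A value of this magnitude is far larger than anything a small system would legitimately need, so the number itself is probably not a real count.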
108 atoms. Here are the files: file.zip
I could not reproduce the error with this input on the current devel branch, in either single-process or multi-process runs. The issue may already be fixed on the devel branch.