We generate a specific type embedding vector for each atom type so that a single descriptor embedding net and a single fitting net can be shared by all atom types, which substantially reduces training complexity.
The training input script is similar to that of `se_e2_a`, but differs by adding the {ref}`type_embedding <model/type_embedding>` section.
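Usually, when the type embedding approach is not enabled, for a system with multiple chemical species ($|\{\alpha_i\}| > 1$), the parameters of the embedding network $\mathcal{N}_{e,2}$ are chemical-species-wise:

$$
    (\mathcal{G}^i)_j = \mathcal{N}^{\alpha_i, \alpha_j}_{e,2}(s(r_{ij})) \quad \mathrm{or} \quad
    (\mathcal{G}^i)_j = \mathcal{N}^{\alpha_j}_{e,2}(s(r_{ij})).
$$

Thus, there will be $N_t^2$ or $N_t$ embedding networks, where $N_t$ is the number of chemical species. When there are large numbers of chemical species, the number of embedding networks grows accordingly, which requires more memory and decreases computing efficiency.

Similar to the embedding networks, if the type embedding approach is not used, the fitting network parameters are chemical-species-wise, and there are $N_t$ sets of fitting network parameters. For example, the atomic energy $E_i$ is represented as

$$
    E_i = \mathcal{F}_0^{\alpha_i}(\mathcal{D}^i),
$$

where $\mathcal{D}^i$ is the descriptor of atom $i$.

To reduce the number of NN parameters and improve computing efficiency when there are large numbers of chemical species, the type embedding $\mathcal{A}$ is introduced, represented as a NN function $\mathcal{N}_t$ of the atomic type $\alpha$:

$$
    \mathcal{A}^i = \mathcal{N}_t\big(\text{one hot}(\alpha_i)\big),
$$

where $\alpha_i$ is converted to a one-hot vector representing the chemical species before being fed to the NN. The type embeddings of the central and neighboring atoms, $\mathcal{A}^i$ and $\mathcal{A}^j$, are added as an extra input of the embedding network:

$$
    (\mathcal{G}^i)_j = \mathcal{N}_{e,2}(\{s(r_{ij}), \mathcal{A}^i, \mathcal{A}^j\}) \quad \mathrm{or} \quad
    (\mathcal{G}^i)_j = \mathcal{N}_{e,2}(\{s(r_{ij}), \mathcal{A}^j\}).
$$

In fitting networks, the type embedding is inserted into the input of the fitting networks:

$$
    E_i = \mathcal{F}_0(\{\mathcal{D}^i, \mathcal{A}^i\}).
$$

In this way, all chemical species share the same network parameters through the type embedding.[^1]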
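The sharing of one fitting network across species can be sketched in NumPy. This is only an illustration under stated assumptions, not DeePMD-kit's implementation: the one-layer `type_embed` and `fit` functions, the dimensions `embed_dim` and `desc_dim`, and the random placeholder descriptors are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_types = 2     # e.g. O and H
embed_dim = 8   # length of a type embedding vector (assumed)
desc_dim = 16   # length of one atom's descriptor (assumed)

# Atom types of a toy frame: one O followed by two H.
atom_types = np.array([0, 1, 1])
n_atoms = atom_types.size

# One-hot encoding of the atom types, shape (n_atoms, n_types).
one_hot = np.eye(n_types)[atom_types]

# Hypothetical one-layer type embedding net (stand-in for the real net).
W_t = rng.normal(size=(n_types, embed_dim))
b_t = rng.normal(size=embed_dim)


def type_embed(x):
    """Map one-hot type vectors to type embedding vectors."""
    return np.tanh(x @ W_t + b_t)


type_embedding = type_embed(one_hot)  # shape (n_atoms, embed_dim)

# Placeholder descriptors; a real descriptor would come from the environment.
descriptors = rng.normal(size=(n_atoms, desc_dim))

# A single fitting net shared by all species (one linear layer as a stand-in).
W_f = rng.normal(size=(desc_dim + embed_dim, 1))
b_f = rng.normal(size=1)


def fit(x):
    """Map the concatenated (descriptor, type embedding) input to atomic energies."""
    return (x @ W_f + b_f)[:, 0]


# The type embedding is appended to the descriptor before the fitting net.
fit_input = np.concatenate([descriptors, type_embedding], axis=1)
atomic_energies = fit(fit_input)
total_energy = atomic_energies.sum()
```

Both hydrogen atoms receive the same type embedding vector, and only one set of fitting parameters (`W_f`, `b_f`) exists regardless of how many species are present.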
The {ref}`model <model>` defines how the model is constructed, adding a section of type embedding net:
```json
"model": {
    "type_map": ["O", "H"],
    "type_embedding": {
        ...
    },
    "descriptor": {
        ...
    },
    "fitting_net": {
        ...
    }
}
```
The model will automatically apply the type embedding approach and generate type embedding vectors. If a type embedding vector is detected, the descriptor and the fitting net take it as part of their input.
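As a rough illustration, a `type_embedding` section can also be added to an existing model definition programmatically. The sketch below uses only Python's standard `json` module. The empty `descriptor` and `fitting_net` dictionaries are placeholders for this example, and the `neuron`, `resnet_dt`, and `seed` values are illustrative choices rather than recommendations.

```python
import json

# Placeholder model section; a real input would fill in descriptor and fitting_net.
model = {
    "type_map": ["O", "H"],
    "descriptor": {},
    "fitting_net": {},
}

# Adding this section is what enables the type embedding approach.
model["type_embedding"] = {
    "neuron": [2, 4, 8],
    "resnet_dt": False,
    "seed": 1,
}

# Serialize back to JSON, as it would appear in the training input script.
model_json = json.dumps({"model": model}, indent=4)
print(model_json)
```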
The construction of the type embedding net is given by {ref}`type_embedding <model/type_embedding>`. An example of {ref}`type_embedding <model/type_embedding>` is provided as follows:
```json
"type_embedding": {
    "neuron": [2, 4, 8],
    "resnet_dt": false,
    "seed": 1
}
```
- The {ref}`neuron <model/type_embedding/neuron>` specifies the size of the type embedding net. From left to right, the members denote the sizes of each hidden layer from the input end to the output end, respectively. The net takes a one-hot vector as input, and its output dimension equals the last entry of the {ref}`neuron <model/type_embedding/neuron>` list. If the outer layer is twice the size of the inner layer, then the inner layer is copied and concatenated, and a ResNet architecture is built between them.
- If the option {ref}`resnet_dt <model/type_embedding/resnet_dt>` is set to `true`, then a timestep is used in the ResNet.
- {ref}`seed <model/type_embedding/seed>` gives the random seed that is used to generate random numbers when initializing the model parameters.
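The roles of the three options can be sketched with a small NumPy forward pass. This is a hedged illustration, not DeePMD-kit's code: the function `type_embedding_net`, the `tanh` activation, the weight initialization, and the fixed (untrained) timestep values are assumptions of this sketch, as is the plain skip connection used when a layer keeps the same width.

```python
import numpy as np


def type_embedding_net(one_hot, neuron, resnet_dt=False, seed=1):
    """Toy forward pass through a type embedding net with ResNet-style layers."""
    rng = np.random.default_rng(seed)  # seed controls parameter initialization
    x = one_hot
    for out_dim in neuron:
        in_dim = x.shape[1]
        W = rng.normal(scale=1.0 / np.sqrt(in_dim + out_dim), size=(in_dim, out_dim))
        b = rng.normal(size=out_dim)
        h = np.tanh(x @ W + b)
        if resnet_dt:
            # Per-neuron timestep scaling the residual branch (assumed values).
            dt = rng.normal(loc=0.1, scale=0.001, size=out_dim)
            h = h * dt
        if out_dim == 2 * in_dim:
            # Outer layer twice the inner one: copy and concatenate the input,
            # then add the layer output as a residual.
            x = np.concatenate([x, x], axis=1) + h
        elif out_dim == in_dim:
            # Same width: plain skip connection (assumption of this sketch).
            x = x + h
        else:
            # No skip connection possible; use the layer output directly.
            x = h
    return x


n_types = 2  # e.g. O and H
embeddings = type_embedding_net(np.eye(n_types), neuron=[2, 4, 8])
print(embeddings.shape)  # (2, 8)
```

Each row of `embeddings` is the type embedding vector of one species, and its length equals the last entry of `neuron`.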
A complete training input script of this example can be found at:

```bash
$deepmd_source_dir/examples/water/se_e2_a_tebd/input.json
```
See here for further explanation of type embedding.
:::{note}
You can't apply the compression method while using the atom type embedding.
:::
[^1]: This section is built upon Jinzhe Zeng, Duo Zhang, Denghui Lu, Pinghui Mo, Zeyu Li, Yixiao Chen, Marián Rynik, Li'ang Huang, Ziyao Li, Shaochen Shi, Yingze Wang, Haotian Ye, Ping Tuo, Jiabin Yang, Ye Ding, Yifan Li, Davide Tisi, Qiyu Zeng, Han Bao, Yu Xia, Jiameng Huang, Koki Muraoka, Yibo Wang, Junhan Chang, Fengbo Yuan, Sigbjørn Løland Bore, Chun Cai, Yinnian Lin, Bo Wang, Jiayan Xu, Jia-Xin Zhu, Chenxing Luo, Yuzhi Zhang, Rhys E. A. Goodall, Wenshuo Liang, Anurag Kumar Singh, Sikai Yao, Jingchao Zhang, Renata Wentzcovitch, Jiequn Han, Jie Liu, Weile Jia, Darrin M. York, Weinan E, Roberto Car, Linfeng Zhang, Han Wang, J. Chem. Phys. 159, 054801 (2023), licensed under a Creative Commons Attribution (CC BY) license.