AIRI-Institute/doped_CsPbI3_energetics2

Cd/Zn- and Br-doped CsPbI3 Energetics: DFT-derived Properties and GNN-based Predictions

Lead halide perovskites are a well-known family of functional materials for optoelectronic applications. The γ-phase of CsPbI3 retains favorable optoelectronic characteristics, such as a direct bandgap and high charge-carrier mobility. However, large-scale application of CsPbI3 is hindered by its polymorphic transition into the undesirable δ-CsPbI3 phase, which possesses no useful properties. One of the many methods for stabilizing the γ-phase is partial substitution of Pb2+ by Cd2+/Zn2+ and of I- by Br-. Such chemical modifications dramatically increase the complexity of the corresponding compositional/configurational space (CCS) from a computational and predictive perspective. Because of the size of this space, density functional theory (DFT) calculations for assessing thermodynamic properties are complemented by modern data-driven solutions, e.g. those based on graph neural network (GNN) architectures.

If you are using this dataset in your research paper, please cite us as

@article{none,
      title={none}, 
      author={none},
      year={2025},
      eprint={none},
      archivePrefix={none},
      primaryClass={none},
      url={none}, 
}

graphical abstract

Dataset

The dataset contains Cd/Zn- and Br-doped CsPbI3 systems in two polymorphic modifications, together with predictions of their formation energies made by two Allegro GNN models trained on the DFT-derived properties. We consider only the two limiting cases, the Allegro models fine-tuned on the PLS and PHS datasets, because the test RMSE increases monotonically with the fraction of high-symmetry structures in the training data. For each combination of metal dopant (Cd or Zn) and material phase (black γ-CsPbI3 or yellow δ-CsPbI3), listed in the table below, we created a distinct CCS.

| CsPbI3 phase | Cd dopant | Zn dopant |
| --- | --- | --- |
| Black | CCS_black_Cd | CCS_black_Zn |
| Yellow | CCS_yellow_Cd | CCS_yellow_Zn |

Each of these is provided as a pandas dataframe containing the crystal structure in CIF format, metainformation columns, DFT-calculated energies, subsample indicators, and GNN predictions. The columns "Formula", "Atomic_numbers", "Cell", "Pos", "Relaxed_*", and "Formation_energy_pa" contain data for DFT-calculated structures only, namely 545 and 1006 values for CCS_black_* and CCS_yellow_*, respectively. Missing values in these columns are marked by NaN. A more detailed description is given in the table below.

| Column tag | Content description |
| --- | --- |
| CIF_data | Data representing the crystal structure in CIF format |
| CIF_filename | Structure name (unique across all dataframes) |
| Phase | Black/yellow (corresponds to the phase studied) |
| Dopant | Cd/Zn (dopant type in the structure) |
| Dopant_content | Dopant content (in mol. %) |
| Br_content | Br content (in mol. %) |
| Space_group_no | Space group number of the doped structure before relaxation |
| Space_group_symbol | Space group symbol of the doped structure before relaxation (Hermann-Mauguin notation) |
| Weight | Number of symmetrically equivalent structures within the combinatorial composition/configuration space |
| Pb_4b_position_substitution | Number of substituted Pb atoms at Wyckoff site 4b in a supercell (relevant for black phase only) |
| I_4c_position_substitution | Number of substituted I atoms at Wyckoff site 4c in a supercell (relevant for black phase only) |
| I_8d_position_substitution | Number of substituted I atoms at Wyckoff site 8d in a supercell (relevant for black phase only) |
| Pb_4c_position_substitution | Number of substituted Pb atoms at Wyckoff site 4c in a supercell (relevant for yellow phase only) |
| I1_4c_position_substitution | Number of substituted I atoms at Wyckoff site 4c (type 1) in a supercell (relevant for yellow phase only) |
| I2_4c_position_substitution | Number of substituted I atoms at Wyckoff site 4c (type 2) in a supercell (relevant for yellow phase only) |
| I3_4c_position_substitution | Number of substituted I atoms at Wyckoff site 4c (type 3) in a supercell (relevant for yellow phase only) |
| Formula | Gross chemical formula of the structure in the format {'chemical element symbol': its amount in a supercell} |
| Atomic_numbers | Atomic numbers of the chemical elements in the structure |
| Cell | Model basis vectors (in angstroms) before relaxation (constant for a given phase) |
| Pos | Atomic positions (in angstroms) before relaxation (same order as Atomic_numbers) |
| Relaxed_cell | Model basis vectors (in angstroms) after DFT relaxation |
| Relaxed_pos | Atomic positions (in angstroms) after DFT relaxation (same order as Atomic_numbers) |
| Relaxed_pressure | Pressure (in kbar) for the DFT-relaxed structure |
| Relaxed_forces | Atomic forces (in eV/angstrom) for the DFT-relaxed structure (same order as Atomic_numbers) |
| Relaxed_energy | Relaxed energy per cell (in eV) for the DFT-relaxed structure |
| Relaxed_energy_pa | Relaxed energy per atom (in eV/atom) for the DFT-relaxed structure |
| Formation_energy_pa | Formation energy per atom (in eV/atom) for the DFT-relaxed structure |
| PHS_train | Boolean flag showing whether the structure is in the PHS training subset |
| PHS_val | Boolean flag showing whether the structure is in the PHS validation subset |
| PHS_test | Boolean flag showing whether the structure is in the PHS test subset |
| PLS_train | Boolean flag showing whether the structure is in the PLS training subset |
| PLS_val | Boolean flag showing whether the structure is in the PLS validation subset |
| PLS_test | Boolean flag showing whether the structure is in the PLS test subset |
| PHS_model_fepa_prediction | Structure formation energy per atom (in eV/atom) predicted by the GNN model trained on the PHS training subset |
| PLS_model_fepa_prediction | Structure formation energy per atom (in eV/atom) predicted by the GNN model trained on the PLS training subset |
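
To show how the subset flags, DFT energies, and prediction columns fit together, here is a minimal sketch. It builds a tiny synthetic dataframe that reuses a few of the documented column names. All numeric values are made up, and the helper `subset_rmse` is hypothetical; how the real dataframes are stored and loaded is not covered here. The sketch selects the DFT-calculated rows (those without NaN in Formation_energy_pa) and computes each model's RMSE on its own test subset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one CCS dataframe; the values are illustrative only.
df = pd.DataFrame({
    "CIF_filename": ["s0", "s1", "s2", "s3", "s4"],
    "Formation_energy_pa": [-1.00, -1.10, np.nan, -0.90, -1.20],
    "PHS_test": [True, False, False, True, False],
    "PLS_test": [False, True, False, False, True],
    "PHS_model_fepa_prediction": [-1.02, -1.08, -0.95, -0.94, -1.25],
    "PLS_model_fepa_prediction": [-0.99, -1.13, -0.97, -0.91, -1.16],
})


def subset_rmse(frame, split):
    """RMSE (eV/atom) of the `split` model ('PHS' or 'PLS') on its test subset."""
    # Only rows with a DFT reference can be scored; the others hold NaN.
    mask = frame[f"{split}_test"] & frame["Formation_energy_pa"].notna()
    predicted = frame.loc[mask, f"{split}_model_fepa_prediction"]
    reference = frame.loc[mask, "Formation_energy_pa"]
    return float(np.sqrt(((predicted - reference) ** 2).mean()))


# Rows that were actually calculated with DFT.
dft_rows = df[df["Formation_energy_pa"].notna()]

rmse_phs = subset_rmse(df, "PHS")
rmse_pls = subset_rmse(df, "PLS")
print(len(dft_rows), rmse_phs, rmse_pls)
```

The same mask pattern applies to the *_train and *_val flags. Prediction columns are filled for every structure, so rows without DFT data can still be compared between the two models.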

Scripts

The repository also contains a data_processing.py file with visualization functions and usage examples. With it, you can visualize statistics on the CCSs, energy distributions, and group-subgroup relations.
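
The functions in data_processing.py are not reproduced here. As an independent sketch of the kind of CCS statistic mentioned above, the snippet below computes a weighted energy distribution in which each symmetry-inequivalent structure counts Weight times. The data are synthetic, and the helper names `weighted_energy_histogram` and `weighted_mean` are hypothetical, not part of the repository.

```python
import numpy as np
import pandas as pd

# Synthetic example rows using documented column names; values are made up.
df = pd.DataFrame({
    "Dopant_content": [0.0, 12.5, 12.5, 25.0],
    "Weight": [1, 4, 2, 8],
    "PLS_model_fepa_prediction": [-1.00, -0.98, -0.96, -0.90],
})


def weighted_energy_histogram(frame, column, bins):
    """Histogram of `column` where each row contributes `Weight` counts."""
    return np.histogram(frame[column], bins=bins, weights=frame["Weight"])


def weighted_mean(group, column):
    """Weight-averaged value of `column` within one group of structures."""
    return float(np.average(group[column], weights=group["Weight"]))


# Total number of configurations represented by the inequivalent structures.
total_configurations = int(df["Weight"].sum())

counts, edges = weighted_energy_histogram(
    df, "PLS_model_fepa_prediction", bins=[-1.05, -0.95, -0.85]
)

# Weight-averaged predicted formation energy for each dopant content.
mean_by_content = {
    content: weighted_mean(group, "PLS_model_fepa_prediction")
    for content, group in df.groupby("Dopant_content")
}
print(total_configurations, counts, mean_by_content)
```

Weighting by Weight makes the statistics reflect the full combinatorial space rather than only the symmetry-inequivalent representatives stored in the dataframe.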

Model

Learning Local Equivariant Representations for Large-Scale Atomistic Dynamics (Allegro)
