MMPolymer is a multimodal multitask pretraining framework that incorporates both 1D sequential and 3D structural information into polymer property prediction.
Figure: Overview of the proposed MMPolymer framework.
You can try MMPolymer online by clicking on this link
- The code has been tested in the following environment:

| Package | Version |
| --- | --- |
| Python | 3.8.13 |
| PyTorch | 1.11.0 |
| CUDA | 11.3.1 |
| RDKit | 2022.9.5 |

- Please install the environment via the yaml file:

conda env create -f env.yml
conda activate MMPolymer
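After activating the environment, it can be useful to compare the installed versions against the tested ones listed above. The following is a minimal sketch, not part of the repository; it assumes the packages are registered under the distribution names `torch` and `rdkit` (this may differ for some conda builds), and it does not check the CUDA version.

```python
import sys
from importlib import metadata

# Tested versions from the environment list above.
# The distribution names "torch" and "rdkit" are assumptions.
TESTED = {"torch": "1.11.0", "rdkit": "2022.9.5"}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare_environment(tested=TESTED):
    """Map each package name to (tested_version, installed_version_or_None)."""
    python_version = "{}.{}.{}".format(*sys.version_info[:3])
    report = {"python": ("3.8.13", python_version)}
    for name, version in tested.items():
        report[name] = (version, installed_version(name))
    return report


for name, (want, have) in compare_environment().items():
    # startswith() tolerates local build suffixes such as "+cu113".
    status = "ok" if have is not None and have.startswith(want) else "differs"
    print(f"{name}: tested {want}, installed {have} ({status})")
```

A "differs" line is not necessarily a problem; it only indicates that the environment is not the one the code was tested in.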
The original data have been placed in the folder ./dataset/data; please process them further as follows:
cd dataset
python pretrain_data_process.py
python finetune_data_process.py
Please download the checkpoint and place it in the folder ./ckpt
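Before launching the training or inference scripts, a quick check that the two folders mentioned above exist and are not empty can save a failed run. This is a small sketch under stated assumptions: the helper name `missing_prerequisites` is hypothetical, and it only looks at folder contents, since the exact file names inside ./dataset/data and ./ckpt are not specified here.

```python
from pathlib import Path


def missing_prerequisites(repo_root="."):
    """Return a list of problems; an empty list means both folders look ready."""
    root = Path(repo_root)
    problems = []
    # Folder names are taken from the instructions above.
    for rel in ("dataset/data", "ckpt"):
        folder = root / rel
        if not folder.is_dir():
            problems.append(f"{rel}: folder not found")
        elif not any(folder.iterdir()):
            problems.append(f"{rel}: folder is empty")
    return problems


# Run from the repository root; prints nothing when everything is in place.
for issue in missing_prerequisites():
    print("missing:", issue)
```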
bash train.sh
bash inference.sh
After training, you can use the following scripts for practical applications:
- Take psmiles (e.g., *CC(*)C) as input and predict all properties
python get_prediction_results.py --input_data '*CC(*)C'
- Take a csv file as input and predict all properties
python get_prediction_results.py --input_data $CSV_FILE_PATH
- If you just want to predict a specific property (e.g., Eat)
python get_prediction_results.py --input_data '*CC(*)C' --property Eat
python get_prediction_results.py --input_data $CSV_FILE_PATH --property Eat
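When calling the prediction script from other Python code, it may help to sanity-check the input and assemble the command programmatically. The sketch below is illustrative only: `looks_like_psmiles` and `build_command` are hypothetical helpers, the check assumes a linear repeat unit with exactly two `*` endpoints and is not a chemical validity test, and only the two flags shown above (`--input_data`, `--property`) are used.

```python
import sys


def looks_like_psmiles(text):
    """Cheap check: balanced parentheses and exactly two '*' endpoints."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closing parenthesis without a matching opener
                return False
    return depth == 0 and text.count("*") == 2


def build_command(input_data, prop=None):
    """Build the argument list for get_prediction_results.py."""
    cmd = [sys.executable, "get_prediction_results.py", "--input_data", input_data]
    if prop is not None:
        cmd += ["--property", prop]
    return cmd


psmiles = "*CC(*)C"
if looks_like_psmiles(psmiles):
    # Print the command instead of running it; pass the list to
    # subprocess.run(...) from the repository root to execute it.
    print(" ".join(build_command(psmiles, "Eat")))
```

Passing the arguments as a list avoids shell quoting issues with the `*` and parentheses in P-SMILES strings.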
If this work is helpful to you, please cite it:
@inproceedings{wang2024mmpolymer,
title={MMPolymer: A Multimodal Multitask Pretraining Framework for Polymer Property Prediction},
author={Wang, Fanmeng and Guo, Wentao and Cheng, Minjie and Yuan, Shen and Xu, Hongteng and Gao, Zhifeng},
booktitle={Proceedings of the 33rd ACM International Conference on Information and Knowledge Management},
pages={2336--2346},
year={2024}
}
This code is built upon Uni-Mol and Uni-Core. Thanks for their contributions.