MolVS is a molecule validation and standardization tool, written in Python using the RDKit chemistry framework.
Building a collection of chemical structures from different sources can be difficult due to differing representations, drawing conventions and mistakes. MolVS can standardize chemical structures to improve data quality, help with de-duplication and identify relationships between molecules.
There are sensible defaults that make it easy to get started:
>>> from molvs import standardize_smiles
>>> standardize_smiles('[Na]OC(=O)c1ccc(C[S+2]([O-])([O-]))cc1')
'[Na+].O=C([O-])c1ccc(CS(=O)=O)cc1'
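As a rough illustration of the de-duplication use case mentioned above, the sketch below groups input SMILES strings by their standardized form. The helper name `group_by_standard_form` and its `standardize` parameter are hypothetical and not part of MolVS; the function accepts any callable that maps a SMILES string to a canonical string, so it runs without RDKit installed.

```python
from collections import defaultdict


def group_by_standard_form(smiles_list, standardize):
    """Group SMILES strings that share the same standardized form.

    `standardize` is any callable mapping a SMILES string to a canonical
    string (hypothetical parameter; MolVS's standardize_smiles is one
    candidate). Input order is preserved within each group.
    """
    groups = defaultdict(list)
    for smiles in smiles_list:
        groups[standardize(smiles)].append(smiles)
    return dict(groups)


def find_duplicates(smiles_list, standardize):
    """Return only the groups that contain more than one input structure."""
    grouped = group_by_standard_form(smiles_list, standardize)
    return {key: members for key, members in grouped.items() if len(members) > 1}
```

With MolVS installed, passing `standardize_smiles` from the example above as the `standardize` argument should group structures that differ only in representation, assuming it returns the same string for each of them.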
To install MolVS with Anaconda Python, simply run:
conda install -c conda-forge molvs
Alternatively, try one of the other installation options.
Full documentation is available at https://molvs.readthedocs.io.
- Feature ideas and bug reports are welcome on the Issue Tracker.
- Fork the source code on GitHub, make changes and send a pull request.
MolVS is licensed under the MIT license.
There are a number of projects with similar goals that take differing approaches: