GTM_WAE is a Python package that implements a Wasserstein Autoencoder (WAE) with attention layers in the encoder, together with a collection of notebooks for using it with a non-linear dimensionality reduction method, Generative Topographic Mapping (GTM). A map built on the WAE latent vectors visualizes the complex multidimensional latent space in 2D, making it easy to explore by eye. The maps serve as guides for selecting zones from which to sample latent vectors that are likely to decode to peptides with desired properties.
- 🗺️ Peptide Space Visualization: Visualize the latent space as 2D maps that are easy to interpret by eye.
- 🔬 Property analysis: Colour the maps according to any property and locate clusters of peptides with particular properties.
- 📊 Motif analysis: Identify the predominant peptide motifs that are important for the presence of a property in a peptide cluster.
- 🚀 Explainable de novo generation: Use map zones populated with peptides with desired properties for the de novo generation of analogues.
- 💊 Multi-property constrained generation: Colour maps according to several properties (e.g., activity, cytotoxicity) to perform multi-property constrained generation.
- 🔍 Library comparison: Compare different libraries or databases to analyze their diversity and coverage.
The public data used for WAE training and GTM construction is available on the Hugging Face Hub 🤗: Peptide data for GTM_WAE
Before installing GTM_WAE, make sure that Python is installed and that its version is earlier than 3.12. Conda or Miniforge is recommended for managing Python environments and dependencies.
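The version constraint above can be checked from the interpreter you plan to use. The following is a minimal sketch: the helper name `python_version_ok` is hypothetical and not part of GTM_WAE, and it checks only the upper bound stated here (earlier than 3.12).

```python
import sys


# Hypothetical helper, not part of GTM_WAE: checks only the upper bound
# stated in this README (Python earlier than 3.12).
def python_version_ok(version=None):
    """Return True if the (major, minor) version is earlier than 3.12."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) < (3, 12)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_version_ok() else "too new for GTM_WAE"
    print(f"Python {current}: {status}")
```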
Create and activate a Conda environment:
conda create -n gtm_wae_env python=3.10 -c conda-forge
conda activate gtm_wae_env
Clone the repository with git:
git clone https://github.com/Laboratoire-de-Chemoinformatique/GTM_WAE.git
Install GTM_WAE using pip after activating your environment:
cd GTM_WAE/
pip install -e .
To use GTM_WAE within Jupyter notebooks, you'll need to add your Conda or virtual environment as a kernel:
python -m ipykernel install --user --name=gtm_wae_env --display-name="GTM_WAE"
This lets you select the environment as a kernel inside Jupyter.
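To confirm that the kernel was registered, you can inspect the output of `jupyter kernelspec list --json`. The sketch below assumes that `jupyter` is on your PATH and that the kernel name is `gtm_wae_env`, as in the command above; the helper `kernel_registered` is hypothetical and not part of GTM_WAE.

```python
import json
import shutil
import subprocess


# Hypothetical helper, not part of GTM_WAE.
def kernel_registered(kernelspec_json, name="gtm_wae_env"):
    """Return True if `name` is listed in `jupyter kernelspec list --json` output."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    return name in specs


if __name__ == "__main__":
    if shutil.which("jupyter") is None:
        print("jupyter was not found on PATH")
    else:
        result = subprocess.run(
            ["jupyter", "kernelspec", "list", "--json"],
            capture_output=True, text=True, check=False,
        )
        try:
            print("kernel registered:", kernel_registered(result.stdout))
        except json.JSONDecodeError:
            # The command failed or printed something other than JSON.
            print("could not read the kernel list:", result.stderr.strip())
```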
To update GTM_WAE to the latest version:
- Go to the folder where GTM_WAE was cloned:
cd GTM_WAE/
- Pull the new version with git:
git pull
You will need to specify your login and access token.
If you did not install GTM_WAE with the -e option, you also need to update it manually in your environment:
- Activate your environment:
conda activate gtm_wae_env
- Install the package:
pip install .
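If you are unsure whether the existing installation is editable, recent versions of pip record this in a `direct_url.json` metadata file (PEP 610). The following is a minimal sketch of reading it: the helper names are hypothetical, and the distribution name `GTM_WAE` is an assumption, so check `pip list` for the actual name.

```python
import json
from importlib import metadata


# Hypothetical helpers, not part of GTM_WAE.
def direct_url_is_editable(direct_url_json):
    """Return True if a PEP 610 direct_url.json document marks an editable install."""
    data = json.loads(direct_url_json)
    return bool(data.get("dir_info", {}).get("editable", False))


def is_editable_install(dist_name):
    """Return True or False for an installed distribution, or None if unknown."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None  # not installed in this environment
    text = dist.read_text("direct_url.json")
    if text is None:
        return None  # no direct_url.json was recorded for this installation
    return direct_url_is_editable(text)


if __name__ == "__main__":
    # "GTM_WAE" is an assumed distribution name.
    print(is_editable_install("GTM_WAE"))
```

A result of `True` means `git pull` alone is enough; `False` or `None` suggests reinstalling with `pip install .` as described above.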
Developers should install dependencies via Poetry, which can itself be installed through Conda:
conda install poetry -c conda-forge
poetry install
If you encounter any issues with Poetry related to environment variables, run the following commands to add the export line to your ~/.bashrc file and restart the shell:
echo 'export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring' >> ~/.bashrc
exec bash
- Karina Pikalyova - Ph.D. student in the Laboratory of Chemoinformatics
- Dr. Tagir Akhmetshin - Postdoctoral researcher in the Laboratory of Chemoinformatics
- Dr. Alexey Orlov - Postdoctoral researcher in the Laboratory of Chemoinformatics
- Prof. Alexandre Varnek - Director of the Laboratory of Chemoinformatics