Code version 1

Author: Benoît BAILLIF (email: [email protected])

Objective

This folder contains the code for the Frontiers in Chemistry publication "Exploring the Use of Compound-Induced Transcriptomic Data Generated From Cell Lines to Predict Compound Activity Toward Molecular Targets".

The goal of this code is to preprocess the LINCS (CMap/L1000) and PubChem data and metadata, and to produce the figures, tables and, most importantly, the models presented in the publication.

Sources

GEO pages:
GSE70138
GSE92742
PubChem: using an available BioAssay SQLite extract, along with the corresponding R package (bioassayR) for data extraction
LINCS data portal: to find additional PubChem CIDs of profiled compounds; the links used are now outdated and can no longer be found
Broad Institute Drug Repurposing Hub: to find TUBB-active compounds that are not in PubChem

Script order

Scripts were written in Jupyter Notebook (from conda 4.8.3), with Python 3.7.6.

download_raw_data.ipynb: download the required sources
perturbagen_and_related_signatures_metadata_processing.ipynb: compile the metadata of the 2 GSE series; select compound perturbagens; find the used compounds, meaning compounds that have a 10 µM, 24 h signature in the 8 chosen core cell lines
pubchem_cid_extraction.ipynb: find all available PubChem CIDs for the compounds used in the analysis
target_data_processing.ipynb: produce the final activity matrix to be used downstream
pubchem_bioactivity_matrix_extraction.R: compute the bioactivity matrix using the bioassayR package along with the PubChem protein-only SQLite file
signature_extraction.ipynb: extract the signatures of used compounds from the gctx archives
morgan_fingerprints_and_signatures_tsne.ipynb: compute t-SNE embeddings for used compounds and signatures, to later plot the chemical and biological spaces
produce_space_plots.ipynb: produce the figures corresponding to the chemical and biological space plots
TODO models: compute random forest models and store performances in CSV files
TODO distance plots: produce quadrant plots and statistics for the modeled targets
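
The "used compounds" selection described for the metadata processing notebook can be illustrated with a minimal sketch. This is not the code from the notebook: the field names (`pert_id`, `pert_type`, `cell_id`, `dose`, `time`), the `trt_cp` label for compound perturbagens, the placeholder cell line names and the toy records are all assumptions made for illustration. It also assumes that a compound must have a matching signature in every core cell line, which is one possible reading of the criterion.

```python
from collections import defaultdict

# Hypothetical stand-ins for the 8 core cell lines (the real names are not listed here).
CORE_CELL_LINES = {"CELL_A", "CELL_B", "CELL_C"}


def make_sig(pert_id, pert_type, cell_id, dose, time):
    """Build one toy signature metadata record (field names are assumptions)."""
    return {"pert_id": pert_id, "pert_type": pert_type,
            "cell_id": cell_id, "dose": dose, "time": time}


# Toy metadata standing in for the compiled metadata of the 2 GSE series.
signatures = [
    # cpd1: 10 µM / 24 h in all three core cell lines -> kept
    make_sig("cpd1", "trt_cp", "CELL_A", "10 µM", "24 h"),
    make_sig("cpd1", "trt_cp", "CELL_B", "10 µM", "24 h"),
    make_sig("cpd1", "trt_cp", "CELL_C", "10 µM", "24 h"),
    # cpd2: missing CELL_C -> dropped
    make_sig("cpd2", "trt_cp", "CELL_A", "10 µM", "24 h"),
    make_sig("cpd2", "trt_cp", "CELL_B", "10 µM", "24 h"),
    # cpd3: CELL_B only profiled at 6 h -> dropped
    make_sig("cpd3", "trt_cp", "CELL_A", "10 µM", "24 h"),
    make_sig("cpd3", "trt_cp", "CELL_B", "10 µM", "6 h"),
    make_sig("cpd3", "trt_cp", "CELL_C", "10 µM", "24 h"),
    # sh1: not a compound perturbagen -> ignored
    make_sig("sh1", "trt_sh", "CELL_A", "10 µM", "24 h"),
]


def find_used_compounds(signatures, core_cell_lines, dose="10 µM", time="24 h"):
    """Return compounds with a signature at `dose` and `time` in every core cell line."""
    cells_by_compound = defaultdict(set)
    for sig in signatures:
        if sig["pert_type"] != "trt_cp":  # keep compound perturbagens only
            continue
        if (sig["dose"] == dose and sig["time"] == time
                and sig["cell_id"] in core_cell_lines):
            cells_by_compound[sig["pert_id"]].add(sig["cell_id"])
    required = set(core_cell_lines)
    return sorted(cpd for cpd, cells in cells_by_compound.items() if cells == required)


used = find_used_compounds(signatures, CORE_CELL_LINES)
print(used)
```

Grouping the observed cell lines per compound and comparing the resulting set with the required set keeps the "all core cell lines" condition explicit. If the intended criterion is instead "at least one core cell line", the final comparison would be replaced by a non-empty check.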
