diff --git a/Markdowns/TMT_phospho.Rmd b/Markdowns/TMT_phospho.Rmd index b569186..856f9c7 100644 --- a/Markdowns/TMT_phospho.Rmd +++ b/Markdowns/TMT_phospho.Rmd @@ -10,15 +10,16 @@ output: bibliography: bib.json --- -```{r, message=FALSE, warning=FALSE} +```{r, message=FALSE, warning=FALSE, include=FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` -### Preamble +### Load dependencies +Load the required libraries. ```{r, message=FALSE} library(camprotR) @@ -29,13 +30,47 @@ library(dplyr) library(here) ``` +### Preamble + +The study of phosphorylation by MS-proteomics typically involves enrichment of phosphopeptides using titanium dioxide, since phospho-peptides are relatively low abundance and missed without enrichment. + +Since phospho-peptides inform us about the phosphorylation status of a single amino acid, or multiple amino acides within a peptide, we don't want to summarise the phosphoproteomics data to the protein level. Rather, we want to perform a phosphosite-orientated analysis. This is significantly different to standard quantitative proteomics and requires an alternative data processing workflow, which is presented here. + +Interpretation of the changes in phosphorylation are aided by quantification of the total protein abundance, since we are usually interested in the change in the proportion of the protein which is phosphorylated. There are many ways to design a phosphoproteomics experiment. The ideal experimental design is for the samples to be labelled with TMT and for the phospho-enrichment to be performed on the pooled TMT-labelled samples, leaving some of the TMT-lablled pool for the total protein abundance quantification. This approach avoids the technical variance that arises with separate phospho-enrichment for each sample and also ensures the total and phospho samples originate from the same material. The limitation of this approach is that the phospho-enrichment input is limited to the amount of material which can be TMT labelled. + + ### Input data -We start by reading in a file containing PSM-level output from Proteome Discoverer (PD). This data comes from a published benchmark experiment where yeast peptides were spiked into human peptides at 3 known amounts to provide ground truth fold changes (see below). For more details, see [@http://zotero.org/users/5634351/items/LG3W8G4T] +We start by reading in files containing PSM-level output from Proteome Discoverer (PD). + +The total data comes from a published benchmark experiment where yeast peptides were spiked into human peptides at 3 known amounts to provide ground truth fold changes (see below). For more details, see [@http://zotero.org/users/5634351/items/LG3W8G4T] -The data we will use is available through the `Proteomics.analysis.data` package. +Alongside the published total data shown above, we also performed a yeast/human mix at different ratios, from which phosphopeptides were enriched, to simulate an experiment with changes in phoshorylation. The spike in ratios for the total and phosphopeptide TMT plexes are shown below. + + +|Tag | Total | Phopsho | Ratio | +|------|-------|---------|-------| +| 126 |1x |1x |1 | +| 127N |1x |1x |1 | +| 127C |1x |1x |1 | +| 128N |1x |1x |1 | +| 128C |2x |6x |6 | +| 129N |2x |6x |6 | +| 129C |2x |6x |6 | +| 130N |6x |6x |1 | +| 130C |6x |6x |1 | +| 131 |6x |6x |1 | + +From the above, we can see that if we compare tags 126-128N (total 1x; phospho 1x) to tags 128C-129C (total 2x; phospho 6x), we will have a 2-fold increase total protein, but an 6-fold increase in phosphorylation. Whereas, comparing tags 126-128N to 130N-131, we have an identical 6-fold increase in total protein and phosphoprotein. + +Note, this experimental design is not quite the same as a phospho-enrichment from the same TMT plex used for the total proteomics, since we needed to spike in at different ratios for the two TMT plexes. Nonetheless, given each sample was simply a defined a spike in between two sample, we expect this data to closely simulate an experimental design with phospho-enrichment from the total TMT plex. + + +## Processing and QC of Phosphoproteomics data + +The data from the phospho-enrichment of the phospho TMT plex is unpublished, but both the total and phospho data is is available through the `Proteomics.analysis.data` package. ```{r} psm_total_data <- read.delim( @@ -47,7 +82,9 @@ psm_phospho_data <- read.delim( package = "Proteomics.analysis.data")) ``` -The first step is to remove contaminant proteins. These were defined using the cRAP database. Below, we parse the cRAP fasta to extract the IDs for the cRAP proteins, in both 'cRAP' format and Uniprot IDs for these proteins. +### Parse PeptideGroups.txt files +To simply the process of reading in the data and performing initial filtering, we will use `camprotR::parse_features`. This function will read in the data and remove contaminant proteins and features without quantification data. Contaminant proteins were defined using the [cRAP database](https://www.thegpm.org/crap/) and provided to PD. We need to obtain their accessions and provide these to `camprotR::parse_features`. Below, we parse the cRAP FASTA to extract the IDs for the cRAP proteins, in both 'cRAP' format and Uniprot IDs for these proteins. + ```{r} crap_fasta_inf <- system.file( @@ -116,13 +153,18 @@ We can also add the sequence around the phosphosite, which can be useful to e.g psm_phospho_data_parsed <- add_site_sequence(psm_phospho_data_parsed, protein_fasta_inf) ``` -Finally, we need to add the positions of the peptide for the total peptides, so we can intersect the positions of the phosphosite and total peptides for the statistical testing. We can use `add_peptide_positions` for this, though note that this code chunk takes a long time to run (the function needs to be optimised). +Finally, we need to add the positions of the peptide for the total peptides, so we can intersect the positions of the phosphosite and total peptides for the statistical testing. We can use `add_peptide_positions` for this, though note that this code chunk will take around 1 minute time to run - the function could likely be further optimised. ```{r} + psm_total_data_flt <- add_peptide_positions(psm_total_data_flt, proteome_fasta=protein_fasta_inf) ``` -We now store the filtered PSM data for total and phospho in MSnSets, the standard data object for proteomics in R. See the `vignette("msnset", package="camprotR")` for more details. +### Creating an `MSnSet` + +We now store the filtered PSM data for total and phospho peptides in MSnSets, the standard data object for proteomics in R. See the `vignette("msnset", package="camprotR")` for more details.For this, we need to generate a matrix of PSM-level abundances, where rows are features (PSMs), and a separate data.frame with information about the features, e.g master protein assignment. See the `vignette("msnset", package="camprotR")` for more details about MSnSets. + +We need to create two separate MSnSets for the total and phospho peptides. In both cases, we will manually define the experimental conditions. ```{r} # Abundance columns for TMT PD-output start with Abundance @@ -135,9 +177,7 @@ psm.total.f <- psm_total_data_flt[, setdiff(colnames(psm_total_data_flt), abunda # update the column names to remove the 'Abundance.` prefix colnames(psm.total.e) <- gsub('Abundance.', '', colnames(psm.total.e)) -# we don't have 'phenotype' data to add so we just define the -# 'expression' data and 'feature' data - +# Manually define the 'phenotype' data (experimental details) psm.total.p <- data.frame(spike=rep(factor(c('x1', 'x2', 'x6')), c(4,3,3)), condition=rep(1:3, c(4,3,3)), row.names=colnames(psm.total.e)) @@ -157,9 +197,7 @@ psm.phospho.f <- psm_phospho_data_parsed[, setdiff(colnames(psm_phospho_data_par # update the column names to remove the 'Abundance.` prefix colnames(psm.phospho.e) <- gsub('Abundance.', '', colnames(psm.phospho.e)) -# we don't have 'phenotype' data to add so we just define the -# 'expression' data and 'feature' data - +# Manually define the 'phenotype' data (experimental details) psm.phospho.p <- data.frame(spike=rep(factor(c('x1', 'x6')), c(4,6)), condition=rep(1:3, c(4,3,3)), row.names=colnames(psm.phospho.e)) @@ -168,10 +206,14 @@ psm.phospho <- MSnbase::MSnSet(exprs = psm.phospho.e, fData = psm.phospho.f, pDa ``` +To simply the downstream plotting and filtering, we create a list containg the two MSnSets. + ```{r} psm_res <- list('Total'=psm.total, 'Phospho'=psm.phospho) ``` + +Below, we assess the quantification distributions. There are no clear outlier samples or any other concerns. ```{r} for(x in names(psm_res)){ @@ -190,30 +232,28 @@ for(x in names(psm_res)){ } ``` -Note that where the signal:noise > 10, there are far fewer missing values. -```{r} -for(x in psm_res){ - plotNA(x) -} -``` +### Removing low quality PSMs +We want to remove low Signal:Noise (S:N) PSMs, since the quantification values will be less accurate and there will be more missing values. We can inspect the relationship between S:N and missing values using the `plot_missing_SN` function. + +We use Note that where the signal:noise > 10, there are far fewer missing values. ```{r, fig.height=5, fig.width=7, fig.fullwidth=TRUE, fig.cap="Missing values per PSM, in relation to the signal:noise ratio"} for(x in names(psm_res)){ - p <- psm_res[[x]] %>% plot_missing_SN(bins=50) + + p <- psm_res[[x]] %>% plot_missing_SN(bins=40) + ggtitle(x) print(p) } ``` -We can also look into this relationship at the tag level using `plot_missing_SN_per_sample`. In this case, there is no tag which appears to have a high proportion of missing values when signal:noise > 5. If there were, this may warrant further exploration, e.g was one of the sample preparations inadequate such that fewer peptides were labeled? +We can also look into this relationship at the tag level using `plot_missing_SN_per_sample`. In this case, there is no tag which appears to have a high proportion of missing values when signal:noise > 10. If there were, this may warrant further exploration, e.g was one of the sample preparations inadequate such that fewer peptides were labeled? ```{r} for(x in names(psm_res)){ - p <- psm_res[[x]] %>% plot_missing_SN_per_sample(bins=50) + + p <- psm_res[[x]] %>% plot_missing_SN_per_sample(bins=40) + ggtitle(x) print(p) } @@ -221,15 +261,16 @@ for(x in names(psm_res)){ ``` -Based on the above, we will filter the PSMs to only retain those with S:N > 10 using `filter_TMT_PSMs`. Using the same function, we will also remove PSMs with interference/co-isolation >10%. +Based on the above, we will filter the PSMs to only retain those with S:N > 10 using `filter_TMT_PSMs`. Using the same function, we will also remove PSMs with interference/co-isolation >10%, since the mixed species design means our abundances are particularly vulnerable to issues with co-isolation. ```{r} psm_filt_sn_int <- psm_res %>% lapply(function(x){ filter_TMT_PSMs(x, inter_thresh = 10, sn_thresh = 5)}) ``` -Since we have merged together material from two species, we will perform a couple more -Finally, we will also filter our PSMs using the Delta.Score, e.g the difference between the score for the top two hits. Where this is + +Below, we will also filter our PSMs using the Delta.Score, e.g the difference between the score for the top two hits. Where this is the difference between the top score in the MS2 spectrum matching and the second top score. For a mixed species sample, it's likely some peptides will be difficult to assign to the protein from the correct species, so we are being extra cautious here. For a typical phospho TMT proteomics experiment, it would likely be reasonable to use a more relaxed filter above for the interference (e.g 50%) and to not filter by Delta Score at all. + ```{r} psm_filt_sn_int_delta <- psm_filt_sn_int %>% lapply(function(x){ @@ -242,8 +283,8 @@ psm_filt_sn_int_delta <- psm_filt_sn_int %>% lapply(function(x){ ``` -For a typical phospho TMT proteomics experiment, it would likely be reasonable to use a more relaxed filter for the interference (e.g 50%) and to not filter by Delta Score at all. +Finally, we summarise our phosphopeptide quantification values, using the combination of the protein ID and the PTM position(s) as the key. This means that if two peptides cover the same phospho site(s) due to missed cleavage, they will be summarised into a single phospho_site quantification. ```{r} phospho_sites <- psm_filt_sn_int_delta$Phospho %>% MSnbase::combineFeatures( @@ -256,14 +297,15 @@ print(nrow(phospho_sites)) ``` Note that our filtering has removed a lot of PSMs, especially for the phosphosites. We've gone from: -- 5582 PSMs in the input file -- 5335 PSMs after removing contaminants and PSMs without a unique master protein -- 4301 PSMs after removing those without a phosphorylation site -- 2852 PSMs after removing those with a phosphoRS score < 75 -- 1524 PSMs after removing those with interference > 10 or Signal:Noise < 10 -- 1008 PSMs after removing those with Delta Score < 0.2. -- 666 phosphosites after combining PSMs for the same phosphosite +- `r nrow(psm_phospho_data)` PSMs in the input file +- `r nrow(psm_total_data_flt)` PSMs after removing contaminants and PSMs without a unique master protein +- `r nrow(psm_phospho_data_parsed)` PSMs after removing those without a phosphorylation site, a phosphoRS score < 75, interference > 10 %, Signal:Noise < 10 or Delta Score < 0.2. +- `r nrow(phospho_sites)` after combining PSMs for the same phosphosite +As stated previously, the stringent interference threshold and the use of a delta score threshold are because of the mixed species design. In a more typical experiment, a 50% interference threshold may be suitable and a Delta score threshold would likely not be required. + + +We also need to summarise our total peptide quantification data. As we shall see in the statistical testing notebook, we shall perform further summarisation for the total peptides, but this is better performed later, for reasons which will become clear. ```{r} total <- psm_filt_sn_int_delta$Total %>% MSnbase::combineFeatures( @@ -272,7 +314,7 @@ total <- psm_filt_sn_int_delta$Total %>% ``` -We save the object to disk so we can read it back into memory when we need it +Finally, we save the object to disk so we can read it back into memory when we need it ```{r, eval=FALSE} saveRDS(psm_filt_sn_int, here('results/benchmark_tmt_phospho_psm_filt.rds')) @@ -280,3 +322,7 @@ saveRDS(total, here('results/total.rds')) saveRDS(phospho_sites, here('results/phospho_sites.rds')) ``` + + +For an example how to perform statistical testing for changes in phosphorylation, see +[Intersecting phosphosites and total peptides and statistical testing](https://mrctoxbioinformatics.github.io/Proteomics_data_analysis/Markdowns/TMT_phospho_stats.html) diff --git a/Markdowns/TMT_phospho.html b/Markdowns/TMT_phospho.html index 5c04a4c..f0e414e 100644 --- a/Markdowns/TMT_phospho.html +++ b/Markdowns/TMT_phospho.html @@ -11,7 +11,7 @@ - + Phosphoproteomics using Tandem Mass Tags @@ -230,6 +230,8 @@ // select all R code blocks var rCodeBlocks = $('pre.r, pre.python, pre.bash, pre.sql, pre.cpp, pre.stan, pre.julia, pre.foldable'); rCodeBlocks.each(function() { + // skip if the block has fold-none class + if ($(this).hasClass('fold-none')) return; // create a collapsable div to wrap the code in var div = $('
'); @@ -240,7 +242,7 @@ $(this).detach().appendTo(div); // add a show code button right above - var showCodeText = $('' + (showThis ? 'Hide' : 'Code') + ''); + var showCodeText = $('' + (showThis ? 'Hide' : 'Show') + ''); var showCodeButton = $(''); showCodeButton.append(showCodeText); showCodeButton @@ -266,7 +268,7 @@ // * Change text // * add a class for intermediate states styling div.on('hide.bs.collapse', function () { - showCodeText.text('Code'); + showCodeText.text('Show'); showCodeButton.addClass('btn-collapsing'); }); div.on('hidden.bs.collapse', function () { @@ -396,23 +398,17 @@ border: 1px solid #ddd; border-radius: 4px; } -.tabset-dropdown > .nav-tabs > li.active:before { -content: "๎‰™"; +.tabset-dropdown > .nav-tabs > li.active:before, .tabset-dropdown > .nav-tabs.nav-tabs-open:before { +content: "\e259"; font-family: 'Glyphicons Halflings'; display: inline-block; padding: 10px; border-right: 1px solid #ddd; } .tabset-dropdown > .nav-tabs.nav-tabs-open > li.active:before { -content: "๎‰˜"; -border: none; -} -.tabset-dropdown > .nav-tabs.nav-tabs-open:before { -content: "๎‰™"; +content: "\e258"; font-family: 'Glyphicons Halflings'; -display: inline-block; -padding: 10px; -border-right: 1px solid #ddd; +border: none; } .tabset-dropdown > .nav-tabs > li.active { display: block; @@ -469,40 +465,150 @@

Phosphoproteomics using Tandem Mass

QC PSM-level quantification, filtering and summarisation to protein-level abundance

Tom Smith

-

2023-02-06

+

2024-07-01

-
knitr::opts_chunk$set(
-  collapse = TRUE,
-  comment = "#>"
-)
-
-

Preamble

+
+

Load dependencies

+

Load the required libraries.

library(camprotR)
-#> Warning in fun(libname, pkgname): mzR has been built against a different Rcpp version (1.0.7)
-#> than is installed on your system (1.0.8.3). This might lead to errors
-#> when loading mzR. If you encounter such issues, please send a report,
-#> including the output of sessionInfo() to the Bioc support forum at 
-#> https://support.bioconductor.org/. For details see also
-#> https://github.com/sneumann/mzR/wiki/mzR-Rcpp-compiler-linker-issue.
 library(MSnbase)
 library(ggplot2)
 library(tidyr)
 library(dplyr)
 library(here)
+
+

Preamble

+

The study of phosphorylation by MS-proteomics typically involves +enrichment of phosphopeptides using titanium dioxide, since +phospho-peptides are relatively low abundance and missed without +enrichment.

+

Since phospho-peptides inform us about the phosphorylation status of +a single amino acid, or multiple amino acides within a peptide, we donโ€™t +want to summarise the phosphoproteomics data to the protein level. +Rather, we want to perform a phosphosite-orientated analysis. This is +significantly different to standard quantitative proteomics and requires +an alternative data processing workflow, which is presented here.

+

Interpretation of the changes in phosphorylation are aided by +quantification of the total protein abundance, since we are usually +interested in the change in the proportion of the protein which is +phosphorylated. There are many ways to design a phosphoproteomics +experiment. The ideal experimental design is for the samples to be +labelled with TMT and for the phospho-enrichment to be performed on the +pooled TMT-labelled samples, leaving some of the TMT-lablled pool for +the total protein abundance quantification. This approach avoids the +technical variance that arises with separate phospho-enrichment for each +sample and also ensures the total and phospho samples originate from the +same material. The limitation of this approach is that the +phospho-enrichment input is limited to the amount of material which can +be TMT labelled.

+

Input data

-

We start by reading in a file containing PSM-level output from -Proteome Discoverer (PD). This data comes from a published benchmark -experiment where yeast peptides were spiked into human peptides at 3 -known amounts to provide ground truth fold changes (see below). For more -details, see (Smith et al. 2022)

+

We start by reading in files containing PSM-level output from +Proteome Discoverer (PD).

+

The total data comes from a published benchmark experiment where +yeast peptides were spiked into human peptides at 3 known amounts to +provide ground truth fold changes (see below). For more details, see +(Smith et al. 2022)

-

The data we will use is available through the -Proteomics.analysis.data package.

+

Alongside the published total data shown above, we also performed a +yeast/human mix at different ratios, from which phosphopeptides were +enriched, to simulate an experiment with changes in phoshorylation. The +spike in ratios for the total and phosphopeptide TMT plexes are shown +below.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
TagTotalPhopshoRatio
1261x1x1
127N1x1x1
127C1x1x1
128N1x1x1
128C2x6x6
129N2x6x6
129C2x6x6
130N6x6x1
130C6x6x1
1316x6x1
+

From the above, we can see that if we compare tags 126-128N (total +1x; phospho 1x) to tags 128C-129C (total 2x; phospho 6x), we will have a +2-fold increase total protein, but an 6-fold increase in +phosphorylation. Whereas, comparing tags 126-128N to 130N-131, we have +an identical 6-fold increase in total protein and phosphoprotein.

+

Note, this experimental design is not quite the same as a +phospho-enrichment from the same TMT plex used for the total proteomics, +since we needed to spike in at different ratios for the two TMT plexes. +Nonetheless, given each sample was simply a defined a spike in between +two sample, we expect this data to closely simulate an experimental +design with phospho-enrichment from the total TMT plex.

+
+
+

Processing and QC of Phosphoproteomics data

+

The data from the phospho-enrichment of the phospho TMT plex is +unpublished, but both the total and phospho data is is available through +the Proteomics.analysis.data package.

psm_total_data <- read.delim(
   system.file("extdata", 'benchmark_phosphoTMT', 'benchmark_total_TMT_PSMs.txt.gz',
               package = "Proteomics.analysis.data"))
@@ -510,10 +616,17 @@ 

Input data

psm_phospho_data <- read.delim( system.file("extdata", 'benchmark_phosphoTMT', 'benchmark_phospho_TMT_PSMs.txt.gz', package = "Proteomics.analysis.data"))
-

The first step is to remove contaminant proteins. These were defined -using the cRAP database. Below, we parse the cRAP fasta to extract the -IDs for the cRAP proteins, in both โ€˜cRAPโ€™ format and Uniprot IDs for -these proteins.

+
+

Parse PeptideGroups.txt files

+

To simply the process of reading in the data and performing initial +filtering, we will use camprotR::parse_features. This +function will read in the data and remove contaminant proteins and +features without quantification data. Contaminant proteins were defined +using the cRAP database and +provided to PD. We need to obtain their accessions and provide these to +camprotR::parse_features. Below, we parse the cRAP FASTA to +extract the IDs for the cRAP proteins, in both โ€˜cRAPโ€™ format and Uniprot +IDs for these proteins.

crap_fasta_inf <- system.file(
   "extdata", "cRAP_20190401.fasta.gz", 
   package = "Proteomics.analysis.data"
@@ -545,8 +658,8 @@ 

Input data

#> 100728 features found from 10553 master proteins => cRAP features removed #> 100145 features found from 10520 master proteins => associated cRAP features removed #> 100119 features found from 10519 master proteins => features without a master protein removed -#> 95054 features found from 9575 master proteins => features with non-unique master proteins removed - +#> 95054 features found from 9575 master proteins => features with non-unique master proteins removed
+

 psm_phospho_data_flt <- parse_features(
   psm_phospho_data, 
   crap_proteins = crap.accessions, 
@@ -581,8 +694,8 @@ 

Input data

#> monoPTM passing filter: 771 #> biPTM passing filter: 1331 #> multiPTM passing filter: 750 -#> Too many isoforms: 5 - +#> Too many isoforms: 5
+

 # The filtering doesn't actually take place until this step
 psm_phospho_data_parsed <- psm_phospho_data_parsed %>%
   filter(filtered_score!="")
@@ -602,12 +715,24 @@

Input data

peptides, so we can intersect the positions of the phosphosite and total peptides for the statistical testing. We can use add_peptide_positions for this, though note that this code -chunk takes a long time to run (the function needs to be optimised).

-
psm_total_data_flt <- add_peptide_positions(psm_total_data_flt, proteome_fasta=protein_fasta_inf)
-

We now store the filtered PSM data for total and phospho in MSnSets, -the standard data object for proteomics in R. See the -vignette("msnset", package="camprotR") for more -details.

+chunk will take around 1 minute time to run - the function could likely +be further optimised.

+

+psm_total_data_flt <- add_peptide_positions(psm_total_data_flt, proteome_fasta=protein_fasta_inf)
+
+
+

Creating an MSnSet

+

We now store the filtered PSM data for total and phospho peptides in +MSnSets, the standard data object for proteomics in R. See the +vignette("msnset", package="camprotR") for more details.For +this, we need to generate a matrix of PSM-level abundances, where rows +are features (PSMs), and a separate data.frame with information about +the features, e.g master protein assignment. See the +vignette("msnset", package="camprotR") for more details +about MSnSets.

+

We need to create two separate MSnSets for the total and phospho +peptides. In both cases, we will manually define the experimental +conditions.

# Abundance columns for TMT PD-output start with Abundance 
 abundance_total_cols <- colnames(psm_total_data_flt)[
   grepl('Abundance.', colnames(psm_total_data_flt))]
@@ -618,9 +743,7 @@ 

Input data

# update the column names to remove the 'Abundance.` prefix colnames(psm.total.e) <- gsub('Abundance.', '', colnames(psm.total.e)) -# we don't have 'phenotype' data to add so we just define the -# 'expression' data and 'feature' data - +# Manually define the 'phenotype' data (experimental details) psm.total.p <- data.frame(spike=rep(factor(c('x1', 'x2', 'x6')), c(4,3,3)), condition=rep(1:3, c(4,3,3)), row.names=colnames(psm.total.e)) @@ -648,15 +771,17 @@

Input data

# update the column names to remove the 'Abundance.` prefix colnames(psm.phospho.e) <- gsub('Abundance.', '', colnames(psm.phospho.e)) -# we don't have 'phenotype' data to add so we just define the -# 'expression' data and 'feature' data - +# Manually define the 'phenotype' data (experimental details) psm.phospho.p <- data.frame(spike=rep(factor(c('x1', 'x6')), c(4,6)), condition=rep(1:3, c(4,3,3)), row.names=colnames(psm.phospho.e)) psm.phospho <- MSnbase::MSnSet(exprs = psm.phospho.e, fData = psm.phospho.f, pData=psm.phospho.p)
+

To simply the downstream plotting and filtering, we create a list +containg the two MSnSets.

psm_res <- list('Total'=psm.total, 'Phospho'=psm.phospho)
+

Below, we assess the quantification distributions. There are no clear +outlier samples or any other concerns.

for(x in names(psm_res)){
   
   p <- psm_res[[x]] %>% log(base=2) %>% plot_quant() +
@@ -672,35 +797,42 @@ 

Input data

print(p) } -#> Warning: Removed 7078 rows containing non-finite values (`stat_boxplot()`).
+#> Warning: Removed 7078 rows containing non-finite outside the scale range +#> (`stat_boxplot()`).

-
#> Warning: Removed 7078 rows containing non-finite values (`stat_density()`).
-

-
#> Warning: Removed 41 rows containing non-finite values (`stat_boxplot()`).
+
#> Warning: Removed 7078 rows containing non-finite outside the scale range
+#> (`stat_density()`).
+

+
#> Warning: Removed 41 rows containing non-finite outside the scale range
+#> (`stat_boxplot()`).

-
#> Warning: Removed 41 rows containing non-finite values (`stat_density()`).
-

-

Note that where the signal:noise > 10, there are far fewer missing -values.

-
for(x in psm_res){
-  plotNA(x)
-}
-

+
#> Warning: Removed 41 rows containing non-finite outside the scale range
+#> (`stat_density()`).
+

+
+
+

Removing low quality PSMs

+

We want to remove low Signal:Noise (S:N) PSMs, since the +quantification values will be less accurate and there will be more +missing values. We can inspect the relationship between S:N and missing +values using the plot_missing_SN function.

+

We use Note that where the signal:noise > 10, there are far fewer +missing values.

for(x in names(psm_res)){
   
-    p <- psm_res[[x]] %>% plot_missing_SN(bins=50) +
+    p <- psm_res[[x]] %>% plot_missing_SN(bins=40) +
       ggtitle(x)
     print(p)
 
 }
-Missing values per PSM, in relation to the signal:noise ratio +Missing values per PSM, in relation to the signal:noise ratio

Missing values per PSM, in relation to the signal:noise ratio

-Missing values per PSM, in relation to the signal:noise ratio +Missing values per PSM, in relation to the signal:noise ratio

Missing values per PSM, in relation to the signal:noise ratio

@@ -708,20 +840,22 @@

Input data

We can also look into this relationship at the tag level using plot_missing_SN_per_sample. In this case, there is no tag which appears to have a high proportion of missing values when -signal:noise > 5. If there were, this may warrant further +signal:noise > 10. If there were, this may warrant further exploration, e.g was one of the sample preparations inadequate such that fewer peptides were labeled?

for(x in names(psm_res)){
   
 
-    p <- psm_res[[x]] %>% plot_missing_SN_per_sample(bins=50) +
+    p <- psm_res[[x]] %>% plot_missing_SN_per_sample(bins=40) +
       ggtitle(x)
     print(p)
 }
-

+

Based on the above, we will filter the PSMs to only retain those with S:N > 10 using filter_TMT_PSMs. Using the same function, -we will also remove PSMs with interference/co-isolation >10%.

+we will also remove PSMs with interference/co-isolation >10%, since +the mixed species design means our abundances are particularly +vulnerable to issues with co-isolation.

psm_filt_sn_int <- psm_res %>% lapply(function(x){
   filter_TMT_PSMs(x, inter_thresh = 10, sn_thresh = 5)})
 #> Filtering PSMs...
@@ -732,10 +866,15 @@ 

Input data

#> 2850 features found from 953 master proteins => No quant filtering #> 1532 features found from 668 master proteins => Co-isolation filtering #> 1524 features found from 667 master proteins => S:N ratio filtering
-

Since we have merged together material from two species, we will -perform a couple more

-

Finally, we will also filter our PSMs using the Delta.Score, e.g the -difference between the score for the top two hits. Where this is

+

Below, we will also filter our PSMs using the Delta.Score, e.g the +difference between the score for the top two hits. Where this is the +difference between the top score in the MS2 spectrum matching and the +second top score. For a mixed species sample, itโ€™s likely some peptides +will be difficult to assign to the protein from the correct species, so +we are being extra cautious here. For a typical phospho TMT proteomics +experiment, it would likely be reasonable to use a more relaxed filter +above for the interference (e.g 50%) and to not filter by Delta Score at +all.


 psm_filt_sn_int_delta <- psm_filt_sn_int %>% lapply(function(x){
   flt <- x[fData(x)$Delta.Score>0.2,]
@@ -744,9 +883,11 @@ 

Input data

}) #> 45488 features found from 7957 master proteins => Delta Score filtering #> 1008 features found from 471 master proteins => Delta Score filtering
-

For a typical phospho TMT proteomics experiment, it would likely be -reasonable to use a more relaxed filter for the interference (e.g 50%) -and to not filter by Delta Score at all.

+

Finally, we summarise our phosphopeptide quantification values, using +the combination of the protein ID and the PTM position(s) as the key. +This means that if two peptides cover the same phospho site(s) due to +missed cleavage, they will be summarised into a single phospho_site +quantification.

phospho_sites <- psm_filt_sn_int_delta$Phospho %>%
     MSnbase::combineFeatures(
       groupBy = paste(fData(psm_filt_sn_int_delta$Phospho)$Master.Protein.Accessions,
@@ -757,13 +898,20 @@ 

Input data

print(nrow(phospho_sites)) #> [1] 666

Note that our filtering has removed a lot of PSMs, especially for the -phosphosites. Weโ€™ve gone from: - 5582 PSMs in the input file - 5335 PSMs -after removing contaminants and PSMs without a unique master protein - -4301 PSMs after removing those without a phosphorylation site - 2852 -PSMs after removing those with a phosphoRS score < 75 - 1524 PSMs -after removing those with interference > 10 or Signal:Noise < 10 - -1008 PSMs after removing those with Delta Score < 0.2. - 666 -phosphosites after combining PSMs for the same phosphosite

+phosphosites. Weโ€™ve gone from: - 5582 PSMs in the input file - 95054 +PSMs after removing contaminants and PSMs without a unique master +protein - 2852 PSMs after removing those without a phosphorylation site, +a phosphoRS score < 75, interference > 10 %, Signal:Noise < 10 +or Delta Score < 0.2. - 666 after combining PSMs for the same +phosphosite

+

As stated previously, the stringent interference threshold and the +use of a delta score threshold are because of the mixed species design. +In a more typical experiment, a 50% interference threshold may be +suitable and a Delta score threshold would likely not be required.

+

We also need to summarise our total peptide quantification data. As +we shall see in the statistical testing notebook, we shall perform +further summarisation for the total peptides, but this is better +performed later, for reasons which will become clear.

total <- psm_filt_sn_int_delta$Total %>%
     MSnbase::combineFeatures(
       groupBy = fData(psm_filt_sn_int_delta$Total)$Sequence,
@@ -771,12 +919,15 @@ 

Input data

#> Your data contains missing values. Please read the relevant section in #> the combineFeatures manual page for details on the effects of missing #> values on data aggregation.
-

We save the object to disk so we can read it back into memory when we -need it

+

Finally, we save the object to disk so we can read it back into +memory when we need it

saveRDS(psm_filt_sn_int, here('results/benchmark_tmt_phospho_psm_filt.rds'))
 saveRDS(total, here('results/total.rds'))
 saveRDS(phospho_sites, here('results/phospho_sites.rds'))
 
+

For an example how to perform statistical testing for changes in +phosphorylation, see Intersecting +phosphosites and total peptides and statistical testing

Smith, Tom S., Anna Andrejeva, Josie Christopher, Oliver M. Crook, @@ -793,6 +944,7 @@

Input data

+
diff --git a/Markdowns/TMT_phospho_stats.Rmd b/Markdowns/TMT_phospho_stats.Rmd index cca8786..141aa9b 100644 --- a/Markdowns/TMT_phospho_stats.Rmd +++ b/Markdowns/TMT_phospho_stats.Rmd @@ -9,7 +9,12 @@ output: pdf_document: default bibliography: bib.json --- -```{r, message=FALSE, warning=FALSE} + +### Load dependencies + +Load the required libraries. + +```{r, message=FALSE, warning=FALSE, include=FALSE} library(camprotR) library(MSnbase) library(ggplot2) @@ -21,13 +26,23 @@ library(uniprotREST) ``` +### Preamble + +The correct approach for statistical testing for changes in phosphorylation will depend on the details of the experimental design. Here, we have quantified phospho and unmodified peptides, so we can identify changes in phosphorylation which are not explained by changes in the unmodified protein, e.g changes in the proportion of the protein that is phosphorylated. + +To acheive this, we will jointly model the phospho and unmodified peptides which overlap each phosphosite, using a linear model with an interaction term. + +### Input data + +We start by reading in the MSnSets created in the previous notebook [QC PSM-level quantification, filtering and summarisation to protein-level abundance](https://mrctoxbioinformatics.github.io/Proteomics_data_analysis/Markdowns/TMT_phospho.html) + ```{r} total <- readRDS(here('results/total.rds')) phospho_sites <- readRDS(here('results/phospho_sites.rds')) ``` -Below, we identify the total peptides that overlap each phosphosite. Note that each phosphosite may intersect multiple total peptides (due to missed cleavages) and each total peptide may intersect multiple phosphosites (due to multiple phosphosites per peptide). The code chunk takes a while to run.. +Below, we identify the total peptides that overlap each phosphosite. Note that each phosphosite may intersect multiple total peptides (due to missed cleavages) and each total peptide may intersect multiple phosphosites (due to multiple phosphosites per peptide). ```{r} phospho_f <- phospho_sites %>% fData() %>% tibble::rownames_to_column('ID') total_f <- total %>% fData() %>% tibble::rownames_to_column('ID') @@ -85,6 +100,7 @@ colnames(all_combined_exprs) <- c(paste0(colnames(phospho_sites), '_phospho'), paste0(colnames(total), '_total')) ``` + Inspect the combined quantification matrix. ```{r} head(all_combined_exprs) # @@ -108,19 +124,42 @@ pData(all_combined_res) ``` Below, we subset to the first 2 'conditions', where condition 1 = tags 1-4 (x1 phospho; x1 total) and condition 2 = tags 5-7 (x6 phospho; x2 total). -The ground truth for the difference between the first 4 tags and the next 3 is therefore a 3-fold increase in phosphorylation for the yeast proteins and reduced phosphorylation for the human proteins. For the human proteins, given the amounts of spike in yeast and balancing human proteins to make up the labelled material, we expect a 30% drop in phosphorylation. +The ground truth for the difference between condition 2 and condition 1 is therefore a 3-fold increase in phosphorylation for the yeast proteins and reduced phosphorylation for the human proteins. For the human proteins, given the amounts of spike in yeast and balancing human proteins to make up the labelled material, we expect a 30% drop in phosphorylation. ```{r} pairwise_comparison_combined_res <- all_combined_res[,pData(all_combined_res)$condition %in% 1:2] pData(pairwise_comparison_combined_res) ``` -Below, we perform the limma analysis -```{r} -dat <- exprs(pairwise_comparison_combined_res) %>% log(base=2) +### Run limma + + +Below, we perform the limma analysis. First though, some clarification on the model we're using. + +We have two experimental variables: +- type: phospho or total +- condition: 1x or 2x spike in +```{r} type <- factor(pData(pairwise_comparison_combined_res)$type, level=c('total', 'phospho')) condition <- factor(pData(pairwise_comparison_combined_res)$condition, levels=1:2) +print(paste(type, condition)) +``` + +When we use an model with the terms type, condition and type:condition, the interaction term captures the difference between the observed data and the simple additive model when the abundance is dependent upon just the separate effect of the type and condition variables and there is no combinatorial effect. +```{r} +# model without interaction term +print(model.matrix(~type+condition)) + +# model with an interaction term +print(model.matrix(~type+condition+type:condition)) + +``` +Below, we run the linear modeling. For more details about `limma`, see the +[Differential abundance testing for TMT proteomics]('~/git_repos/bioinf_training/Proteomics.data.analysis/Markdowns/Stats_diff_abundance_TMT.Rmd') notebook. + +```{r} +dat <- exprs(pairwise_comparison_combined_res) %>% log(base=2) study.design <- model.matrix(~type*condition) fit <- lmFit(dat, study.design) @@ -132,7 +171,7 @@ limma.results$sigma <- fit$sigma ``` -We would like to add information about the species to make sure the fold-changes are in the expected direction +We would like to add information about the species to make sure the fold-changes are in the expected direction. We'll use the `uniprotREST` package to query ```{r} limma.results$protein <- rownames(limma.results) %>% strsplit(split='_') %>% @@ -147,6 +186,7 @@ species_res <- uniprot_map( "Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)"="S.cerevisiae", "Homo sapiens (Human)"="H.sapiens")) ``` +Below, we plot the limma results using a volcano plot, with two panels, one for each species. ```{r} limma.results %>% @@ -160,4 +200,8 @@ limma.results %>% ylab('-log10(p-value)') + scale_fill_manual(values=c('grey90', get_cat_palette(2)[2]), name='FDR < 1%') ``` -OK, so the direction of change is as expected (reduced for human, increased for yeast), but only a subset of the fold-changes reach a 1% FDR threshold for significance. + + +The direction of change is as expected (reduced for human, increased for yeast), but only a subset of the fold-changes reach a 1% FDR threshold for significance. The fold-changes are appoximately what we would expect too: human phosphosites should be reduced by 30% = `r round(log2(0.7), 2)` on a log2 scale and the yeast phosphosites should be increased by 3-fold = `r round(log2(3), 2)` on a log2 scale. + + diff --git a/Markdowns/TMT_phospho_stats.html b/Markdowns/TMT_phospho_stats.html index 8254b89..eea907f 100644 --- a/Markdowns/TMT_phospho_stats.html +++ b/Markdowns/TMT_phospho_stats.html @@ -11,7 +11,7 @@ - + Phosphoproteomics using Tandem Mass Tags @@ -230,6 +230,8 @@ // select all R code blocks var rCodeBlocks = $('pre.r, pre.python, pre.bash, pre.sql, pre.cpp, pre.stan, pre.julia, pre.foldable'); rCodeBlocks.each(function() { + // skip if the block has fold-none class + if ($(this).hasClass('fold-none')) return; // create a collapsable div to wrap the code in var div = $('
'); @@ -240,7 +242,7 @@ $(this).detach().appendTo(div); // add a show code button right above - var showCodeText = $('' + (showThis ? 'Hide' : 'Code') + ''); + var showCodeText = $('' + (showThis ? 'Hide' : 'Show') + ''); var showCodeButton = $(''); showCodeButton.append(showCodeText); showCodeButton @@ -266,7 +268,7 @@ // * Change text // * add a class for intermediate states styling div.on('hide.bs.collapse', function () { - showCodeText.text('Code'); + showCodeText.text('Show'); showCodeButton.addClass('btn-collapsing'); }); div.on('hidden.bs.collapse', function () { @@ -374,23 +376,17 @@ border: 1px solid #ddd; border-radius: 4px; } -.tabset-dropdown > .nav-tabs > li.active:before { -content: "๎‰™"; +.tabset-dropdown > .nav-tabs > li.active:before, .tabset-dropdown > .nav-tabs.nav-tabs-open:before { +content: "\e259"; font-family: 'Glyphicons Halflings'; display: inline-block; padding: 10px; border-right: 1px solid #ddd; } .tabset-dropdown > .nav-tabs.nav-tabs-open > li.active:before { -content: "๎‰˜"; -border: none; -} -.tabset-dropdown > .nav-tabs.nav-tabs-open:before { -content: "๎‰™"; +content: "\e258"; font-family: 'Glyphicons Halflings'; -display: inline-block; -padding: 10px; -border-right: 1px solid #ddd; +border: none; } .tabset-dropdown > .nav-tabs > li.active { display: block; @@ -447,26 +443,39 @@

Phosphoproteomics using Tandem Mass

Intersecting phosphosites and total peptides and statistical testing

Tom Smith

-

2023-02-06

+

2024-07-02

-
library(camprotR)
-library(MSnbase)
-library(ggplot2)
-library(tidyr)
-library(dplyr)
-library(here)
-library(limma)
-library(uniprotREST)
+
+

Load dependencies

+

Load the required libraries.

+
+
+

Preamble

+

The correct approach for statistical testing for changes in +phosphorylation will depend on the details of the experimental design. +Here, we have quantified phospho and unmodified peptides, so we can +identify changes in phosphorylation which are not explained by changes +in the unmodified protein, e.g changes in the proportion of the protein +that is phosphorylated.

+

To acheive this, we will jointly model the phospho and unmodified +peptides which overlap each phosphosite, using a linear model with an +interaction term.

+
+
+

Input data

+

We start by reading in the MSnSets created in the previous notebook +QC +PSM-level quantification, filtering and summarisation to protein-level +abundance

total <- readRDS(here('results/total.rds'))
 phospho_sites <- readRDS(here('results/phospho_sites.rds'))

Below, we identify the total peptides that overlap each phosphosite. Note that each phosphosite may intersect multiple total peptides (due to missed cleavages) and each total peptide may intersect multiple -phosphosites (due to multiple phosphosites per peptide). The code chunk -takes a while to run..

+phosphosites (due to multiple phosphosites per peptide).

phospho_f <- phospho_sites %>% fData() %>% tibble::rownames_to_column('ID')
 total_f <- total %>% fData() %>% tibble::rownames_to_column('ID')
 
@@ -488,9 +497,7 @@ 

2023-02-06

phospho_peptide_to_total_peptides[[row$ID]] <- total_peptide_ids }
## Warning: NAs introduced by coercion
-
 ## Warning: NAs introduced by coercion
-
 ## Warning: NAs introduced by coercion

Now, how many overlapping features. Note that for the phospho and total to be considered intersecting, there must be at least one total @@ -609,8 +616,8 @@

2023-02-06

## 131_total x6 3 total

Below, we subset to the first 2 โ€˜conditionsโ€™, where condition 1 = tags 1-4 (x1 phospho; x1 total) and condition 2 = tags 5-7 (x6 phospho; -x2 total). The ground truth for the difference between the first 4 tags -and the next 3 is therefore a 3-fold increase in phosphorylation for the +x2 total). The ground truth for the difference between condition 2 and +condition 1 is therefore a 3-fold increase in phosphorylation for the yeast proteins and reduced phosphorylation for the human proteins. For the human proteins, given the amounts of spike in yeast and balancing human proteins to make up the labelled material, we expect a 30% drop in @@ -632,12 +639,78 @@

2023-02-06

## 128C_total x2 2 total ## 129N_total x2 2 total ## 129C_total x2 2 total -

Below, we perform the limma analysis

-
dat <- exprs(pairwise_comparison_combined_res) %>% log(base=2)
-
-type <- factor(pData(pairwise_comparison_combined_res)$type, level=c('total', 'phospho'))
+
+
+

Run limma

+

Below, we perform the limma analysis. First though, some +clarification on the model weโ€™re using.

+

We have two experimental variables: - type: phospho or total - +condition: 1x or 2x spike in

+
type <- factor(pData(pairwise_comparison_combined_res)$type, level=c('total', 'phospho'))
 condition <- factor(pData(pairwise_comparison_combined_res)$condition, levels=1:2)
-
+print(paste(type, condition))
+
##  [1] "phospho 1" "phospho 1" "phospho 1" "phospho 1" "phospho 2" "phospho 2"
+##  [7] "phospho 2" "total 1"   "total 1"   "total 1"   "total 1"   "total 2"  
+## [13] "total 2"   "total 2"
+

When we use an model with the terms type, condition and +type:condition, the interaction term captures the difference between the +observed data and the simple additive model when the abundance is +dependent upon just the separate effect of the type and condition +variables and there is no combinatorial effect.

+
# model without interaction term
+print(model.matrix(~type+condition))
+
##    (Intercept) typephospho condition2
+## 1            1           1          0
+## 2            1           1          0
+## 3            1           1          0
+## 4            1           1          0
+## 5            1           1          1
+## 6            1           1          1
+## 7            1           1          1
+## 8            1           0          0
+## 9            1           0          0
+## 10           1           0          0
+## 11           1           0          0
+## 12           1           0          1
+## 13           1           0          1
+## 14           1           0          1
+## attr(,"assign")
+## [1] 0 1 2
+## attr(,"contrasts")
+## attr(,"contrasts")$type
+## [1] "contr.treatment"
+## 
+## attr(,"contrasts")$condition
+## [1] "contr.treatment"
+
# model with an interaction term
+print(model.matrix(~type+condition+type:condition))
+
##    (Intercept) typephospho condition2 typephospho:condition2
+## 1            1           1          0                      0
+## 2            1           1          0                      0
+## 3            1           1          0                      0
+## 4            1           1          0                      0
+## 5            1           1          1                      1
+## 6            1           1          1                      1
+## 7            1           1          1                      1
+## 8            1           0          0                      0
+## 9            1           0          0                      0
+## 10           1           0          0                      0
+## 11           1           0          0                      0
+## 12           1           0          1                      0
+## 13           1           0          1                      0
+## 14           1           0          1                      0
+## attr(,"assign")
+## [1] 0 1 2 3
+## attr(,"contrasts")
+## attr(,"contrasts")$type
+## [1] "contr.treatment"
+## 
+## attr(,"contrasts")$condition
+## [1] "contr.treatment"
+

Below, we run the linear modeling. For more details about +limma, see the Differential +abundance testing for TMT proteomics notebook.

+
dat <- exprs(pairwise_comparison_combined_res) %>% log(base=2)
 study.design <- model.matrix(~type*condition)
 
 fit <- lmFit(dat, study.design)
@@ -646,7 +719,8 @@ 

2023-02-06

limma.results <- topTable(fit, coef = colnames(fit$coefficients)[4], n = Inf, confint=TRUE) limma.results$sigma <- fit$sigma

We would like to add information about the species to make sure the -fold-changes are in the expected direction

+fold-changes are in the expected direction. Weโ€™ll use the +uniprotREST package to query

limma.results$protein <- rownames(limma.results) %>%
   strsplit(split='_') %>%
   sapply('[[', 1) 
@@ -659,9 +733,12 @@ 

2023-02-06

mutate(species=recode(Organism, "Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)"="S.cerevisiae", "Homo sapiens (Human)"="H.sapiens"))
-
## Job ID: ea3205dfeb2e33b7529b68ba58a550f55cabc604
-
## Page 1 of 1
-
## Success
+
## Running job: e6f48376e0a73829d0b31e381f5d84a9191fa8da 
+## Checking job status...
+## Job complete!
+##  Downloading: page 1 of 1
+

Below, we plot the limma results using a volcano plot, with two +panels, one for each species.

limma.results %>%
   merge(species_res, by.x='protein', by.y='From') %>%
   ggplot(aes(logFC, -log10(P.Value), fill=adj.P.Val<0.01)) +
@@ -672,10 +749,14 @@ 

2023-02-06

xlab('Log2 fold change') + ylab('-log10(p-value)') + scale_fill_manual(values=c('grey90', get_cat_palette(2)[2]), name='FDR < 1%')
-

-OK, so the direction of change is as expected (reduced for human, -increased for yeast), but only a subset of the fold-changes reach a 1% -FDR threshold for significance.

+

+

The direction of change is as expected (reduced for human, increased +for yeast), but only a subset of the fold-changes reach a 1% FDR +threshold for significance. The fold-changes are appoximately what we +would expect too: human phosphosites should be reduced by 30% = -0.51 on +a log2 scale and the yeast phosphosites should be increased by 3-fold = +1.58 on a log2 scale.

+
diff --git a/Markdowns/results/benchmark_tmt_phospho_psm_filt.rds b/Markdowns/results/benchmark_tmt_phospho_psm_filt.rds index e62b3aa..867efd5 100644 Binary files a/Markdowns/results/benchmark_tmt_phospho_psm_filt.rds and b/Markdowns/results/benchmark_tmt_phospho_psm_filt.rds differ diff --git a/Markdowns/results/phospho_sites.rds b/Markdowns/results/phospho_sites.rds index 0960ffc..bee7524 100644 Binary files a/Markdowns/results/phospho_sites.rds and b/Markdowns/results/phospho_sites.rds differ diff --git a/Markdowns/results/total.rds b/Markdowns/results/total.rds index 6199def..7f571cb 100644 Binary files a/Markdowns/results/total.rds and b/Markdowns/results/total.rds differ diff --git a/index.md b/index.md index fd3e855..5797d20 100644 --- a/index.md +++ b/index.md @@ -75,7 +75,8 @@ obtain the quantification data Additional subsections are included to cover further topics for each flavour. - +In addition to the core part of the course, there are extended materials to cover: +- Phosphoproteomics using Tandem Mass Tags ### 1. Label-Free Quantification (LFQ) @@ -105,10 +106,11 @@ Additional subsections are included to cover further topics for each flavour. - [Incorporation rate testing](https://mrctoxbioinformatics.github.io/Proteomics_data_analysis/Markdowns/SILAC_incorporation.html) +## Extended materials +- [Phosphoproteomics using TMT](https://mrctoxbioinformatics.github.io/Proteomics_data_analysis/Markdowns/TMT_phospho.html) - - +- [Phosphoproteomics statistical testing](https://mrctoxbioinformatics.github.io/Proteomics_data_analysis/Markdowns/TMT_phospho_stats.html) ## Additional resources