Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 1 changed file with 79 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
--- | ||
title: "Dataset investigation: What to do when you get your data" | ||
format: html | ||
editor: visual | ||
minimal: true | ||
date: 'Compiled: `r format(Sys.Date(), "%B %d, %Y")`' | ||
vignette: > | ||
%\VignetteIndexEntry{Dataset investigation: What to do when you get your data} | ||
%\VignetteEngine{quarto::html} | ||
%\VignetteEncoding{UTF-8} | ||
--- | ||
|
||
## Under Construction | ||
|
||
```{r setup, include=FALSE} | ||
library(knitr) | ||
library(quarto) | ||
knitr::opts_knit$set(root.dir = './') | ||
``` | ||
|
||
# Introduction | ||
|
||
So, you (or your amazing lab mate) have finally finished the data acquisition, | ||
and now you have a dataset in hand. What's next? Unfortunately, the work isn't | ||
over yet. Before diving into any analysis, it's crucial to understand the | ||
dataset itself. This is the first step in any data analysis workflow, ensuring | ||
that the data is of good quality and is well-prepared for preprocessing and any | ||
downstream analysis you plan to perform. | ||
|
||
In this vignette, we present the dataset used throughout the different vignettes | ||
of this website. It's far from a *perfect* dataset, which actually mirrors the | ||
reality of most datasets you'll encounter in research. | ||
|
||
Some issues will be specific to the dataset described here. However, the | ||
purpose of this vignette is to encourage you to think critically about your data | ||
and guide you through steps that can help you avoid spending hours on an | ||
analysis, only to realize later that some samples or features should have been | ||
removed or flagged earlier on. | ||
|
||
# Dataset Description | ||
|
||
In this workflow, two datasets are used: | ||
|
||
1. An LC-MS-based (MS1 level only) untargeted metabolomics dataset to quantify | ||
small polar metabolites in human plasma samples. | ||
2. An additional LC-MS/MS dataset of selected samples from the former study for | ||
the identification and annotation of significant features. | ||
|
||
The samples were randomly selected from a larger study aimed at identifying | ||
metabolites with varying abundances between individuals suffering from | ||
cardiovascular disease (CVD) and healthy controls (CTR). The subset analyzed | ||
here includes data for three CVD patients, three CTR individuals, and four | ||
quality control (QC) samples. The QC samples, representing a pooled serum sample | ||
from a large cohort, were measured repeatedly throughout the experiment to | ||
monitor signal stability. | ||
|
||
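A first sanity check on a new dataset is to confirm that the sample table matches the stated design. The sketch below is only an illustration: the sample names, the `phenotype` column, and the injection order are hypothetical placeholders, not the real metadata of the study. It builds a table with three CVD, three CTR, and four QC samples, checks the group counts, and looks at how the QC injections are spaced over the run, since evenly spread QC samples are what make it possible to monitor signal stability.

```r
# Hypothetical sample table mirroring the design described above:
# three CVD, three CTR and four pooled QC samples.
# Sample names and injection order are placeholders, not the real metadata.
sample_data <- data.frame(
    sample_name = c("QC_1", "CVD_1", "CTR_1", "QC_2", "CVD_2", "CTR_2",
                    "QC_3", "CVD_3", "CTR_3", "QC_4"),
    phenotype = c("QC", "CVD", "CTR", "QC", "CVD", "CTR",
                  "QC", "CVD", "CTR", "QC"),
    injection_index = 1:10
)

# Count samples per group and compare against the expected design.
group_counts <- table(sample_data$phenotype)
expected <- c(CTR = 3L, CVD = 3L, QC = 4L)
design_ok <- all(group_counts[names(expected)] == expected)

# Check that QC injections are spread across the run rather than clustered:
# large or uneven gaps would make drift harder to track.
qc_positions <- sample_data$injection_index[sample_data$phenotype == "QC"]
qc_gaps <- diff(qc_positions)

print(group_counts)
print(design_ok)
print(qc_gaps)
```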
The data and metadata for this workflow are available on the MetaboLights | ||
database under the ID: MTBLS8735. | ||
|
||
The detailed materials and methods used for the sample analysis are also | ||
available in the MetaboLights entry, which is particularly important for | ||
understanding the analysis and the parameters used. The samples were analyzed | ||
using ultra-high-performance liquid chromatography (UHPLC) coupled to a Q-TOF | ||
mass spectrometer (TripleTOF 5600+), with chromatographic separation achieved | ||
by hydrophilic interaction liquid chromatography (HILIC). | ||
|
||
- Consider moving visualizations from the end-to-end vignette to here for a | ||
clearer understanding of the dataset. | ||
- Provide more in-depth visualizations to explore and understand the dataset | ||
quality. | ||
- Compare the pooled LC-MS and LC-MS/MS runs and show that separation is | ||
better in the second run. | ||
|
||
```{r} | ||
# Debugging aid: confirm the working directory and list its contents | ||
getwd() | ||
list.files() | ||
list.dirs() | ||
``` |