Retinoblastoma analysis (SCPCP000011) #588
patelgrp
started this conversation in
Propose a new analysis
Proposed analysis
We will annotate the collection of retinoblastoma datasets covered in SCPCP000011 (n=26 samples). Our pipeline involves a series of cleanup, QC filtering, and automated annotation steps prior to expert-guided refinement. Specifically:
we will start with count matrices generated by ALSF
we will perform doublet prediction using Demuxafy
we will filter low-quality cells/nuclei (using number of genes/cell and percent of UMIs from mitochondrial genes as filter thresholds)
we will perform copy number calling using inferCNV and, if aligned BAM files are available, Numbat
we will then use a three-tiered approach to annotate non-malignant clusters:
first, using a pure merge of all the datasets, we will identify cells/nuclei that show a high degree of mixing across samples, as measured by LISI (local inverse Simpson's index). Clusters of such cells are likely to be non-malignant stroma because, in our experience, inter-individual variation in those cell types is low.
second, we will perform automated annotation using SingleR with the Human Cell Atlas as a reference. Cells annotated as endothelial or immune cells will be labeled non-malignant.
third, clusters with no inferred copy number alterations will be annotated as non-malignant.
Finally, we will use Seurat's layer integration pipeline to generate an integrated dataset (in our hands, scVI or reciprocal PCA performs best).
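To illustrate the mixing measure used in the first annotation tier, here is a minimal sketch of an inverse Simpson's index computed over each cell's nearest neighbours. It is a simplification and not the published LISI method, which weights neighbours with a perplexity-based kernel. The function name `simplified_lisi`, the toy coordinates, the sample labels, and `k=3` are all assumptions made for illustration.

```python
import math
from collections import Counter


def simplified_lisi(coords, sample_labels, k=3):
    """Unweighted inverse Simpson's index of sample labels among k nearest neighbours.

    Simplified stand-in for LISI: real LISI uses kernel-weighted neighbourhoods.
    """
    scores = []
    for i, point in enumerate(coords):
        # Brute-force distances to every other cell (fine for a toy example).
        by_distance = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(coords)
            if j != i
        )
        neighbours = [j for _, j in by_distance[:k]]
        counts = Counter(sample_labels[j] for j in neighbours)
        # Simpson's index = probability that two random neighbours share a sample.
        simpson = sum((n / len(neighbours)) ** 2 for n in counts.values())
        scores.append(1.0 / simpson)
    return scores


# Toy embedding: the first four cells come from four different samples and sit
# together (well mixed); the last four all come from one sample (not mixed).
coords = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
]
sample_labels = ["s1", "s2", "s3", "s4", "s5", "s5", "s5", "s5"]

scores = simplified_lisi(coords, sample_labels, k=3)
print([round(s, 2) for s in scores])
```

In this toy case the well-mixed cells score close to 3 (three different samples among three neighbours) and the single-sample cells score 1, which is the kind of contrast the first tier relies on.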
Of note, we have already annotated a majority of these datasets in a previously published analysis (https://www.nature.com/articles/s41467-021-24781-7).
Scientific goals
To annotate the non-malignant and malignant cells within SCPCP000011
Methods or approach
we will start with count matrices generated by ALSF
we will perform doublet prediction using Demuxafy
we will filter low-quality cells/nuclei (using number of genes/cell and percent of UMIs from mitochondrial genes as filter thresholds)
we will perform copy number calling using inferCNV and, if aligned BAM files are available, Numbat
we will then use a three-tiered approach to annotate non-malignant clusters:
first, using a pure merge of all the datasets, we will identify cells/nuclei that show a high degree of mixing across samples, as measured by LISI (local inverse Simpson's index). Clusters of such cells are likely to be non-malignant stroma because, in our experience, inter-individual variation in those cell types is low.
second, we will perform automated annotation using SingleR with the Human Cell Atlas as a reference. Cells annotated as endothelial or immune cells will be labeled non-malignant.
third, clusters with no inferred copy number alterations will be annotated as non-malignant.
Finally, we will use Seurat's layer integration pipeline to generate an integrated dataset (in our hands, scVI or reciprocal PCA performs best).
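The tiered annotation described above could be recorded per cluster along the following lines. This is only a sketch under stated assumptions: the proposal does not say how the tiers are combined, so the code simply lists which tiers support a non-malignant call and leaves the final decision to expert review. The `ClusterSummary` fields, the label names in `NON_MALIGNANT_LABELS`, and the `lisi_cutoff` value are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical label names; real SingleR output depends on the reference used.
NON_MALIGNANT_LABELS = {"endothelial", "immune"}


@dataclass
class ClusterSummary:
    name: str
    median_lisi: float    # tier 1: cross-sample mixing in the merged data
    singler_label: str    # tier 2: automated reference-based label
    has_cnv: bool         # tier 3: any inferred copy number alteration


def non_malignant_evidence(cluster, lisi_cutoff=2.0):
    """Return the tiers that support calling this cluster non-malignant."""
    reasons = []
    if cluster.median_lisi >= lisi_cutoff:  # cutoff is an illustrative value
        reasons.append("tier 1: high mixing across samples")
    if cluster.singler_label in NON_MALIGNANT_LABELS:
        reasons.append("tier 2: endothelial/immune reference label")
    if not cluster.has_cnv:
        reasons.append("tier 3: no inferred CNV")
    return reasons


# Toy cluster summaries, invented for illustration only.
clusters = [
    ClusterSummary("c0", median_lisi=3.1, singler_label="immune", has_cnv=False),
    ClusterSummary("c1", median_lisi=1.0, singler_label="neuron", has_cnv=True),
]

evidence = {c.name: non_malignant_evidence(c) for c in clusters}
for name, reasons in evidence.items():
    print(name, reasons if reasons else "no non-malignant evidence")
```

Keeping the per-tier evidence rather than a single yes/no flag makes it easier to see where the tiers disagree during the expert-guided refinement step.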
Existing modules
Yes, this module is based on an existing module for Ewing sarcoma samples in #292 (comment).
Input data
The analysis will use count matrices extracted from the SingleCellExperiment objects for SCPCP000011
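As a rough illustration of how the QC thresholds named in the methods (genes per cell and percent of UMIs from mitochondrial genes) might be applied to such a count matrix, here is a minimal sketch. The function name `qc_pass`, the threshold values, the toy counts, and the assumption that mitochondrial genes can be recognised by an `MT-` gene-symbol prefix are illustrative choices, not values taken from the proposal.

```python
def qc_pass(counts, gene_names, min_genes=2, max_pct_mito=20.0):
    """Return one boolean per cell: True if the cell passes both QC filters.

    counts: list of per-cell count vectors (rows = cells, columns = genes).
    Thresholds are placeholders; real values would be chosen per dataset.
    """
    keep = []
    for cell_counts in counts:
        total_umis = sum(cell_counts)
        n_genes = sum(1 for c in cell_counts if c > 0)
        # Assumes gene symbols, with mitochondrial genes prefixed "MT-".
        mito_umis = sum(
            c for gene, c in zip(gene_names, cell_counts) if gene.startswith("MT-")
        )
        pct_mito = 100.0 * mito_umis / total_umis if total_umis else 0.0
        keep.append(n_genes >= min_genes and pct_mito <= max_pct_mito)
    return keep


# Toy data invented for illustration.
gene_names = ["RB1", "MYCN", "MT-CO1", "MT-ND1"]
counts = [
    [5, 3, 1, 0],   # 3 genes detected, about 11% mitochondrial
    [1, 0, 6, 5],   # 3 genes detected, about 92% mitochondrial
    [4, 0, 0, 0],   # only 1 gene detected
]

keep = qc_pass(counts, gene_names)
print(keep)
```

If the count matrix is indexed by Ensembl gene IDs rather than symbols, the mitochondrial gene set would need to be supplied explicitly instead of relying on the prefix.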
Scientific literature
https://www.nature.com/articles/s41467-021-24781-7
Other details
No response