Retinoblastoma analysis (SCPCP000011) #588

patelgrp · 2024-07-10T18:50:44Z

Proposed analysis

We will annotate the collection of retinoblastoma datasets in SCPCP000011 (n=26 samples). Our pipeline involves a series of cleanup, QC filtering, and automated annotation steps prior to expert-guided refinement. Specifically:

we will start with count matrices generated by ALSF
we will perform doublet prediction using demuxafy
we will filter low-quality cells/nuclei (using the number of genes per cell and the percentage of UMIs from mitochondrial genes as filter thresholds)
we will perform copy number calling using inferCNV and, if aligned BAM files are available, numbat
we will then use a three-tiered approach to annotate non-malignant clusters:
Tier 1: using a pure merge of all the datasets, we will identify cells/nuclei that show a high degree of mixing across samples, as measured by the local inverse Simpson's index (LISI). These clusters are likely to be non-malignant stroma because, in our experience, inter-individual variation in those cell types is low.
Tier 2: we will perform automated annotation using SingleR with the human cell atlas as a reference. Cells annotated as endothelium or immune cells will be labeled non-malignant.
Tier 3: clusters with no inferred copy number alterations will be labeled non-malignant.
Finally, we will use Seurat's layer integration pipeline to generate an integrated dataset (in our hands, scVI or reciprocal PCA performs best).
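
As a rough illustration of the QC filtering step above, here is a minimal NumPy sketch, not the actual pipeline code. The function name `qc_filter`, the threshold values, and the use of an `MT-` gene-symbol prefix to find mitochondrial genes are all assumptions made for the example; the proposal does not state thresholds, and the real count matrices may use Ensembl IDs rather than symbols.

```python
import numpy as np

def qc_filter(counts, gene_names, min_genes=200, max_pct_mito=20.0):
    """Return a boolean mask of cells/nuclei that pass QC.

    counts: (cells x genes) array of UMI counts.
    min_genes and max_pct_mito are placeholder thresholds (assumptions).
    """
    counts = np.asarray(counts)
    # Assumption: mitochondrial genes are identified by an "MT-" symbol prefix.
    is_mito = np.array([g.upper().startswith("MT-") for g in gene_names])

    genes_per_cell = (counts > 0).sum(axis=1)
    total_umis = counts.sum(axis=1)
    mito_umis = counts[:, is_mito].sum(axis=1)

    # Percent of UMIs from mitochondrial genes; empty barcodes get 0.
    pct_mito = np.divide(
        100.0 * mito_umis,
        total_umis,
        out=np.zeros(len(total_umis), dtype=float),
        where=total_umis > 0,
    )
    return (genes_per_cell >= min_genes) & (pct_mito <= max_pct_mito)
```

The mask can then be used to subset the matrix, e.g. `counts[qc_filter(counts, gene_names)]`.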

Of note, we have already annotated a majority of these datasets in a previously published analysis (https://www.nature.com/articles/s41467-021-24781-7).

Scientific goals

To annotate the non-malignant and malignant cells within SCPCP000011

Methods or approach

we will start with count matrices generated by ALSF
we will perform doublet prediction using demuxafy
we will filter low-quality cells/nuclei (using the number of genes per cell and the percentage of UMIs from mitochondrial genes as filter thresholds)
we will perform copy number calling using inferCNV and, if aligned BAM files are available, numbat
we will then use a three-tiered approach to annotate non-malignant clusters:
Tier 1: using a pure merge of all the datasets, we will identify cells/nuclei that show a high degree of mixing across samples, as measured by the local inverse Simpson's index (LISI). These clusters are likely to be non-malignant stroma because, in our experience, inter-individual variation in those cell types is low.
Tier 2: we will perform automated annotation using SingleR with the human cell atlas as a reference. Cells annotated as endothelium or immune cells will be labeled non-malignant.
Tier 3: clusters with no inferred copy number alterations will be labeled non-malignant.
Finally, we will use Seurat's layer integration pipeline to generate an integrated dataset (in our hands, scVI or reciprocal PCA performs best).
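
To make the mixing measure in the first annotation tier more concrete, here is a simplified sketch of a per-cell inverse Simpson's index over sample labels in each cell's nearest neighbors. It is an assumption-laden illustration, not the LISI implementation itself: the published LISI weights neighbors with a perplexity-based Gaussian kernel, whereas this version uses plain unweighted k-nearest neighbors. The function name `knn_inverse_simpson` and the default `k` are hypothetical.

```python
import numpy as np

def knn_inverse_simpson(embedding, sample_labels, k=30):
    """Unweighted kNN inverse Simpson's index per cell (simplified LISI).

    embedding: (cells x dims) array, e.g. PCA coordinates of the merged data.
    sample_labels: one sample/donor label per cell.
    Scores near 1 mean neighbors come from one sample; higher means more mixing.
    """
    embedding = np.asarray(embedding, dtype=float)
    labels = np.asarray(sample_labels)
    n = embedding.shape[0]
    k = min(k, n - 1)

    # Dense pairwise squared distances: fine for a sketch, O(n^2) memory.
    diff = embedding[:, None, :] - embedding[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    np.fill_diagonal(sq_dist, np.inf)  # a cell is not its own neighbor
    neighbors = np.argsort(sq_dist, axis=1)[:, :k]

    scores = np.empty(n)
    for i in range(n):
        # Proportion of each sample among the k neighbors of cell i.
        _, counts = np.unique(labels[neighbors[i]], return_counts=True)
        p = counts / counts.sum()
        scores[i] = 1.0 / np.sum(p ** 2)
    return scores
```

Averaging these scores per cluster would give one way to rank clusters by mixing; any cutoff for calling a cluster "well mixed" would still need to be chosen and is not specified in the proposal.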

Existing modules

Yes, this module is based on an existing module of Ewing sarcoma samples in #292 (comment).

Input data

The analysis will use count matrices extracted from the SingleCellExperiment objects for SCPCP000011.

Scientific literature

https://www.nature.com/articles/s41467-021-24781-7

Other details

No response
