Neuroblastoma analysis (SCPCP000004) #589
patelgrp started this conversation in Propose a new analysis
Proposed analysis
We will annotate the collection of neuroblastoma datasets covered in SCPCP000004 (n=40 samples). Our pipeline involves a series of cleanup, QC filtering, and automated annotation steps prior to expert-guided refinement. Specifically:
we will start with count matrices generated by ALSF
we will perform doublet prediction using demuxafy
we will filter low-quality cells/nuclei (using number of genes/cell and percent of UMIs from mitochondrial genes as filter thresholds)
we will perform copy number calling using inferCNV and, if aligned BAM files are available, numbat
we will then use a three-tiered approach to annotate non-malignant clusters:
using a pure merge of all the datasets, we will identify clusters of cells/nuclei that show a high degree of mixing across samples, as measured by the local inverse Simpson's index (LISI). These clusters have a high likelihood of being non-malignant stroma because inter-individual variation in those cell types, in our experience, is low.
we will perform automated annotation using SingleR with the human cell atlas as a reference. Cells annotated as endothelial or immune cells will be labeled non-malignant
finally, clusters with no inferred copy number alterations will be annotated as non-malignant
Lastly, we will use Seurat's layer integration pipeline to generate an integrated dataset (in our hands, scVI or reciprocal PCA performs best).
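To make the QC filtering step concrete, here is a minimal sketch in plain Python. It is only an illustration of filtering on genes per cell and percent mitochondrial UMIs, not our actual pipeline code: the `CellQC` record, the function names, and the cutoffs (`min_genes=200`, `max_pct_mito=20.0`) are all hypothetical placeholders, and real thresholds would be chosen per dataset.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC summary (hypothetical record for illustration)."""
    barcode: str
    n_genes: int      # number of genes detected in the cell/nucleus
    total_umis: int   # total UMI count
    mito_umis: int    # UMIs assigned to mitochondrial genes


def percent_mito(cell: CellQC) -> float:
    """Percent of UMIs from mitochondrial genes (0.0 if the cell has no UMIs)."""
    if cell.total_umis == 0:
        return 0.0
    return 100.0 * cell.mito_umis / cell.total_umis


def passes_qc(cell: CellQC, min_genes: int = 200, max_pct_mito: float = 20.0) -> bool:
    """Keep a cell only if it meets both thresholds (placeholder cutoffs)."""
    return cell.n_genes >= min_genes and percent_mito(cell) <= max_pct_mito


def filter_cells(cells, min_genes: int = 200, max_pct_mito: float = 20.0):
    """Return the cells that pass both QC thresholds, in input order."""
    return [c for c in cells if passes_qc(c, min_genes, max_pct_mito)]


# Toy input: one good cell, one with too few genes, one with high mito content.
cells = [
    CellQC("AAAC", n_genes=1500, total_umis=4000, mito_umis=200),
    CellQC("AAAG", n_genes=150, total_umis=300, mito_umis=15),
    CellQC("AAAT", n_genes=900, total_umis=2000, mito_umis=800),
]
kept = filter_cells(cells)
print([c.barcode for c in kept])
```

In this toy input only the first barcode survives: the second has fewer genes than the cutoff and the third has 40% mitochondrial UMIs.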
Of note, we have performed annotation of many of these samples in two manuscripts:
https://www.biorxiv.org/content/10.1101/2024.01.07.574538v2
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-024-03309-4
Scientific goals
To annotate the non-malignant and malignant cells within SCPCP000004
Methods or approach
we will start with count matrices generated by ALSF
we will perform doublet prediction using demuxafy
we will filter low-quality cells/nuclei (using number of genes/cell and percent of UMIs from mitochondrial genes as filter thresholds)
we will perform copy number calling using inferCNV and, if aligned BAM files are available, numbat
we will then use a three-tiered approach to annotate non-malignant clusters:
using a pure merge of all the datasets, we will identify clusters of cells/nuclei that show a high degree of mixing across samples, as measured by the local inverse Simpson's index (LISI). These clusters have a high likelihood of being non-malignant stroma because inter-individual variation in those cell types, in our experience, is low.
we will perform automated annotation using SingleR with the human cell atlas as a reference. Cells annotated as endothelial or immune cells will be labeled non-malignant
finally, clusters with no inferred copy number alterations will be annotated as non-malignant
Lastly, we will use Seurat's layer integration pipeline to generate an integrated dataset (in our hands, scVI or reciprocal PCA performs best).
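The three tiers above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the implementation we will run: the mixing score here is an unweighted inverse Simpson's index over the k nearest neighbours, whereas the published LISI uses perplexity-based neighbourhood weights; the label strings, the `lisi_cutoff` value, and all function names are hypothetical. The sketch also only reports which tiers support a non-malignant call and does not assume how the tiers are combined.

```python
import math
from collections import Counter

# Hypothetical label strings standing in for endothelial/immune reference labels.
NON_MALIGNANT_LABELS = {"endothelium", "immune"}


def simplified_lisi(coords, samples, k=3):
    """Unweighted inverse Simpson's index of sample labels among each cell's
    k nearest neighbours (a simplification of LISI). Higher means more mixing."""
    scores = []
    for i, point in enumerate(coords):
        # Rank all other cells by Euclidean distance to cell i and keep the k closest.
        neighbours = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: math.dist(point, coords[j]),
        )[:k]
        counts = Counter(samples[j] for j in neighbours)
        simpson = sum((n / len(neighbours)) ** 2 for n in counts.values())
        scores.append(1.0 / simpson)
    return scores


def non_malignant_evidence(mean_lisi, reference_label, has_cnv, lisi_cutoff=2.0):
    """List the tiers that support calling a cluster non-malignant."""
    tiers = []
    if mean_lisi >= lisi_cutoff:                  # tier 1: well mixed across samples
        tiers.append("mixing")
    if reference_label in NON_MALIGNANT_LABELS:   # tier 2: reference-based annotation
        tiers.append("reference")
    if not has_cnv:                               # tier 3: no inferred copy number alterations
        tiers.append("no_cnv")
    return tiers


# Toy embedding: cluster A mixes three samples, cluster B comes from one sample.
coords = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
samples = ["s1", "s2", "s3", "s1", "s1", "s1", "s1", "s1"]
scores = simplified_lisi(coords, samples, k=3)
mean_a = sum(scores[:4]) / 4
mean_b = sum(scores[4:]) / 4
print(non_malignant_evidence(mean_a, "immune", has_cnv=False))
print(non_malignant_evidence(mean_b, "other", has_cnv=True))
```

In the toy data, the mixed cluster is supported by all three tiers, while the single-sample cluster with copy number alterations is supported by none and would remain a malignant candidate for expert review.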
Existing modules
Yes, this module is based on an existing module for Ewing sarcoma samples in #292 (comment).
Input data
The analysis will use count matrices extracted from the SingleCellExperiment objects for SCPCP000004
Scientific literature
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-024-03309-4
https://www.biorxiv.org/content/10.1101/2024.01.07.574538v2
Other details
No response