Neuroblastoma analysis (SCPCP000004) #589

patelgrp · 2024-07-10T18:53:34Z

Proposed analysis

We will annotate the collection of neuroblastoma datasets covered in SCPCP000004 (n=40 samples). Our pipeline involves a series of cleanup, QC filtering, and automated annotation steps prior to expert-guided refinement. Specifically:

- We will start with count matrices generated by ALSF.
- We will perform doublet prediction using Demuxafy.
- We will filter low-quality cells/nuclei, using the number of genes per cell and the percentage of UMIs from mitochondrial genes as filter thresholds.
- We will perform copy number calling using inferCNV and, if aligned BAM files are available, Numbat.
- We will then use a three-tiered approach to annotate non-malignant clusters:
  - Using a pure merge of all the datasets, we will identify clusters of cells/nuclei that show a high degree of mixing across individuals, as measured by the local inverse Simpson's index (LISI). These clusters have a high likelihood of being non-malignant stroma because inter-individual variation in those cell types is, in our experience, low.
  - We will perform automated annotation using SingleR with the Human Cell Atlas as a reference. Cells annotated as endothelial or immune cells will be labeled non-malignant.
  - Clusters with no inferred copy number alterations will be annotated as non-malignant.
- Finally, we will use Seurat's layer integration pipeline to generate an integrated dataset (in our hands, scVI or reciprocal PCA performs best).
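
As a rough illustration of the QC filtering step, here is a minimal Python sketch of a per-cell filter on genes detected and percent mitochondrial UMIs. The `CellQC` record, the function names, and both thresholds (`min_genes=200`, `max_pct_mito=10.0`) are placeholders chosen for the example. The proposal does not state cutoffs, and the actual pipeline would most likely apply this filter inside Seurat rather than in standalone code.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC summary (hypothetical record, for illustration only)."""
    barcode: str
    n_genes: int      # number of genes detected in the cell/nucleus
    mito_umis: int    # UMIs assigned to mitochondrial genes
    total_umis: int   # all UMIs counted in the cell/nucleus


def percent_mito(cell: CellQC) -> float:
    """Percentage of UMIs that come from mitochondrial genes."""
    if cell.total_umis == 0:
        # An empty barcode carries no information; score it so it never passes.
        return 100.0
    return 100.0 * cell.mito_umis / cell.total_umis


def passes_qc(cell: CellQC, min_genes: int = 200, max_pct_mito: float = 10.0) -> bool:
    """Keep cells with enough detected genes and a low mitochondrial fraction.

    Both default thresholds are assumptions, not values from the proposal.
    """
    return cell.n_genes >= min_genes and percent_mito(cell) <= max_pct_mito


# Toy input: one good cell, one with too few genes, one with high mito content.
cells = [
    CellQC("AAAC", n_genes=1500, mito_umis=30, total_umis=4000),
    CellQC("AAAG", n_genes=120, mito_umis=5, total_umis=300),
    CellQC("AAAT", n_genes=900, mito_umis=600, total_umis=2000),
]
kept = [c.barcode for c in cells if passes_qc(c)]
print(kept)
```

In practice the cutoffs would probably be tuned per sample or per library type, since nuclei and whole cells tend to differ in mitochondrial content.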

Of note, we have performed annotation of many of these samples in two manuscripts:
https://www.biorxiv.org/content/10.1101/2024.01.07.574538v2
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-024-03309-4

Scientific goals

To annotate the non-malignant and malignant cells within SCPCP000004

Methods or approach

- We will start with count matrices generated by ALSF.
- We will perform doublet prediction using Demuxafy.
- We will filter low-quality cells/nuclei, using the number of genes per cell and the percentage of UMIs from mitochondrial genes as filter thresholds.
- We will perform copy number calling using inferCNV and, if aligned BAM files are available, Numbat.
- We will then use a three-tiered approach to annotate non-malignant clusters:
  - Using a pure merge of all the datasets, we will identify clusters of cells/nuclei that show a high degree of mixing across individuals, as measured by the local inverse Simpson's index (LISI). These clusters have a high likelihood of being non-malignant stroma because inter-individual variation in those cell types is, in our experience, low.
  - We will perform automated annotation using SingleR with the Human Cell Atlas as a reference. Cells annotated as endothelial or immune cells will be labeled non-malignant.
  - Clusters with no inferred copy number alterations will be annotated as non-malignant.
- Finally, we will use Seurat's layer integration pipeline to generate an integrated dataset (in our hands, scVI or reciprocal PCA performs best).
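
To make the mixing criterion in the first annotation tier concrete, the sketch below computes a simplified per-cell LISI score in Python. It is an assumption-laden illustration and not the implementation we would run. The published LISI weights neighbors with a perplexity-based Gaussian kernel, whereas this version takes the plain inverse Simpson's index over the `k` nearest neighbors. The function names, the toy coordinates, and `k=3` are all hypothetical.

```python
import math
from collections import Counter


def inverse_simpson(labels):
    """Inverse Simpson's index: 1 / sum of squared label proportions."""
    counts = Counter(labels)
    n = len(labels)
    return 1.0 / sum((c / n) ** 2 for c in counts.values())


def knn_lisi(coords, samples, k):
    """Unweighted LISI-like score per cell.

    coords  : list of embedding coordinates (e.g. PCA), one tuple per cell
    samples : sample/individual label for each cell
    k       : number of nearest neighbors (the cell itself is excluded)

    Simplification: neighbors are weighted equally, unlike the
    perplexity-weighted neighborhoods in the published method.
    """
    scores = []
    for i, p in enumerate(coords):
        # Brute-force Euclidean distances to all other cells.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(coords) if j != i
        )
        neighbor_labels = [samples[j] for _, j in dists[:k]]
        scores.append(inverse_simpson(neighbor_labels))
    return scores


# Toy embedding: a cluster shared by individuals A and B, and a cluster
# made up of cells from individual C only.
coords = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
]
samples = ["A", "B", "A", "B", "C", "C", "C", "C"]

scores = knn_lisi(coords, samples, k=3)
print([round(s, 2) for s in scores])
```

Cells in the shared cluster score above 1 because their neighborhoods contain more than one individual, while cells in the individual-specific cluster score exactly 1. Under the reasoning in the first tier, clusters with high scores would be candidates for non-malignant stroma and patient-specific clusters would be candidates for malignant cells.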

Existing modules

Yes, this module is based on an existing module of Ewing sarcoma samples in #292 (comment).

Input data

The analysis will use count matrices extracted from the SingleCellExperiment objects for SCPCP000004.

Scientific literature

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-024-03309-4
https://www.biorxiv.org/content/10.1101/2024.01.07.574538v2

Other details

No response
