Wilms tumor annotation (SCPCP000014) #628
-
Hi @UTSouthwesternDSSR. I'm Jen, the Scientific Community Manager at the Data Lab. Thank you for sharing your proposed analysis! Have you filled out the contributor form yet? On this form, you will provide the name and email address that will be associated with the AWS account that we'll create for you. We also need the completed form to confirm that you have agreed to the OpenScPCA terms and conditions and other policies. Once we receive it, our team will review your proposed analyses and get back to you with next steps within 3 business days! In the meantime, please let us know if you have any questions about OpenScPCA. We look forward to discussing more with you soon!
-
Hi @UTSouthwesternDSSR! I'm Stephanie, one of the Data Scientists in the Data Lab. We're looking forward to having you on board as an OpenScPCA contributor! Before you get started, I wanted to provide some additional guidance.

First, we very much appreciate your interest in performing cell type annotation on multiple projects (this discussion, #629, and #630)! Given that you are proposing similar analysis pipelines across each project, we recommend that you begin your OpenScPCA contribution with only one ScPCA project. Feel free to choose which project you'd like to start with. After the code for your first analysis module has been developed, reviewed/approved, and added to the repository, you can begin to apply that code to the other projects you'd like to annotate. Once we reach that stage, we can also check in further to discuss how your module(s) and shared code will be organized.

Next, I'll offer some specific feedback about your proposed approach:
We indeed do recommend starting with the ScPCA processed count matrices (
We are actually right now in the process of running
Note that PCA and UMAP coordinates are already calculated in the processed objects you'll be using, but we recommend that you re-calculate clusters rather than relying on those in the processed object.

Once you are ready to start your analysis, please follow the steps below to start contributing to the project:
After this PR has been reviewed and accepted, you will be ready to continue with the rest of the analysis that you proposed. A key part of your analysis will involve scoping your work to ensure slow-and-steady modular progress towards the final cell type annotations. I would recommend that you break up your work into the following steps, where each bullet point would be an issue and at least one subsequent pull request (PR).
Finally, I just want to add some clarification to some of your opening discussion comments:
First, just an FYI that as an OpenScPCA contributor, you will have access to virtual computers which may be helpful for running analyses you can't run on a laptop. Depending on your HPC setup and what kinds of permissions you have to install dependencies and software environments, you might find it more convenient to work on a virtual computer that we offer.
Please bear in mind that, as an OpenScPCA contributor, your work will be conducted openly and iteratively. This means that, although the final results will of course take some time to fully establish, your analysis code will be openly available the entire time you are working. A two-month timeline to complete cell type annotation is perfectly fine; I just want to highlight that we'll be engaging closely with you throughout this process, rather than you working independently for two months and then sending over results. You can find more information about this in our policies, notably this one:
Let us know if you have any questions or comments, or want to generally discuss anything else! We're happy to have you on board!
-
Proposed analysis
We plan to annotate cell types for the Wilms tumor samples (n=10) in SCPCP000014. Our analysis involves removing low-quality nuclei and doublets, annotating cell types, and identifying tumor cells.
Scientific goals
The goal of this analysis is to curate a validated cell type annotation for the Wilms tumor samples in the portal (SCPCP000014). Specifically, we aim to generate the following outcomes: (i) lists of marker genes to identify cell types in Wilms tumor; (ii) separation of tumor cells from normal cells; (iii) refined annotation of cell types among normal cells; (iv) annotation of sub-groups among tumor cells, if applicable.
Methods or approach
1. We will start with the processed count matrices provided by ALSF.
2. We will remove any doublets found in each sample, using available tools like DoubletFinder.
3. We will first perform an automated cell type annotation using SingleR with public references, including datasets from the literature. Then we will run PCA, clustering, and UMAP visualization, where we expect cells of the same type to cluster together.
4. The annotation generated in step 3 will then be complemented with manually curated marker gene lists, using tools like enrichr or cellassign.
5. We will use CNV-based tools like CopyKat to classify tumor and normal cells. Cells without inferred copy number alterations will be annotated as normal cells.
6. Finally, we will merge the whole cohort together and perform PCA, clustering, and UMAP visualization. We expect normal cells of the same type to cluster together, and tumor cells to be separated by inter-sample heterogeneity. We will fine-tune the cell type annotation if any cells group with other cell types.
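As a rough illustration of how the automated labels, the clusters, and the CNV calls described above might be combined, here is a minimal, language-agnostic sketch written in Python (the analysis itself is planned in R). Everything in it is an assumption made for illustration: the function names, the majority-vote smoothing of labels within a cluster, the call strings "aneuploid" and "diploid", and the toy cell type names are hypothetical and are not taken from SingleR, CopyKat, or the proposal.

```python
from collections import Counter, defaultdict


def consensus_labels(clusters, labels):
    """Return the most frequent automated label within each cluster."""
    votes = defaultdict(Counter)
    for cluster, label in zip(clusters, labels):
        votes[cluster][label] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()}


def annotate(clusters, auto_labels, cnv_calls):
    """Combine per-cell CNV calls with cluster-level consensus labels.

    Assumed rule: cells called "aneuploid" become "tumor", cells called
    "diploid" take their cluster's consensus label, and anything else is
    left "unclassified" for manual review.
    """
    consensus = consensus_labels(clusters, auto_labels)
    final = []
    for cluster, cnv in zip(clusters, cnv_calls):
        if cnv == "aneuploid":
            final.append("tumor")
        elif cnv == "diploid":
            final.append(consensus[cluster])
        else:
            final.append("unclassified")
    return final


# Toy input: six cells in two clusters (all values are made up).
clusters = [0, 0, 0, 1, 1, 1]
auto_labels = ["podocyte", "podocyte", "endothelial",
               "endothelial", "endothelial", "podocyte"]
cnv_calls = ["diploid", "diploid", "aneuploid",
             "diploid", "not.defined", "diploid"]

final_labels = annotate(clusters, auto_labels, cnv_calls)
print(final_labels)
```

Keeping an explicit "unclassified" bucket, rather than forcing every cell into tumor or normal, is one way to flag the cells that would need the manual fine-tuning mentioned in the last step.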
Existing modules
This module is based on the existing annotation workflow for Ewing sarcoma samples in #292, with some modifications.
Input data
This analysis will use the processed count matrices, stored as SingleCellExperiment objects, from SCPCP000014.
Scientific literature
The following datasets might be useful when curating marker gene lists and/or cell type references:
https://pubmed.ncbi.nlm.nih.gov/36718830/
https://pubmed.ncbi.nlm.nih.gov/30093597/
Other details
This analysis will be performed on our local machine and HPC, and will be conducted in R. We plan to share the annotation within two months.