Explore computing metacells #362
Noting this metacell toolkit, which might be helpful: https://github.com/GfellerLab/MetacellAnalysisToolkit
After reviewing the literature some more, I will probably start with SEACells. It seems to outperform Metacell-2 in compactness and separation as well as cell type purity. Unfortunately, that is the only benchmark they seem to have computed. I may do a quick comparison of runtimes with the SuperCell package as well, but my plan is to implement each method in a separate PR, then create a comparison notebook for analysis. I will start filing some issues to begin the analysis soon.
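For the comparison notebook, cell type purity could be computed along these lines. This is only a rough sketch, assuming purity means the fraction of cells in each metacell that carry that metacell's most common cell type label; the function name `metacell_purity` and the toy labels are hypothetical and not taken from any of the packages above.

```python
from collections import Counter, defaultdict


def metacell_purity(assignments, cell_types):
    """Return {metacell: fraction of its cells with the most common label}.

    `assignments` and `cell_types` are parallel sequences with one entry
    per cell (hypothetical inputs for illustration).
    """
    if len(assignments) != len(cell_types):
        raise ValueError("assignments and cell_types must have the same length")

    # Collect the cell type labels observed in each metacell.
    labels_by_metacell = defaultdict(list)
    for metacell, cell_type in zip(assignments, cell_types):
        labels_by_metacell[metacell].append(cell_type)

    # Purity is the share of the dominant label within each metacell.
    purity = {}
    for metacell, labels in labels_by_metacell.items():
        top_count = Counter(labels).most_common(1)[0][1]
        purity[metacell] = top_count / len(labels)
    return purity


# Toy example: mc0 is mixed, mc1 is a single cell type.
assignments = ["mc0", "mc0", "mc0", "mc1", "mc1", "mc1", "mc1"]
cell_types = ["T cell", "T cell", "B cell", "tumor", "tumor", "tumor", "tumor"]
purity = metacell_purity(assignments, cell_types)
```

Running the same function on the assignments from each method would give per-metacell purity distributions that can be compared directly in the notebook.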
Proposed analysis
We should explore some of the methods for computing metacell reductions of single-cell data, including the Metacell-2 algorithm and SEACells.
Scientific goals
The full ScPCA dataset is quite large, and individual cell data can be quite noisy, which makes some inference difficult. A number of methods have been proposed to combine information across cells into "metacells" that can be used in downstream analysis.
We should explore creating reduced datasets using a metacell algorithm. Initial exploration may include some benchmarking of different methods to decide on a method to apply across ScPCA.
Once decided, we would implement a workflow to calculate metacells across all samples. This could then be provided for use in various downstream analyses to reduce runtimes and improve robustness, including cell typing, integration, differential expression analysis, etc.
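As a rough illustration of what a reduced dataset could look like, the sketch below sums per-cell counts within each metacell. It assumes a metacell algorithm has already produced one assignment per cell, and that summing raw counts is an acceptable aggregation; individual methods may aggregate differently. The function name and toy data are hypothetical, and a real workflow would more likely operate on the sparse matrices stored in the AnnData objects rather than on nested lists.

```python
def aggregate_by_metacell(counts, assignments):
    """Sum per-cell count vectors within each metacell.

    `counts` is a list of per-cell gene count lists (cells x genes) and
    `assignments` gives the metacell label of each cell. Both are
    hypothetical inputs for illustration.
    """
    if len(counts) != len(assignments):
        raise ValueError("counts and assignments must have the same length")

    totals = {}
    for row, metacell in zip(counts, assignments):
        # Start a zero vector the first time a metacell is seen.
        if metacell not in totals:
            totals[metacell] = [0] * len(row)
        for gene_index, value in enumerate(row):
            totals[metacell][gene_index] += value
    return totals


# Toy example: four cells, three genes, two metacells.
counts = [
    [1, 0, 3],
    [2, 1, 0],
    [0, 4, 1],
    [5, 0, 0],
]
assignments = ["mc0", "mc0", "mc1", "mc1"]
metacell_counts = aggregate_by_metacell(counts, assignments)
```

Here the four cells collapse to two metacell profiles, which is the kind of reduction that would cut runtimes for downstream steps.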
Methods or approach
A couple of methods are out there now, and more may come:
I would expect development to occur in two phases:
Some tutorials/examples that may be useful:
Existing modules
None
Input data
All of it! The analysis will likely start with the processed AnnData/HDF5 files, using a subset of them for the initial testing and benchmarking.
Scientific literature
Persad et al. (2023) SEACells infers transcriptional and epigenomic cellular states from single-cell genomics data. https://doi.org/10.1038/s41587-023-01716-9
Ben-Kiki et al. (2022) Metacell-2: a divide-and-conquer metacell algorithm for scalable scRNA-seq analysis. https://doi.org/10.1186/s13059-022-02667-1
Ben-Kiki et al. (2023) MCProj: metacell projection for interpretable and quantitative use of transcriptional atlases. Genome Biol 24, 220. https://doi.org/10.1186/s13059-023-03069-7
Baran et al. (2019) MetaCell: analysis of single-cell RNA-seq data using K-nn graph partitions. https://doi.org/10.1186/s13059-019-1812-2
Other details
The earlier we get this done, the more other analyses it will be available for!