Resampling to evaluate clustering stability #27

Open
bednarsky opened this issue Sep 13, 2023 · 0 comments

Summary:
Clustering stability could be assessed by repeating the clustering several times, each time on a random 90% subsample of the data.
A consensus approach could then be used to extract a stable clustering from these runs.
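
To make the proposal concrete, here is a minimal sketch of what the resampling and consensus steps could look like. Everything in it is an assumption for illustration, not an existing implementation: KMeans stands in for whatever clustering method is actually used, and the helper names (`subsample_clusterings`, `pairwise_stability`, `coassociation`, `consensus_labels`), the adjusted Rand index as the stability score, and average-linkage clustering of the co-association matrix are just one possible set of choices.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score


def subsample_clusterings(X, n_clusters=3, n_runs=20, frac=0.9, seed=0):
    """Cluster n_runs random subsamples of X; return a list of (indices, labels)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    size = int(round(frac * n))
    runs = []
    for _ in range(n_runs):
        # Draw a subsample without replacement (90% of the rows by default).
        idx = rng.choice(n, size=size, replace=False)
        km = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=int(rng.integers(2**31 - 1)))
        runs.append((idx, km.fit_predict(X[idx])))
    return runs


def pairwise_stability(runs, n):
    """Mean adjusted Rand index over all pairs of runs, on their shared samples."""
    full = []
    for idx, labels in runs:
        lab = np.full(n, -1)  # -1 marks samples left out of this run
        lab[idx] = labels
        full.append(lab)
    scores = []
    for a in range(len(full)):
        for b in range(a + 1, len(full)):
            shared = (full[a] >= 0) & (full[b] >= 0)
            scores.append(adjusted_rand_score(full[a][shared], full[b][shared]))
    return float(np.mean(scores))


def coassociation(runs, n):
    """Fraction of runs in which two samples share a cluster,
    counted only over runs that contain both samples."""
    together = np.zeros((n, n))
    sampled = np.zeros((n, n))
    for idx, labels in runs:
        same = (labels[:, None] == labels[None, :]).astype(float)
        together[np.ix_(idx, idx)] += same
        sampled[np.ix_(idx, idx)] += 1.0
    return np.divide(together, sampled,
                     out=np.zeros_like(together), where=sampled > 0)


def consensus_labels(C, n_clusters):
    """Cut an average-linkage tree built on 1 - co-association into n_clusters."""
    D = 1.0 - C
    np.fill_diagonal(D, 0.0)
    Z = linkage(squareform(D, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")
```

A call sequence would be `runs = subsample_clusterings(X)`, then `pairwise_stability(runs, len(X))` for a single stability score, and `consensus_labels(coassociation(runs, len(X)), n_clusters)` for a consensus clustering. The co-association matrix is n × n, which adds a memory cost on top of the runtime of the repeated clusterings.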

Drawbacks:
Computationally expensive

Open questions:
How should this be combined with clustification? Should it be used only to evaluate the stability of one approach, or also to generate a consensus clustering automatically?

Background:
Daria implemented a resampling strategy because she noticed that adding two new samples completely changed her previous clustering. She ended up finding gene programs by checking which genes are stably co-differential across clusters, and then derived hard clusters by using thresholds to assign each cell to one or more cluster labels.
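
The threshold step at the end of that description could look roughly like the sketch below. It is not Daria's code: the per-cell program score matrix, the function name `assign_labels`, and the threshold value of 0.5 are all hypothetical and only illustrate how a cell can end up with one, several, or no labels.

```python
import numpy as np


def assign_labels(scores, threshold=0.5):
    """For each cell (row), return the indices of all programs (columns)
    whose score exceeds the threshold; a cell may get zero, one, or many."""
    mask = scores > threshold
    return [np.flatnonzero(row).tolist() for row in mask]


# Hypothetical scores: 3 cells x 3 gene programs.
scores = np.array([[0.9, 0.1, 0.2],
                   [0.6, 0.7, 0.1],
                   [0.2, 0.3, 0.1]])
print(assign_labels(scores))  # [[0], [0, 1], []]
```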

sreichl added the enhancement (New feature or request) label on Sep 13, 2023
sreichl self-assigned this on Sep 27, 2023