Does cellbender work with citeseq data? #114
Comments
@smk5g5 Great question. We definitely hope to point this out more in the upcoming paper, since this kind of data wasn't so common when we first wrote the CellBender package.
One more thing to point out is that it is probably beneficial to try several values of "--fpr" when working with antibody capture data. Most antibody capture data carries a lot of additional background noise, so increasing the FPR can sometimes help remove more of it.
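As a rough illustration of trying several FPR values, here is a minimal Python sketch that assembles a single `cellbender remove-background` command covering a few settings. It assumes that `--fpr` accepts several space-separated values in one run (worth confirming with `cellbender remove-background --help` for your installed version). The helper name `build_fpr_sweep_command`, the file names, and the FPR values are hypothetical, and the range check is only a sanity check added for this sketch.

```python
import shlex


def build_fpr_sweep_command(input_h5, output_h5, fpr_values):
    """Build the argv list for one remove-background run over several FPR values.

    Hypothetical helper for illustration; it only assembles the command
    and does not run CellBender.
    """
    if not fpr_values:
        raise ValueError("at least one FPR value is required")
    for fpr in fpr_values:
        # Sanity check added for this sketch: treat FPR as a rate in (0, 1).
        if not 0.0 < fpr < 1.0:
            raise ValueError(f"FPR must be strictly between 0 and 1, got {fpr}")
    return [
        "cellbender", "remove-background",
        "--input", input_h5,
        "--output", output_h5,
        # Assumption: several values may follow a single --fpr flag.
        "--fpr", *[str(fpr) for fpr in fpr_values],
    ]


# Placeholder file names and FPR values, not recommendations.
cmd = build_fpr_sweep_command(
    "raw_feature_bc_matrix.h5", "denoised.h5", [0.01, 0.1, 0.3]
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real data. Comparing the outputs at each FPR is then a matter of inspecting the antibody features in each denoised file.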
Add mention of this in the docs
I gave cellbender a try on CITE data based on this discussion, and the results have been encouraging. I had to use a stratospheric 0.9 for the FPR, but in the end I was able to coax out profiles with a good portion of the soup removed. Given that CITE is inherently soupier than GEX because of experimental factors, would it make sense to model their backgrounds separately? Maybe have a more trigger-happy model for the CITE? Accepting separate FPRs for the two modalities seems like a reasonable heuristic, but maybe there's something smarter that could be done at the algorithm level?
@ktpolanski you make a very interesting point. With the public 10x Genomics pbmc5k dataset above, things kind of worked out: the gene expression and antibody features both cleaned up very nicely at about FPR 0.1, which is pretty reasonable. If you are having to go to FPR 0.9, that does represent a massive deviation from what we'd expect. Two-part answer:
(1) We will (within a few weeks) be releasing cellbender v0.3.0, which constructs the denoised count matrix in a slightly different way from v0.2. In particular, it has per-feature noise removal targets that it tries to hit, based on the dataset. These per-feature targets might be exactly what's needed to fix the issue you're seeing.
(2) If that does not end up fixing the issue you've described, then it is reasonable to consider alternatives. We'd definitely like it to work well on both modalities without having to run twice using different FPR settings. As you say, it does make sense to model their backgrounds separately. But the current model already models essentially every feature separately. If one feature (or one feature type) has way higher background than another, then the model should be able to learn that without a problem. If (1) doesn't fix the issue, then we might need to consider some tweaks to the model in the medium term, starting with figuring out why the antibody features may not be obeying the assumptions in our model, or whether there is some other noise mechanism at play for antibody capture features (though I don't see what it would be...)
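One way to check whether the two modalities are being cleaned to different degrees is to compare the fraction of counts removed per feature type. The sketch below is a plain-Python illustration, not part of CellBender: the function name `fraction_removed_by_type` is hypothetical, and the per-feature totals are made-up toy numbers. It assumes you have already summed the raw and denoised counts per feature and have the 10x feature type label for each feature.

```python
from collections import defaultdict


def fraction_removed_by_type(feature_types, raw_totals, denoised_totals):
    """Return {feature_type: fraction of raw counts removed by denoising}.

    Hypothetical helper: all three inputs are parallel sequences with one
    entry per feature (total counts summed over droplets).
    """
    raw_sum = defaultdict(float)
    denoised_sum = defaultdict(float)
    for ftype, raw, denoised in zip(feature_types, raw_totals, denoised_totals):
        raw_sum[ftype] += raw
        denoised_sum[ftype] += denoised
    return {
        ftype: (1.0 - denoised_sum[ftype] / raw_sum[ftype]) if raw_sum[ftype] > 0 else 0.0
        for ftype in raw_sum
    }


# Toy numbers for illustration only, not real results.
feature_types = ["Gene Expression", "Gene Expression", "Antibody Capture"]
raw_totals = [1000, 500, 2000]
denoised_totals = [900, 450, 1000]

removed = fraction_removed_by_type(feature_types, raw_totals, denoised_totals)
for ftype, fraction in removed.items():
    print(f"{ftype}: {fraction:.2f} of counts removed")
```

If the antibody fraction stays far below what the empty droplets suggest even at high FPR, that would be consistent with the behavior described above.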
Thanks for the response, all of the above sounds very promising. Looking forward to trying out v0.3.0.
Is there an estimate of when v0.3.0 will be released? Thanks in advance.
@mdmanurung The official release will follow the merging of this PR, #189
Does cellbender work with 10x citeseq data?