Should we skip downsampling before running PCA if we have a low number of cells to start with? #99
-
@denvercal1234GitHub Ideally, the plots should either show an equivalent number of cells from each sample/group (we usually do this with blood/PBMCs), or be downsampled in a group/sample-weighted way (e.g. what we do with our CNS data). However, when the number of cells in some files is extremely low, it can be very difficult to downsample in a truthful way and still retain enough cells to see small populations. If this is the case, then it is OK to just use what you have. Just make sure to describe this clearly in your figure legend and/or methods, so that differences in cell numbers on the plot are not interpreted as reflecting what is happening in the actual sample. In an ideal world there would be plenty of cells in each sample, but that's not always realistic, and this approach allows you to manage that. For example, in a recent COVID-19 paper we did both: in one panel we downsampled in a group-weighted manner because we had enough cells (so differences in the number of cells on the plot represent differences in cells/uL), but in another we didn't because of low cell numbers (so we just used what we had).
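To make the two options concrete, here is a rough Python sketch (not the code we actually use). It assumes each cell is a dict with a `sample` key, and the function names, toy data, and cells/uL weights are all hypothetical. `downsample_equal` keeps the same number of cells per sample, while `downsample_weighted` keeps counts proportional to a per-sample weight without ever asking a sample for more cells than it has.

```python
import random
from collections import defaultdict


def group_by_sample(cells):
    """Group cell records by their (assumed) 'sample' key."""
    groups = defaultdict(list)
    for cell in cells:
        groups[cell["sample"]].append(cell)
    return groups


def downsample_equal(cells, n_per_sample, seed=42):
    """Keep n_per_sample cells per sample, or all of them if fewer exist."""
    rng = random.Random(seed)
    kept = []
    for group in group_by_sample(cells).values():
        kept.extend(rng.sample(group, min(n_per_sample, len(group))))
    return kept


def weighted_targets(available, weights):
    """Per-sample targets proportional to weights, capped at available cells."""
    # Largest scale at which no sample is asked for more cells than it has.
    scale = min(available[s] / weights[s] for s in available)
    return {s: min(available[s], round(scale * weights[s])) for s in available}


def downsample_weighted(cells, weights, seed=42):
    """Downsample so that plotted counts stay proportional to weights."""
    rng = random.Random(seed)
    groups = group_by_sample(cells)
    targets = weighted_targets({s: len(g) for s, g in groups.items()}, weights)
    kept = []
    for sample, group in groups.items():
        kept.extend(rng.sample(group, targets[sample]))
    return kept


# Toy data: sample A has plenty of cells, sample B has very few.
cells = [{"sample": "A", "id": i} for i in range(1000)]
cells += [{"sample": "B", "id": i} for i in range(50)]
weights = {"A": 400.0, "B": 100.0}  # hypothetical cells/uL per sample

equal = downsample_equal(cells, n_per_sample=100)
weighted = downsample_weighted(cells, weights)
```

In this toy case the sample with the fewest cells relative to its weight (B) sets the scale for every other sample, so A is cut from 1000 to 200 cells to keep the 4:1 ratio. That is the trade-off described above: with a very small sample, weighting can leave too few cells overall to see small populations.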
-
Hi there,
I have quite a small number of cells per sample (below), and some samples have far fewer cells than others. Is it a "bad" bias not to downsample before running dimensionality reduction?
Thank you for your guidance!