Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

error in dataset simulation: length(class) == 1L is not TRUE #79

Open
YiyiChen0328 opened this issue Jul 31, 2021 · 2 comments
Open

error in dataset simulation: length(class) == 1L is not TRUE #79

YiyiChen0328 opened this issue Jul 31, 2021 · 2 comments

Comments

@YiyiChen0328
Copy link

Hi Muscat team,

I am trying to simulate a dataset, and when I try prepSim, this error occurs:

> > ref <- prepSim(DC.sce, verbose = TRUE) 
> Argument `group_keep` unspecified; defaulting to retaining “Healthy”-group samples.
> Filtering...
> - 4610/27248 genes and 319/2236 cells retained.
> - 1/2 subpopulations and 2/5 samples retained.
> Error in getS3method(f, signature, optional = TRUE) : 
>   length(class) == 1L is not TRUE

The Dc.sce was converted from a Seurat object and the code I am using is:

DC_Diet <- DietSeurat(DC)
DefaultAssay(DC_Diet) <- "RNA"
DC.sce <- as.SingleCellExperiment(DC_Diet)

I compared the differences between Dc.sce and the example sce, the biggest difference is the $metadata.

image
image

Could you please help me figure this out?

Many thanks,
Yiyi

R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8       
 [4] LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] dplyr_1.0.7                 SingleCellExperiment_1.14.1 SummarizedExperiment_1.22.0
 [4] Biobase_2.52.0              GenomicRanges_1.44.0        GenomeInfoDb_1.28.1        
 [7] IRanges_2.26.0              S4Vectors_0.30.0            BiocGenerics_0.38.0        
[10] MatrixGenerics_1.4.0        matrixStats_0.59.0          SeuratObject_4.0.2         
[13] Seurat_4.0.3                muscat_1.7.1               

loaded via a namespace (and not attached):
  [1] utf8_1.2.1                reticulate_1.20           tidyselect_1.1.1         
  [4] lme4_1.1-27.1             RSQLite_2.2.7             AnnotationDbi_1.54.1     
  [7] htmlwidgets_1.5.3         grid_4.1.0                BiocParallel_1.26.1      
 [10] Rtsne_0.15                devtools_2.4.2            munsell_0.5.0            
 [13] ScaledMatrix_1.0.0        codetools_0.2-18          ica_1.0-2                
 [16] future_1.21.0             miniUI_0.1.1.1            withr_2.4.2              
 [19] colorspace_2.0-2          rstudioapi_0.13           ROCR_1.0-11              
 [22] tensor_1.5                listenv_0.8.0             GenomeInfoDbData_1.2.6   
 [25] polyclip_1.10-0           bit64_4.0.5               rprojroot_2.0.2          
 [28] glmmTMB_1.1.2             parallelly_1.26.1         vctrs_0.3.8              
 [31] generics_0.1.0            R6_2.5.0                  doParallel_1.0.16        
 [34] ggbeeswarm_0.6.0          clue_0.3-59               rsvd_1.0.5               
 [37] locfit_1.5-9.4            spatstat.utils_2.2-0      bitops_1.0-7             
 [40] cachem_1.0.5              DelayedArray_0.18.0       assertthat_0.2.1         
 [43] promises_1.2.0.1          scales_1.1.1              beeswarm_0.4.0           
 [46] gtable_0.3.0              beachmat_2.8.0            Cairo_1.5-12.2           
 [49] globals_0.14.0            processx_3.5.2            goftest_1.2-2            
 [52] rlang_0.4.11              genefilter_1.74.0         GlobalOptions_0.1.2      
 [55] splines_4.1.0             TMB_1.7.20                lazyeval_0.2.2           
 [58] spatstat.geom_2.2-0       broom_0.7.8               BiocManager_1.30.16      
 [61] abind_1.4-5               reshape2_1.4.4            backports_1.2.1          
 [64] httpuv_1.6.1              usethis_2.0.1             tools_4.1.0              
 [67] ggplot2_3.3.5             spatstat.core_2.2-0       ellipsis_0.3.2           
 [70] gplots_3.1.1              RColorBrewer_1.1-2        sessioninfo_1.1.1        
 [73] ggridges_0.5.3            Rcpp_1.0.7                plyr_1.8.6               
 [76] sparseMatrixStats_1.4.0   progress_1.2.2            zlibbioc_1.38.0          
 [79] purrr_0.3.4               RCurl_1.98-1.3            ps_1.6.0                 
 [82] prettyunits_1.1.1         rpart_4.1-15              deldir_0.2-10            
 [85] pbapply_1.4-3             GetoptLong_1.0.5          viridis_0.6.1            
 [88] cowplot_1.1.1             zoo_1.8-9                 ggrepel_0.9.1            
 [91] cluster_2.1.2             fs_1.5.0                  colorRamps_2.3           
 [94] variancePartition_1.22.0  magrittr_2.0.1            data.table_1.14.0        
 [97] scattermore_0.7           lmerTest_3.1-3            circlize_0.4.13          
[100] lmtest_0.9-38             RANN_2.6.1                fitdistrplus_1.1-5       
[103] pkgload_1.2.1             hms_1.1.0                 patchwork_1.1.1          
[106] mime_0.11                 xtable_1.8-4              pbkrtest_0.5.1           
[109] XML_3.99-0.6              gridExtra_2.3             shape_1.4.6              
[112] testthat_3.0.4            compiler_4.1.0            scater_1.20.1            
[115] tibble_3.1.2              KernSmooth_2.23-20        crayon_1.4.1             
[118] minqa_1.2.4               htmltools_0.5.1.1         mgcv_1.8-36              
[121] later_1.2.0               tidyr_1.1.3               geneplotter_1.70.0       
[124] DBI_1.1.1                 ComplexHeatmap_2.8.0      MASS_7.3-54              
[127] boot_1.3-28               Matrix_1.3-4              cli_3.0.0                
[130] igraph_1.2.6              pkgconfig_2.0.3           numDeriv_2016.8-1.1      
[133] spatstat.sparse_2.0-0     plotly_4.9.4.1            scuttle_1.2.0            
[136] foreach_1.5.1             annotate_1.70.0           vipor_0.4.5              
[139] blme_1.0-5                XVector_0.32.0            callr_3.7.0              
[142] stringr_1.4.0             digest_0.6.27             sctransform_0.3.2        
[145] RcppAnnoy_0.0.18          spatstat.data_2.1-0       Biostrings_2.60.1        
[148] leiden_0.3.8              uwot_0.1.10               edgeR_3.34.0             
[151] DelayedMatrixStats_1.14.0 shiny_1.6.0               gtools_3.9.2             
[154] rjson_0.2.20              nloptr_1.2.2.2            lifecycle_1.0.0          
[157] nlme_3.1-152              jsonlite_1.7.2            BiocNeighbors_1.10.0     
[160] desc_1.3.0                viridisLite_0.4.0         limma_3.48.1             
[163] fansi_0.5.0               pillar_1.6.1              lattice_0.20-44          
[166] pkgbuild_1.2.0            KEGGREST_1.32.0           fastmap_1.1.0            
[169] httr_1.4.2                survival_3.2-11           remotes_2.4.0            
[172] glue_1.4.2                png_0.1-7                 iterators_1.0.13         
[175] bit_4.0.4                 stringi_1.6.2             blob_1.2.1               
[178] DESeq2_1.32.0             BiocSingular_1.8.1        caTools_1.18.2           
[181] memoise_2.0.0             irlba_2.3.3               future.apply_1.7.0 
@HelenaLC
Copy link
Owner

HelenaLC commented Aug 4, 2021

The metadata is not used at any point by prepSim(), so this cannot be causing the issue. Could you run a traceback() after the error occurs as to pin-point what might be causing the issue?

@YiyiChen0328
Copy link
Author

Hi Helena,

Thank you for response!

#####----------1.convert Seurat object--------####

DC = readRDS("./dataForMuscat/newCD1c.rds")
DC_Diet <- DietSeurat(DC)
DefaultAssay(DC_Diet) <- "RNA"
DC.sce <- as.SingleCellExperiment(DC_Diet)

#required parameter as.factor 
DC.sce$sample_id <- as.factor(DC.sce$sample)
DC.sce$group_id <- as.factor(DC.sce$group)
DC.sce$cluster_id <- as.factor(DC.sce$DCType)

#Sce object 
DC.sce 
#output:
# class: SingleCellExperiment 
# dim: 27248 2236 
# metadata(0):
#   assays(2): counts logcounts
# rownames(27248): AL627309.1 AL669831.5 ... DUOXA2 RP11-58A18.2
# rowData names(0):
#   colnames(2236): ACAGCCGGTATGGAAT_1_1 ACGGTCGGTATTGACC_1_1 ... SA225_TGACTTTTCACAATGC_2
# SA225_TGACTTTTCCTTGCCA_2
# colData names(23): orig.ident nCount_RNA ... group_id cluster_id
# reducedDimNames(0):
#   mainExpName: RNA
# altExpNames(1): completeIntegration
table(DC.sce$sample_id) 
#output:
# 10K PBMCs 10x Genomics                 HC0547                 HC0572                 HC0701 
# 103                     36                    123                     12 
# HC0732                  SA131                  SA132                  SA138 
# 45                      7                     83                      2 
# SA139                  SA144                  SA145                  SA146 
# 48                    197                     35                    225 
# SA151                  SA159                  SA160                  SA161 
# 26                     42                      9                     82 
# SA166                  SA168                  SA172                  SA174 
# 60                    218                    180                     87 
# SA220                  SA222                  SA225                  SA227 
# 4                     48                     16                     77 
# SA271                  SA272                  SA276 
# 162                    115                    194 
table(DC.sce$group_id) 
#output:
# Healthy       Active RA RA in Remission 
# 319            1380             537 
table(DC.sce$cluster_id) 
#output:
# A    B 
# 1454  782 


#prepsim

ref <- prepSim(DC.sce, verbose = TRUE)
#output:
# Argument `group_keep` unspecified; defaulting to retaining “Healthy”-group samples.
# Filtering...
# - 4610/27248 genes and 319/2236 cells retained.
# - 1/2 subpopulations and 2/5 samples retained.
# Error in getS3method(f, signature, optional = TRUE) : 
#   length(class) == 1L is not TRUE

ref <- prepSim(DC.sce, group_keep="Healthy", verbose = TRUE)
#output:
# Filtering...
# - 4610/27248 genes and 319/2236 cells retained.
# - 1/2 subpopulations and 2/5 samples retained.
# Error in getS3method(f, signature, optional = TRUE) : 
#   length(class) == 1L is not TRUE

traceback()
#output:
# 13: stop(simpleError(msg, call = if (p <- sys.parent(1L)) sys.call(p)))
# 12: stopifnot(is.character(class), length(class) == 1L)
# 11: getS3method(f, signature, optional = TRUE)
# 10: hasS3Method("droplevels", class(xi))
# 9: FUN(X[[i]], ...)
# 8: vapply(x, canDropLevels, NA)
# 7: .local(x, ...)
# 6: droplevels(x, ...)
# 5: droplevels(x, ...)
# 4: droplevels.List(colData(x))
# 3: droplevels(colData(x))
# 2: as.data.frame(droplevels(colData(x)))
# 1: prepSim(DC.sce, verbose = TRUE)


#####---2.try create sce object, instead of convert it--------##

DefaultAssay(DC) <- "RNA"
counts <- DC@assays$RNA@counts
DC.sce <- SingleCellExperiment(list(counts=counts))

#add required parameter
DC.sce$sample_id <- as.factor(DC$sample)
DC.sce$group_id <- as.factor(DC$group)
DC.sce$cluster_id <- as.factor(DC$DCType)

#Sce object 

DC.sce 
#output:
# class: SingleCellExperiment 
# dim: 27248 2236 
# metadata(0):
#   assays(1): ''
# rownames(27248): AL627309.1 AL669831.5 ... DUOXA2 RP11-58A18.2
# rowData names(0):
#   colnames(2236): ACAGCCGGTATGGAAT_1_1 ACGGTCGGTATTGACC_1_1 ... SA225_TGACTTTTCACAATGC_2
# SA225_TGACTTTTCCTTGCCA_2
# colData names(3): sample_id group_id cluster_id
# reducedDimNames(0):
#   mainExpName: NULL
# altExpNames(0):
table(DC.sce$sample_id) 
#output:
# 10K PBMCs 10x Genomics                 HC0547                 HC0572                 HC0701 
# 103                     36                    123                     12 
# HC0732                  SA131                  SA132                  SA138 
# 45                      7                     83                      2 
# SA139                  SA144                  SA145                  SA146 
# 48                    197                     35                    225 
# SA151                  SA159                  SA160                  SA161 
# 26                     42                      9                     82 
# SA166                  SA168                  SA172                  SA174 
# 60                    218                    180                     87 
# SA220                  SA222                  SA225                  SA227 
# 4                     48                     16                     77 
# SA271                  SA272                  SA276 
#162                    115                    194 
table(DC.sce$group_id) 
#output:
# Healthy       Active RA RA in Remission 
# 319            1380             537 
table(DC.sce$cluster_id) 
#output:
# A    B 
# 1454  782 

#prepsim
prepSim(DC.sce,verbose = TRUE)
#output:
# Argument `group_keep` unspecified; defaulting to retaining “Healthy”-group samples.
# Filtering...
# - 4610/27248 genes and 319/2236 cells retained.
# - 1/2 subpopulations and 2/5 samples retained.
# Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
#   contrasts can be applied only to factors with 2 or more levels

traceback()
#output:
# 9: h(simpleError(msg, call))
# 8: .handleSimpleError(function (cond) 
#   .Internal(C_tryCatchHelper(addr, 1L, cond)), "'assay(<SingleCellExperiment>, i=\"character\", ...)' invalid subscript 'i'\n'counts' not in names(assays(<SingleCellExperiment>))", 
#   base::quote(assay(object, i = exprs_values, ...)))
# 7: stop(msg, "\n'", i, "' not in names(assays(<", class(x), ">))")
# 6: assay(object, i = exprs_values, ...)
# 5: assay(object, i = exprs_values, ...)
# 4: counts(x)
# 3: counts(x)
# 2: rowSums(counts(x) > min_count)
# 1: prepSim(DC.sce, verbose = TRUE)



#####-----3.try replace cluster_id with ctrl-------####3

DC = readRDS("./dataForMuscat/newCD1c.rds")
DC_Diet <- DietSeurat(DC)
DefaultAssay(DC_Diet) <- "RNA"
DC.sce <- as.SingleCellExperiment(DC_Diet)

#replace cluster_id

table(DC.sce$group)
#        Healthy       Active RA RA in Remission 
#319            1380             537 

df1 <- DC.sce$group %>% str_replace("Healthy","ctrl")
df1 <- df1 %>% str_replace("RA in Remission","Remission")
df1 <- df1 %>% str_replace("Active RA","RA")
DC.sce$group <- as.factor(df1)

table(DC.sce$group)
#     ctrl        RA Remission 
#319      1380       537 

#add required parameter
DC.sce$sample_id <- as.factor(DC.sce$sample)
DC.sce$group_id <- as.factor(DC.sce$group)
DC.sce$cluster_id <- as.factor(DC.sce$DCType)

#Sce object 

DC.sce 
#output:
# class: SingleCellExperiment 
# dim: 27248 2236 
# metadata(0):
#   assays(2): counts logcounts
# rownames(27248): AL627309.1 AL669831.5 ... DUOXA2 RP11-58A18.2
# rowData names(0):
#   colnames(2236): ACAGCCGGTATGGAAT_1_1 ACGGTCGGTATTGACC_1_1 ... SA225_TGACTTTTCACAATGC_2
# SA225_TGACTTTTCCTTGCCA_2
# colData names(23): orig.ident nCount_RNA ... group_id cluster_id
# reducedDimNames(0):
#   mainExpName: RNA
# altExpNames(1): completeIntegration

table(DC.sce$sample_id) 
#output:
# 10K PBMCs 10x Genomics                 HC0547                 HC0572                 HC0701 
# 103                     36                    123                     12 
# HC0732                  SA131                  SA132                  SA138 
# 45                      7                     83                      2 
# SA139                  SA144                  SA145                  SA146 
# 48                    197                     35                    225 
# SA151                  SA159                  SA160                  SA161 
# 26                     42                      9                     82 
# SA166                  SA168                  SA172                  SA174 
# 60                    218                    180                     87 
# SA220                  SA222                  SA225                  SA227 
# 4                     48                     16                     77 
# SA271                  SA272                  SA276 
# 162                    115                    194

table(DC.sce$group_id) 
#output:
# ctrl        RA Remission 
# 319      1380       537 
table(DC.sce$cluster_id) 
#output:
# A    B 
# 1454  782 

#prepsim
prepSim(DC.sce,verbose = TRUE)
#output:
# Argument `group_keep` unspecified; defaulting to retaining “ctrl”-group samples.
# Filtering...
# - 4610/27248 genes and 319/2236 cells retained.
# - 1/2 subpopulations and 2/5 samples retained.
# Error in getS3method(f, signature, optional = TRUE) : 
#   length(class) == 1L is not TRUE

traceback()
#output:
# 13: stop(simpleError(msg, call = if (p <- sys.parent(1L)) sys.call(p)))
# 12: stopifnot(is.character(class), length(class) == 1L)
# 11: getS3method(f, signature, optional = TRUE)
# 10: hasS3Method("droplevels", class(xi))
# 9: FUN(X[[i]], ...)
# 8: vapply(x, canDropLevels, NA)
# 7: .local(x, ...)
# 6: droplevels(x, ...)
# 5: droplevels(x, ...)
# 4: droplevels.List(colData(x))
# 3: droplevels(colData(x))
# 2: as.data.frame(droplevels(colData(x)))
# 1: prepSim(DC.sce, verbose = TRUE)


########---------4.pbmc Seurat dataset-----------#######
#I thought maybe it’s a problem with my Seurat object, so I tried 
#a dataset from official Seurat vignette: 
#https://satijalab.org/seurat/articles/integration_introduction.html 

# install dataset
InstallData("ifnb")

# load dataset
LoadData("ifnb")

# split the dataset into a list of two seurat objects (stim and CTRL)
ifnb.list <- SplitObject(ifnb, split.by = "stim")

# normalize and identify variable features for each dataset independently
ifnb.list <- lapply(X = ifnb.list, FUN = function(x) {
  x <- NormalizeData(x)
  x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 2000)
})

#Create a dataset with 2 condition 
pbmc2 <- merge(ifnb.list$CTRL, ifnb.list$STIM)

table(pbmc2$stim) 
#CTRL STIM  
#6548 7451  

pbmc2.sce <- as.SingleCellExperiment(pbmc2)

#add required parameters
pbmc2.sce$sample_id <- as.factor(pbmc2.sce$orig.ident)
pbmc2.sce$group_id <- as.factor(pbmc2.sce$stim)
pbmc2.sce$cluster_id <- as.factor(pbmc2.sce$seurat_annotations)

#Sce object 

pbmc2.sce 
#output:
# class: SingleCellExperiment 
# dim: 14053 13999 
# metadata(0):
#   assays(2): counts logcounts
# rownames(14053): AL627309.1 RP11-206L10.2 ... AP001062.7 LRRC3DN
# rowData names(0):
#   colnames(13999): AAACATACATTTCC.1 AAACATACCAGAAA.1 ... TTTGCATGCTAAGC.1 TTTGCATGGGACGA.1
# colData names(9): orig.ident nCount_RNA ... group_id cluster_id
# reducedDimNames(0):
#   mainExpName: RNA
# altExpNames(0):

table(pbmc2.sce$sample_id) 
#output:
# IMMUNE_CTRL IMMUNE_STIM 
# 6548        7451 
table(pbmc2.sce $group_id)
#output:
# CTRL STIM 
# 6548 7451 
table(pbmc2.sce $cluster_id) 
#output:
# B  B Activated    CD14 Mono    CD16 Mono CD4 Memory T  CD4 Naive T        CD8 T 
# 978          388         4362         1044         1762         2504          814 
# DC        Eryth           Mk           NK          pDC  T activated 
# 472           55          236          619          132          633 

#prepsim
ref <- prepSim(pbmc2.sce, verbose = TRUE) 
#output:
# Argument `group_keep` unspecified; defaulting to retaining “CTRL”-group samples.
# Filtering...
# - 4901/14053 genes and 6548/13999 cells retained.
# Error in prepSim(pbmc2.sce, verbose = TRUE) : 
#   Current 'min_size' retains only 1 sample,
# but mean-dispersion estimation requires at least 2.

traceback()
#output:
# 2: stop("Current 'min_size' retains only 1 sample,\nbut", " mean-dispersion estimation requires at least 2.")
# 1: prepSim(pbmc2.sce, verbose = TRUE)

Thank you for your patience!

Sincerely,
Yiyi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants