error in dataset simulation: length(class) == 1L is not TRUE #79

YiyiChen0328 · 2021-07-31T11:05:45Z

Hi Muscat team,

I am trying to simulate a dataset, and when I try prepSim, this error occurs:

> > ref <- prepSim(DC.sce, verbose = TRUE) 
> Argument `group_keep` unspecified; defaulting to retaining “Healthy”-group samples.
> Filtering...
> - 4610/27248 genes and 319/2236 cells retained.
> - 1/2 subpopulations and 2/5 samples retained.
> Error in getS3method(f, signature, optional = TRUE) : 
>   length(class) == 1L is not TRUE

The Dc.sce was converted from a Seurat object and the code I am using is:

DC_Diet <- DietSeurat(DC)
DefaultAssay(DC_Diet) <- "RNA"
DC.sce <- as.SingleCellExperiment(DC_Diet)

I compared the differences between Dc.sce and the example sce, the biggest difference is the $metadata.

Could you please help me figure this out?

Many thanks,
Yiyi

R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8       
 [4] LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] dplyr_1.0.7                 SingleCellExperiment_1.14.1 SummarizedExperiment_1.22.0
 [4] Biobase_2.52.0              GenomicRanges_1.44.0        GenomeInfoDb_1.28.1        
 [7] IRanges_2.26.0              S4Vectors_0.30.0            BiocGenerics_0.38.0        
[10] MatrixGenerics_1.4.0        matrixStats_0.59.0          SeuratObject_4.0.2         
[13] Seurat_4.0.3                muscat_1.7.1               

loaded via a namespace (and not attached):
  [1] utf8_1.2.1                reticulate_1.20           tidyselect_1.1.1         
  [4] lme4_1.1-27.1             RSQLite_2.2.7             AnnotationDbi_1.54.1     
  [7] htmlwidgets_1.5.3         grid_4.1.0                BiocParallel_1.26.1      
 [10] Rtsne_0.15                devtools_2.4.2            munsell_0.5.0            
 [13] ScaledMatrix_1.0.0        codetools_0.2-18          ica_1.0-2                
 [16] future_1.21.0             miniUI_0.1.1.1            withr_2.4.2              
 [19] colorspace_2.0-2          rstudioapi_0.13           ROCR_1.0-11              
 [22] tensor_1.5                listenv_0.8.0             GenomeInfoDbData_1.2.6   
 [25] polyclip_1.10-0           bit64_4.0.5               rprojroot_2.0.2          
 [28] glmmTMB_1.1.2             parallelly_1.26.1         vctrs_0.3.8              
 [31] generics_0.1.0            R6_2.5.0                  doParallel_1.0.16        
 [34] ggbeeswarm_0.6.0          clue_0.3-59               rsvd_1.0.5               
 [37] locfit_1.5-9.4            spatstat.utils_2.2-0      bitops_1.0-7             
 [40] cachem_1.0.5              DelayedArray_0.18.0       assertthat_0.2.1         
 [43] promises_1.2.0.1          scales_1.1.1              beeswarm_0.4.0           
 [46] gtable_0.3.0              beachmat_2.8.0            Cairo_1.5-12.2           
 [49] globals_0.14.0            processx_3.5.2            goftest_1.2-2            
 [52] rlang_0.4.11              genefilter_1.74.0         GlobalOptions_0.1.2      
 [55] splines_4.1.0             TMB_1.7.20                lazyeval_0.2.2           
 [58] spatstat.geom_2.2-0       broom_0.7.8               BiocManager_1.30.16      
 [61] abind_1.4-5               reshape2_1.4.4            backports_1.2.1          
 [64] httpuv_1.6.1              usethis_2.0.1             tools_4.1.0              
 [67] ggplot2_3.3.5             spatstat.core_2.2-0       ellipsis_0.3.2           
 [70] gplots_3.1.1              RColorBrewer_1.1-2        sessioninfo_1.1.1        
 [73] ggridges_0.5.3            Rcpp_1.0.7                plyr_1.8.6               
 [76] sparseMatrixStats_1.4.0   progress_1.2.2            zlibbioc_1.38.0          
 [79] purrr_0.3.4               RCurl_1.98-1.3            ps_1.6.0                 
 [82] prettyunits_1.1.1         rpart_4.1-15              deldir_0.2-10            
 [85] pbapply_1.4-3             GetoptLong_1.0.5          viridis_0.6.1            
 [88] cowplot_1.1.1             zoo_1.8-9                 ggrepel_0.9.1            
 [91] cluster_2.1.2             fs_1.5.0                  colorRamps_2.3           
 [94] variancePartition_1.22.0  magrittr_2.0.1            data.table_1.14.0        
 [97] scattermore_0.7           lmerTest_3.1-3            circlize_0.4.13          
[100] lmtest_0.9-38             RANN_2.6.1                fitdistrplus_1.1-5       
[103] pkgload_1.2.1             hms_1.1.0                 patchwork_1.1.1          
[106] mime_0.11                 xtable_1.8-4              pbkrtest_0.5.1           
[109] XML_3.99-0.6              gridExtra_2.3             shape_1.4.6              
[112] testthat_3.0.4            compiler_4.1.0            scater_1.20.1            
[115] tibble_3.1.2              KernSmooth_2.23-20        crayon_1.4.1             
[118] minqa_1.2.4               htmltools_0.5.1.1         mgcv_1.8-36              
[121] later_1.2.0               tidyr_1.1.3               geneplotter_1.70.0       
[124] DBI_1.1.1                 ComplexHeatmap_2.8.0      MASS_7.3-54              
[127] boot_1.3-28               Matrix_1.3-4              cli_3.0.0                
[130] igraph_1.2.6              pkgconfig_2.0.3           numDeriv_2016.8-1.1      
[133] spatstat.sparse_2.0-0     plotly_4.9.4.1            scuttle_1.2.0            
[136] foreach_1.5.1             annotate_1.70.0           vipor_0.4.5              
[139] blme_1.0-5                XVector_0.32.0            callr_3.7.0              
[142] stringr_1.4.0             digest_0.6.27             sctransform_0.3.2        
[145] RcppAnnoy_0.0.18          spatstat.data_2.1-0       Biostrings_2.60.1        
[148] leiden_0.3.8              uwot_0.1.10               edgeR_3.34.0             
[151] DelayedMatrixStats_1.14.0 shiny_1.6.0               gtools_3.9.2             
[154] rjson_0.2.20              nloptr_1.2.2.2            lifecycle_1.0.0          
[157] nlme_3.1-152              jsonlite_1.7.2            BiocNeighbors_1.10.0     
[160] desc_1.3.0                viridisLite_0.4.0         limma_3.48.1             
[163] fansi_0.5.0               pillar_1.6.1              lattice_0.20-44          
[166] pkgbuild_1.2.0            KEGGREST_1.32.0           fastmap_1.1.0            
[169] httr_1.4.2                survival_3.2-11           remotes_2.4.0            
[172] glue_1.4.2                png_0.1-7                 iterators_1.0.13         
[175] bit_4.0.4                 stringi_1.6.2             blob_1.2.1               
[178] DESeq2_1.32.0             BiocSingular_1.8.1        caTools_1.18.2           
[181] memoise_2.0.0             irlba_2.3.3               future.apply_1.7.0

The text was updated successfully, but these errors were encountered:

HelenaLC · 2021-08-04T06:23:04Z

The metadata is not used at any point by prepSim(), so this cannot be causing the issue. Could you run a traceback() after the error occurs as to pin-point what might be causing the issue?

YiyiChen0328 · 2021-08-10T06:04:50Z

Hi Helena,

Thank you for response!

#####----------1.convert Seurat object--------####

DC = readRDS("./dataForMuscat/newCD1c.rds")
DC_Diet <- DietSeurat(DC)
DefaultAssay(DC_Diet) <- "RNA"
DC.sce <- as.SingleCellExperiment(DC_Diet)

#required parameter as.factor 
DC.sce$sample_id <- as.factor(DC.sce$sample)
DC.sce$group_id <- as.factor(DC.sce$group)
DC.sce$cluster_id <- as.factor(DC.sce$DCType)

#Sce object 
DC.sce 
#output:
# class: SingleCellExperiment 
# dim: 27248 2236 
# metadata(0):
#   assays(2): counts logcounts
# rownames(27248): AL627309.1 AL669831.5 ... DUOXA2 RP11-58A18.2
# rowData names(0):
#   colnames(2236): ACAGCCGGTATGGAAT_1_1 ACGGTCGGTATTGACC_1_1 ... SA225_TGACTTTTCACAATGC_2
# SA225_TGACTTTTCCTTGCCA_2
# colData names(23): orig.ident nCount_RNA ... group_id cluster_id
# reducedDimNames(0):
#   mainExpName: RNA
# altExpNames(1): completeIntegration
table(DC.sce$sample_id) 
#output:
# 10K PBMCs 10x Genomics                 HC0547                 HC0572                 HC0701 
# 103                     36                    123                     12 
# HC0732                  SA131                  SA132                  SA138 
# 45                      7                     83                      2 
# SA139                  SA144                  SA145                  SA146 
# 48                    197                     35                    225 
# SA151                  SA159                  SA160                  SA161 
# 26                     42                      9                     82 
# SA166                  SA168                  SA172                  SA174 
# 60                    218                    180                     87 
# SA220                  SA222                  SA225                  SA227 
# 4                     48                     16                     77 
# SA271                  SA272                  SA276 
# 162                    115                    194 
table(DC.sce$group_id) 
#output:
# Healthy       Active RA RA in Remission 
# 319            1380             537 
table(DC.sce$cluster_id) 
#output:
# A    B 
# 1454  782 


#prepsim

ref <- prepSim(DC.sce, verbose = TRUE)
#output:
# Argument `group_keep` unspecified; defaulting to retaining “Healthy”-group samples.
# Filtering...
# - 4610/27248 genes and 319/2236 cells retained.
# - 1/2 subpopulations and 2/5 samples retained.
# Error in getS3method(f, signature, optional = TRUE) : 
#   length(class) == 1L is not TRUE

ref <- prepSim(DC.sce, group_keep="Healthy", verbose = TRUE)
#output:
# Filtering...
# - 4610/27248 genes and 319/2236 cells retained.
# - 1/2 subpopulations and 2/5 samples retained.
# Error in getS3method(f, signature, optional = TRUE) : 
#   length(class) == 1L is not TRUE

traceback()
#output:
# 13: stop(simpleError(msg, call = if (p <- sys.parent(1L)) sys.call(p)))
# 12: stopifnot(is.character(class), length(class) == 1L)
# 11: getS3method(f, signature, optional = TRUE)
# 10: hasS3Method("droplevels", class(xi))
# 9: FUN(X[[i]], ...)
# 8: vapply(x, canDropLevels, NA)
# 7: .local(x, ...)
# 6: droplevels(x, ...)
# 5: droplevels(x, ...)
# 4: droplevels.List(colData(x))
# 3: droplevels(colData(x))
# 2: as.data.frame(droplevels(colData(x)))
# 1: prepSim(DC.sce, verbose = TRUE)


#####---2.try create sce object, instead of convert it--------##

DefaultAssay(DC) <- "RNA"
counts <- DC@assays$RNA@counts
DC.sce <- SingleCellExperiment(list(counts=counts))

#add required parameter
DC.sce$sample_id <- as.factor(DC$sample)
DC.sce$group_id <- as.factor(DC$group)
DC.sce$cluster_id <- as.factor(DC$DCType)

#Sce object 

DC.sce 
#output:
# class: SingleCellExperiment 
# dim: 27248 2236 
# metadata(0):
#   assays(1): ''
# rownames(27248): AL627309.1 AL669831.5 ... DUOXA2 RP11-58A18.2
# rowData names(0):
#   colnames(2236): ACAGCCGGTATGGAAT_1_1 ACGGTCGGTATTGACC_1_1 ... SA225_TGACTTTTCACAATGC_2
# SA225_TGACTTTTCCTTGCCA_2
# colData names(3): sample_id group_id cluster_id
# reducedDimNames(0):
#   mainExpName: NULL
# altExpNames(0):
table(DC.sce$sample_id) 
#output:
# 10K PBMCs 10x Genomics                 HC0547                 HC0572                 HC0701 
# 103                     36                    123                     12 
# HC0732                  SA131                  SA132                  SA138 
# 45                      7                     83                      2 
# SA139                  SA144                  SA145                  SA146 
# 48                    197                     35                    225 
# SA151                  SA159                  SA160                  SA161 
# 26                     42                      9                     82 
# SA166                  SA168                  SA172                  SA174 
# 60                    218                    180                     87 
# SA220                  SA222                  SA225                  SA227 
# 4                     48                     16                     77 
# SA271                  SA272                  SA276 
#162                    115                    194 
table(DC.sce$group_id) 
#output:
# Healthy       Active RA RA in Remission 
# 319            1380             537 
table(DC.sce$cluster_id) 
#output:
# A    B 
# 1454  782 

#prepsim
prepSim(DC.sce,verbose = TRUE)
#output:
# Argument `group_keep` unspecified; defaulting to retaining “Healthy”-group samples.
# Filtering...
# - 4610/27248 genes and 319/2236 cells retained.
# - 1/2 subpopulations and 2/5 samples retained.
# Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
#   contrasts can be applied only to factors with 2 or more levels

traceback()
#output:
# 9: h(simpleError(msg, call))
# 8: .handleSimpleError(function (cond) 
#   .Internal(C_tryCatchHelper(addr, 1L, cond)), "'assay(<SingleCellExperiment>, i=\"character\", ...)' invalid subscript 'i'\n'counts' not in names(assays(<SingleCellExperiment>))", 
#   base::quote(assay(object, i = exprs_values, ...)))
# 7: stop(msg, "\n'", i, "' not in names(assays(<", class(x), ">))")
# 6: assay(object, i = exprs_values, ...)
# 5: assay(object, i = exprs_values, ...)
# 4: counts(x)
# 3: counts(x)
# 2: rowSums(counts(x) > min_count)
# 1: prepSim(DC.sce, verbose = TRUE)



#####-----3.try replace cluster_id with ctrl-------####3

DC = readRDS("./dataForMuscat/newCD1c.rds")
DC_Diet <- DietSeurat(DC)
DefaultAssay(DC_Diet) <- "RNA"
DC.sce <- as.SingleCellExperiment(DC_Diet)

#replace cluster_id

table(DC.sce$group)
#        Healthy       Active RA RA in Remission 
#319            1380             537 

df1 <- DC.sce$group %>% str_replace("Healthy","ctrl")
df1 <- df1 %>% str_replace("RA in Remission","Remission")
df1 <- df1 %>% str_replace("Active RA","RA")
DC.sce$group <- as.factor(df1)

table(DC.sce$group)
#     ctrl        RA Remission 
#319      1380       537 

#add required parameter
DC.sce$sample_id <- as.factor(DC.sce$sample)
DC.sce$group_id <- as.factor(DC.sce$group)
DC.sce$cluster_id <- as.factor(DC.sce$DCType)

#Sce object 

DC.sce 
#output:
# class: SingleCellExperiment 
# dim: 27248 2236 
# metadata(0):
#   assays(2): counts logcounts
# rownames(27248): AL627309.1 AL669831.5 ... DUOXA2 RP11-58A18.2
# rowData names(0):
#   colnames(2236): ACAGCCGGTATGGAAT_1_1 ACGGTCGGTATTGACC_1_1 ... SA225_TGACTTTTCACAATGC_2
# SA225_TGACTTTTCCTTGCCA_2
# colData names(23): orig.ident nCount_RNA ... group_id cluster_id
# reducedDimNames(0):
#   mainExpName: RNA
# altExpNames(1): completeIntegration

table(DC.sce$sample_id) 
#output:
# 10K PBMCs 10x Genomics                 HC0547                 HC0572                 HC0701 
# 103                     36                    123                     12 
# HC0732                  SA131                  SA132                  SA138 
# 45                      7                     83                      2 
# SA139                  SA144                  SA145                  SA146 
# 48                    197                     35                    225 
# SA151                  SA159                  SA160                  SA161 
# 26                     42                      9                     82 
# SA166                  SA168                  SA172                  SA174 
# 60                    218                    180                     87 
# SA220                  SA222                  SA225                  SA227 
# 4                     48                     16                     77 
# SA271                  SA272                  SA276 
# 162                    115                    194

table(DC.sce$group_id) 
#output:
# ctrl        RA Remission 
# 319      1380       537 
table(DC.sce$cluster_id) 
#output:
# A    B 
# 1454  782 

#prepsim
prepSim(DC.sce,verbose = TRUE)
#output:
# Argument `group_keep` unspecified; defaulting to retaining “ctrl”-group samples.
# Filtering...
# - 4610/27248 genes and 319/2236 cells retained.
# - 1/2 subpopulations and 2/5 samples retained.
# Error in getS3method(f, signature, optional = TRUE) : 
#   length(class) == 1L is not TRUE

traceback()
#output:
# 13: stop(simpleError(msg, call = if (p <- sys.parent(1L)) sys.call(p)))
# 12: stopifnot(is.character(class), length(class) == 1L)
# 11: getS3method(f, signature, optional = TRUE)
# 10: hasS3Method("droplevels", class(xi))
# 9: FUN(X[[i]], ...)
# 8: vapply(x, canDropLevels, NA)
# 7: .local(x, ...)
# 6: droplevels(x, ...)
# 5: droplevels(x, ...)
# 4: droplevels.List(colData(x))
# 3: droplevels(colData(x))
# 2: as.data.frame(droplevels(colData(x)))
# 1: prepSim(DC.sce, verbose = TRUE)


########---------4.pbmc Seurat dataset-----------#######
#I thought maybe it’s a problem with my Seurat object, so I tried 
#a dataset from official Seurat vignette: 
#https://satijalab.org/seurat/articles/integration_introduction.html 

# install dataset
InstallData("ifnb")

# load dataset
LoadData("ifnb")

# split the dataset into a list of two seurat objects (stim and CTRL)
ifnb.list <- SplitObject(ifnb, split.by = "stim")

# normalize and identify variable features for each dataset independently
ifnb.list <- lapply(X = ifnb.list, FUN = function(x) {
  x <- NormalizeData(x)
  x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 2000)
})

#Create a dataset with 2 condition 
pbmc2 <- merge(ifnb.list$CTRL, ifnb.list$STIM)

table(pbmc2$stim) 
#CTRL STIM  
#6548 7451  

pbmc2.sce <- as.SingleCellExperiment(pbmc2)

#add required parameters
pbmc2.sce$sample_id <- as.factor(pbmc2.sce$orig.ident)
pbmc2.sce$group_id <- as.factor(pbmc2.sce$stim)
pbmc2.sce$cluster_id <- as.factor(pbmc2.sce$seurat_annotations)

#Sce object 

pbmc2.sce 
#output:
# class: SingleCellExperiment 
# dim: 14053 13999 
# metadata(0):
#   assays(2): counts logcounts
# rownames(14053): AL627309.1 RP11-206L10.2 ... AP001062.7 LRRC3DN
# rowData names(0):
#   colnames(13999): AAACATACATTTCC.1 AAACATACCAGAAA.1 ... TTTGCATGCTAAGC.1 TTTGCATGGGACGA.1
# colData names(9): orig.ident nCount_RNA ... group_id cluster_id
# reducedDimNames(0):
#   mainExpName: RNA
# altExpNames(0):

table(pbmc2.sce$sample_id) 
#output:
# IMMUNE_CTRL IMMUNE_STIM 
# 6548        7451 
table(pbmc2.sce $group_id)
#output:
# CTRL STIM 
# 6548 7451 
table(pbmc2.sce $cluster_id) 
#output:
# B  B Activated    CD14 Mono    CD16 Mono CD4 Memory T  CD4 Naive T        CD8 T 
# 978          388         4362         1044         1762         2504          814 
# DC        Eryth           Mk           NK          pDC  T activated 
# 472           55          236          619          132          633 

#prepsim
ref <- prepSim(pbmc2.sce, verbose = TRUE) 
#output:
# Argument `group_keep` unspecified; defaulting to retaining “CTRL”-group samples.
# Filtering...
# - 4901/14053 genes and 6548/13999 cells retained.
# Error in prepSim(pbmc2.sce, verbose = TRUE) : 
#   Current 'min_size' retains only 1 sample,
# but mean-dispersion estimation requires at least 2.

traceback()
#output:
# 2: stop("Current 'min_size' retains only 1 sample,\nbut", " mean-dispersion estimation requires at least 2.")
# 1: prepSim(pbmc2.sce, verbose = TRUE)

Thank you for your patience!

Sincerely,
Yiyi

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

error in dataset simulation: length(class) == 1L is not TRUE #79

error in dataset simulation: length(class) == 1L is not TRUE #79

YiyiChen0328 commented Jul 31, 2021

HelenaLC commented Aug 4, 2021

YiyiChen0328 commented Aug 10, 2021

error in dataset simulation: length(class) == 1L is not TRUE #79

error in dataset simulation: length(class) == 1L is not TRUE #79

Comments

YiyiChen0328 commented Jul 31, 2021

HelenaLC commented Aug 4, 2021

YiyiChen0328 commented Aug 10, 2021