-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ValueError: need at least one array to concatenate #7
Comments
I am also getting the same error, running: ValueError: need at least one array to concatenate
|
Hi, thanks for reporting, I'll look into it. |
Hi Gregor - thanks for looking into this! Stacktrace below: /venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/site-packages/anndata/c ompat/__init__.py:180: FutureWarning: Moving element from .uns['neighbors']['dis tances'] to .obsp['distances'].
This is where adjacency matrices should go now.
warn(
INFO:root:Extracted GTF attributes: ['gene_id', 'gene_name']
WARNING: GTF file misses annotation for 13 genes in adata.
WARNING: Skipped 13 genes because of duplicate identifiers in GTF file.
... storing 'orig.ident' as categorical
... storing 'Phase' as categorical
... storing 'seurat_clusters' as categorical
... storing 'condition' as categorical
... storing 'patientID' as categorical
... storing 'DF' as categorical
... storing 'chromosome' as categorical
... storing 'gene_id' as categorical
... storing 'gene_name' as categorical
WARNING: Skipped 26 genes because they don't have a genomic position annotated.
/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/site-packages/numpy/lib /arraysetops.py:583: FutureWarning: elementwise comparison failed; returning sca lar instead, but in the future will perform elementwise comparison
mask |= (ar1 == a)
Traceback (most recent call last):
File "run_infercnv.py", line 18, in <module>
infercnvpy.tl.infercnv(seu_epi, reference_key='seurat_clusters', reference_c at=[0,1], window_size=200)
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/site-packages/i nfercnvpy/tl/_infercnv.py", line 164, in infercnv
reference = _get_reference(tmp_adata, reference_key, reference_cat, referenc e)
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/site-packages/i nfercnvpy/tl/_infercnv.py", line 305, in _get_reference
raise ValueError(
ValueError: The following reference categories were not found in adata.obs[refer ence_key]: [0 1]
(/venvs/anaconda/envs/groupenvs/infercnv-py) sgrimes@ikura:C03_210423_inferCNV_p yTest$ python run_infercnv.py
/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/site-packages/anndata/c ompat/__init__.py:180: FutureWarning: Moving element from .uns['neighbors']['dis tances'] to .obsp['distances'].
This is where adjacency matrices should go now.
warn(
INFO:root:Extracted GTF attributes: ['gene_id', 'gene_name']
WARNING: GTF file misses annotation for 13 genes in adata.
WARNING: Skipped 13 genes because of duplicate identifiers in GTF file.
... storing 'orig.ident' as categorical
... storing 'Phase' as categorical
... storing 'seurat_clusters' as categorical
... storing 'condition' as categorical
... storing 'patientID' as categorical
... storing 'DF' as categorical
... storing 'chromosome' as categorical
... storing 'gene_id' as categorical
... storing 'gene_name' as categorical
WARNING: Skipped 26 genes because they don't have a genomic position annotated.
0%| | 0/2 [00:11<?, ?it/s]
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/concurrent/futures/process.py", line 198, in _process_chunk
return [fn(*args) for args in chunk]
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/concurrent/futures/process.py", line 198, in <listcomp>
return [fn(*args) for args in chunk]
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/site-packages/infercnvpy/tl/_infercnv.py", line 353, in _infercnv_chunk
chr_pos, x_smoothed = _running_mean_by_chromosome(
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/site-packages/infercnvpy/tl/_infercnv.py", line 273, in _running_mean_by_chromosome
return chr_start_pos, np.hstack(running_means)
File "<__array_function__ internals>", line 5, in hstack
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/site-packages/numpy/core/shape_base.py", line 346, in hstack
return _nx.concatenate(arrs, 1)
File "<__array_function__ internals>", line 5, in concatenate
ValueError: need at least one array to concatenate
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "run_infercnv.py", line 18, in <module>
infercnvpy.tl.infercnv(seu_epi, reference_key='seurat_clusters', reference_cat=['0','1'], window_size=200)
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/site-packages/infercnvpy/tl/_infercnv.py", line 169, in infercnv
*process_map(
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/site-packages/tqdm/contrib/concurrent.py", line 130, in process_map
return _executor_map(ProcessPoolExecutor, fn, *iterables, **tqdm_kwargs)
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/site-packages/tqdm/contrib/concurrent.py", line 76, in _executor_map
return list(tqdm_class(ex.map(fn, *iterables, **map_args), **kwargs))
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/site-packages/tqdm/std.py", line 1178, in __iter__
for obj in iterable:
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/concurrent/futures/process.py", line 484, in _chain_from_iterable_of_lists
for element in iterable:
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/concurrent/futures/_base.py", line 611, in result_iterator
yield fs.pop().result()
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/venvs/anaconda/envs/groupenvs/infercnv-py/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
ValueError: need at least one array to concatenate |
In case it factors into the troubleshooting, my AnnData object started out as a Seurat 3.x object, updated to Seurat 4.0, then converted to hd5, then to AnnData. |
Can you please post the result of adata.obs["chromosome"].unique() right before running |
I assume you mean adata.var["chromosome"].unique()? Results below. I see that there is a nan entry, probably due to the duplicate genes being removed. Is that the issue? seu_epi.var["chromosome"].unique() |
Thanks! The problem is that I expected all chromosomes (as opposed to scaffolds) to start with infercnvpy/infercnvpy/tl/_infercnv.py Lines 255 to 257 in c5bd39b
What GTF file are you using? I only tried with GENCODE I think. |
Ah, glad it's an easy explanation! I'm using the gtf file provided for cellranger by 10X genomics. I think it is ensembl. The first few lines below: #!genome-build GRCh38.p12 #!genome-version GRCh38 #!genome-date 2013-12 #!genome-build-accession NCBI:GCA_000001405.27 #!genebuild-last-updated 2018-01 1 havana gene 29554 31109 . + . gene_id "ENSG00000243485"; gene_version "5"; gene_name "MIR1302-2HG"; gene_source "havana"; gene_biotype "lincRNA" 1 havana transcript 29554 31097 . + . gene_id "ENSG00000243485"; gene_version "5"; transcript_id "ENST00000473358"; transcript_version "1"; gene_name "MIR1302-2HG"; gene_source "havana"; gene_biotype "lincRNA"; transcript_name "MIR1302-2HG-202"; transcript_source "havana"; transcript_biotype "lincRNA"; tag "basic"; transcript_support_level "5" 1 havana exon 29554 30039 . + . gene_id "ENSG00000243485"; gene_version "5"; transcript_id "ENST00000473358"; transcript_version "1"; exon_number "1"; gene_name "MIR1302-2HG"; gene_source "havana"; gene_biotype "lincRNA"; transcript_name "MIR1302-2HG-202"; transcript_source "havana"; transcript_biotype "lincRNA"; exon_id "ENSE00001947070"; exon_version "1"; tag "basic"; transcript_support_level "5" 1 havana exon 30564 30667 . + . gene_id "ENSG00000243485"; gene_version "5"; transcript_id "ENST00000473358"; transcript_version "1"; exon_number "2"; gene_name "MIR1302-2HG"; gene_source "havana"; gene_biotype "lincRNA"; transcript_name "MIR1302-2HG-202"; transcript_source "havana"; transcript_biotype "lincRNA"; exon_id "ENSE00001922571"; exon_version "1"; tag "basic"; transcript_support_level "5" 1 havana exon 30976 31097 . + . gene_id "ENSG00000243485"; gene_version "5"; transcript_id "ENST00000473358"; transcript_version "1"; exon_number "3"; gene_name "MIR1302-2HG"; gene_source "havana"; gene_biotype "lincRNA"; transcript_name "MIR1302-2HG-202"; transcript_source "havana"; transcript_biotype "lincRNA"; exon_id "ENSE00001827679"; exon_version "1"; tag "basic"; transcript_support_level "5" 1 havana transcript 30267 31109 . + . gene_id "ENSG00000243485"; gene_version "5"; transcript_id "ENST00000469289"; transcript_version "1"; gene_name "MIR1302-2HG"; gene_source "havana"; gene_biotype "lincRNA"; transcript_name "MIR1302-2HG-201"; transcript_source "havana"; transcript_biotype "lincRNA"; tag "basic"; transcript_support_level "5" 1 havana exon 30267 30667 . + . gene_id "ENSG00000243485"; gene_version "5"; transcript_id "ENST00000469289"; transcript_version "1"; exon_number "1"; gene_name "MIR1302-2HG"; gene_source "havana"; gene_biotype "lincRNA"; transcript_name "MIR1302-2HG-201"; transcript_source "havana"; transcript_biotype "lincRNA"; exon_id "ENSE00001841699"; exon_version "1"; tag "basic"; transcript_support_level "5" |
Ok, I'll look into improving the In the meanwhile, you could try a GENCODE file, or just modify the |
ok got it. I will try one of those two things. Thanks for being so responsive looking into this! |
In the meantime a small and dirty trick: Then it worked for me. |
Hi,
I am trying to use infercnvpy on my single-cell dataset, but when I run:
cnv.tl.infercnv(
adata,
reference_key="leiden_1.0",
reference_cat='12',
)
I get the error message: ValueError: need at least one array to concatenate.
When I run:
adata.var.loc[:, ["gene_ids", "chromosome", "start", "end"]].head()
I can see that these columns were successfully added to adata.var
When I run:
adata.obs.head()
I can see that there is a column, 'leiden_1.0' with clusters numbered '0'-'16'.
I'm not sure what this error message means or what I can do differently to fix it.
Thank you
The text was updated successfully, but these errors were encountered: