Bug fix for large transcriptomes: uint16 overflow (#53)
* Gene index datatype uint16 was too small

In datasets with more than 2^16 (65,536) genes, the gene index (used to create the sparse output matrix) overflowed its uint16 datatype and wrapped around, which mangled the output.
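A minimal sketch of the overflow described above, assuming a hypothetical transcriptome of 70,000 genes (the array contents and variable names are illustrative, not taken from the repository). NumPy's `astype` does not raise when an integer does not fit in the target type; it wraps modulo 2^16 for `uint16`:

```python
import numpy as np

# Hypothetical gene indices for a transcriptome with 70,000 genes.
gene_idx = np.array([0, 65_535, 65_536, 69_999], dtype=np.int64)

# Casting to uint16 silently wraps any index >= 2**16.
wrapped = gene_idx.astype(np.uint16)

# uint32 holds indices up to 2**32 - 1, so nothing wraps here.
safe = gene_idx.astype(np.uint32)

print(wrapped.tolist())  # [0, 65535, 0, 4463]
print(safe.tolist())     # [0, 65535, 65536, 69999]
```

Index 65,536 becomes 0 and index 69,999 becomes 4,463, so counts for high-index genes would be attributed to unrelated low-index genes.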
sjfleming authored Feb 5, 2020
1 parent d68bf9d commit 20bab46
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions cellbender/remove_background/model.py
@@ -653,13 +653,13 @@ def generate_maximum_a_posteriori_count_matrix(

 # Append these to their lists.
 barcodes.extend(nonzero_barcodes.astype(dtype=np.uint32))
-genes.extend(nonzero_genes.astype(dtype=np.uint16))
+genes.extend(nonzero_genes.astype(dtype=np.uint32))
 counts.extend(nonzero_counts.astype(dtype=np.uint32))

 # Convert the lists to numpy arrays.
 counts = np.array(counts, dtype=np.uint32)
 barcodes = np.array(barcodes, dtype=np.uint32)
-genes = np.array(genes, dtype=np.uint16)
+genes = np.array(genes, dtype=np.uint32)

 # Put the counts into a sparse csc_matrix.
 inferred_count_matrix = sp.csc_matrix((counts, (barcodes, genes)),
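To illustrate how a wrapped gene index can corrupt the sparse matrix, here is a small sketch under assumed inputs: two barcodes, 70,000 genes, and a hypothetical helper `build_matrix` that is not part of the repository. It mirrors only the `(counts, (barcodes, genes))` construction pattern from the diff:

```python
import numpy as np
import scipy.sparse as sp

n_barcodes, n_genes = 2, 70_000  # assumed sizes for illustration

barcodes = np.array([0, 1], dtype=np.uint32)
true_genes = np.array([5, 65_541], dtype=np.int64)  # 65_541 % 2**16 == 5
counts = np.array([3, 7], dtype=np.uint32)

def build_matrix(genes):
    """Hypothetical helper: build a barcode-by-gene csc_matrix."""
    return sp.csc_matrix((counts, (barcodes, genes)),
                         shape=(n_barcodes, n_genes))

good = build_matrix(true_genes.astype(np.uint32))
bad = build_matrix(true_genes.astype(np.uint16))  # index 65_541 wraps to 5

print(good[1, 65_541], good[1, 5])  # 7 0
print(bad[1, 65_541], bad[1, 5])    # 0 7
```

With `uint16`, the count for gene 65,541 lands in the column for gene 5. No error is raised, because the wrapped index is still a valid column.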