Releases · scverse/scirpy

09 Jun 06:10

grst

v0.13.0

ec9b894

v0.13.0 - new data structure based on awkward arrays

This release introduces a new way to store AIRR data in AnnData.obsm using awkward arrays.
This change entails several backwards-incompatible changes to the scirpy workflow.

Please read the release notes for more details.
For more information about the new data structure, please see the respective section in the documentation.

26 Apr 04:58

grst

v0.12.2

678d0ca

v0.12.2

Fix IEDB data loader after update of IEDB data formats (backport of #401)

07 Apr 06:58

grst

v0.13.0rc1

d8ec147

v0.13.0rc1 - new data structure based on awkward arrays

Pre-release

This update introduces a new data structure based on awkward arrays.
The new data structure is described in more detail in the documentation and is considered the "official" way of representing AIRR data for scverse core and ecosystem packages.

Benefits of the new data structure include:

a more natural, lossless representation of AIRR Rearrangement data
separation of AIRR data and the receptor model, thereby getting rid of previous limitations (e.g. "only productive chains") and enabling other use-cases (e.g. spatial AIRR data) in the future.
clean adata.obs as AIRR data is not expanded into columns
support for MuData for working with paired gene expression and AIRR data as separate modalities.
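
To illustrate the first two benefits, here is a minimal plain-Python sketch. It does not use scirpy or awkward: nested lists stand in for the awkward array, and the barcodes, sequences, and the helper flatten_to_columns are made up for illustration. It contrasts a ragged per-cell list of chain records, which keeps every chain, with an expansion into fixed columns, which drops chains that do not fit.

```python
# Illustrative only: plain Python lists stand in for the awkward array
# that is stored in adata.obsm. All names and values are hypothetical.

# One entry per cell; each cell holds a variable number of chain records.
cells = {
    "cell_a": [
        {"locus": "TRA", "junction_aa": "CAVRDSNYQLIW", "productive": True},
        {"locus": "TRB", "junction_aa": "CASSLGQAYEQYF", "productive": True},
        {"locus": "TRB", "junction_aa": "CASSPTGGYEQYF", "productive": False},
    ],
    "cell_b": [],  # a cell without any receptor chain
}


def flatten_to_columns(chains, max_per_locus=1):
    """Expand chains into fixed columns, keeping only productive chains.

    Mimics a column-per-field layout: non-productive chains and chains
    beyond `max_per_locus` are lost.
    """
    row = {}
    kept = {}
    for chain in chains:
        if not chain["productive"]:
            continue
        n = kept.get(chain["locus"], 0)
        if n >= max_per_locus:
            continue
        kept[chain["locus"]] = n + 1
        row[f"{chain['locus']}_{n + 1}_junction_aa"] = chain["junction_aa"]
    return row


flat = {cell_id: flatten_to_columns(chains) for cell_id, chains in cells.items()}
n_ragged = sum(len(chains) for chains in cells.values())  # chains kept in the ragged layout
n_flat = sum(len(row) for row in flat.values())  # chains surviving the flat layout
print(n_ragged, n_flat)
```

In the ragged layout, the decision about which chains matter is deferred to a later step instead of being made at import time.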

The overall workflow stays the same; however, this update required several backwards-incompatible changes, which are summarized below.

Backwards-incompatible changes

New data structure

Closes issue #327.

Changed behavior:

There are no "has_ir" and "multichain" columns in adata.obs anymore.
By default all fields are imported from AIRR rearrangement and 10x data.
The restriction that all chains added to an AirrCell must have the same fields has been removed. Missing fields are automatically filled with missing values.
io.upgrade_schema can update from v0.7 to v0.13 schema. AnnData objects generated with scirpy <= 0.6.x cannot be read anymore.
pl.spectratype now has a chain attribute and the meaning of the cdr3_col attribute has changed.
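
The relaxed field requirement for AirrCell can be pictured with a small sketch. This is not scirpy's implementation: fill_missing_fields is a hypothetical helper, the chain records are made up, and None stands in for whatever missing-value marker the library actually uses.

```python
def fill_missing_fields(chains):
    """Return copies of `chains` that all share the same set of fields.

    Fields absent from a chain are filled with None (assumed placeholder).
    """
    # Collect the union of field names, preserving first-seen order.
    all_fields = []
    for chain in chains:
        for field in chain:
            if field not in all_fields:
                all_fields.append(field)
    return [{field: chain.get(field) for field in all_fields} for chain in chains]


# Two chains with different field sets (toy values).
chains = [
    {"locus": "TRA", "junction_aa": "CAVRDSNYQLIW"},
    {"locus": "TRB", "junction_aa": "CASSLGQAYEQYF", "consensus_count": 12},
]
harmonized = fill_missing_fields(chains)
print(harmonized[0])
```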

New functions:

pp.index_chains
pp.merge_chains

Removed functions:

pp.merge_with_ir
pp.merge_airr_chains

API supporting MuData

Closes issue #383

All functions take (where applicable) the following additional, optional keyword arguments:

airr_mod: the modality in MuData that contains AIRR information (default: "airr")
airr_key: the slot in adata.obsm that contains AIRR rearrangement data (default: "airr")
chain_idx_key: the slot in adata.obsm that contains indices specifying which chains in adata.obsm[airr_key] are the primary/secondary chains etc.
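
How the three keys relate can be sketched in plain Python. Everything below is an assumption made for illustration: a nested dict stands in for a MuData object, "chain_indices" is a slot name chosen for this sketch, the index records are laid out as {"VJ": [...], "VDJ": [...]}, and primary_chains is a hypothetical helper rather than a scirpy function.

```python
def primary_chains(mdata, airr_mod="airr", airr_key="airr", chain_idx_key="chain_indices"):
    """Look up the primary VJ and VDJ chain of every cell.

    `mdata` is a nested dict standing in for a MuData object:
    mdata[modality]["obsm"][key]. The layout of the index records
    is an assumption made for this sketch.
    """
    obsm = mdata[airr_mod]["obsm"]
    result = []
    for chains, idx in zip(obsm[airr_key], obsm[chain_idx_key]):
        # The first index of each list points at the primary chain, if any.
        vj = chains[idx["VJ"][0]] if idx["VJ"] else None
        vdj = chains[idx["VDJ"][0]] if idx["VDJ"] else None
        result.append((vj, vdj))
    return result


mdata = {
    "airr": {
        "obsm": {
            # Ragged chain records, one list per cell (toy values).
            "airr": [
                [
                    {"locus": "TRB", "junction_aa": "CASSLGQAYEQYF"},
                    {"locus": "TRA", "junction_aa": "CAVRDSNYQLIW"},
                ],
                [],
            ],
            # Per-cell positions into the chain list above.
            "chain_indices": [
                {"VJ": [1], "VDJ": [0]},
                {"VJ": [], "VDJ": []},
            ],
        }
    }
}
pairs = primary_chains(mdata)
print(pairs[0][0]["locus"], pairs[0][1]["locus"])
```

Keeping the indices separate from the chain records means the receptor model can change without touching the imported AIRR data.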

New class:

util.DataHandler

Updated example datasets

The example datasets now use the new data structure and are based on MuData.

The example datasets have been regenerated from scratch using the loader notebooks described in the docstring. The Maynard dataset gene expression is now based on values generated with Salmon instead of RSEM/featurecounts.
Scirpy now uses pooch to manage example datasets.

Cleanup

Removed the deprecated functions io.from_tcr_objs, io.from_ir_objs, io.to_ir_objs, pp.merge_with_tcr, pp.tcr_neighbors, pp.ir_neighbors, tl.chain_pairing
Removed the deprecated classes TcrCell, AirrChain, TcrChain
Removed the function pl.cdr_convergence which was never public anyway.

Additions

Easy-access functions (`scirpy.get`)

Closes issue #184

New functions:

get.airr
get.obs_context
get.airr_context

Fixes

Several type hints that were previously inaccurate are now updated.
Fix x-axis labelling in pl.clonotype_overlap; the function now raises an error if row annotations are not unique for each group.

Documentation

The documentation has been updated to reflect the changes described above, in particular the tutorials and the page about the data structure.

Other changes

The minimum required Python version is now 3.8 (#381)
Increased the minimum version of tqdm to 4.63 (See tqdm/tqdm#1082)
pl.repertoire_overlap now always runs tl.repertoire_overlap internally and doesn't rely on cached values.
The mode dendro_only in pl.repertoire_overlap has been removed.
Cells that have a receptor but no CDR3 sequence previously received a separate clonotype in tl.define_clonotypes. They now receive no clonotype (i.e. np.nan), as do cells without a receptor.
With inplace=False, the function tl.clonal_expansion now returns a pd.Series instead of a np.array.
Removed deprecation for clonotype_imbalanced, see #330
The group_abundance tool and plotting function used has_ir as a default group, as we could previously rely on this column being present. With the new data structure, this is no longer the case. To avoid breaking old code, the has_ir column is temporarily added when requested. The group_abundance function will have to be rewritten entirely in the future, see #232
In pl.spectratype, the parameter groupby has been replaced by chain.
We now use isort to organize imports.
Static typing has been improved internally (using pylance). It's not perfectly consistent yet, but we will keep working on this in the future.
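
The changed clonotype handling for cells without a CDR3 sequence, listed above, can be pictured with a small sketch. It is a simplification, not the logic of tl.define_clonotypes: assign_clonotypes is a hypothetical helper, cells share a clonotype only if their (made-up) CDR3 sequences are identical, and math.nan plays the role of np.nan.

```python
import math


def assign_clonotypes(cdr3_by_cell):
    """Map each cell to a clonotype ID; cells without a CDR3 get NaN.

    Simplification: cells share a clonotype only if their CDR3
    sequences are identical.
    """
    ids = {}
    result = {}
    for cell_id, cdr3 in cdr3_by_cell.items():
        if not cdr3:  # None (no receptor) or "" (receptor without CDR3)
            result[cell_id] = math.nan
            continue
        # Reuse the ID of a known sequence, otherwise hand out the next one.
        result[cell_id] = ids.setdefault(cdr3, len(ids))
    return result


clonotypes = assign_clonotypes(
    {
        "cell_a": "CASSLGQAYEQYF",
        "cell_b": "CASSLGQAYEQYF",
        "cell_c": "",    # receptor present, but no CDR3 sequence
        "cell_d": None,  # no receptor at all
    }
)
print(clonotypes)
```

Both kinds of cells end up with the same missing value, so downstream code only needs a single check for "no clonotype".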

07 Apr 06:05

grst

v0.12.1

c56897c

v0.12.1

Fixes

Bump min Python version to 3.8; CI update by @grst in #381
Temporarily pin pandas < 2 in #390

Other Changes

update pre-commit CI

Contributors

grst

27 Jan 13:50

grst

v0.12.0

ba45166

v0.12.0

New Features

Download IEDB and process it into an AnnData object by @ausserh in #377

Fixes

Fix working with subplots (#378) by @grst in #379

Documentation

Fix typos in IR query by @Zethson in #374
Fix a bunch of typos in the docs by @grst in #375

Internal changes

Fix CI by @grst in #376

New Contributors

@Zethson made their first contribution in #374
@ausserh made their first contribution in #377

Full Changelog: v0.11.2...v0.12.0

Contributors

grst, Zethson, and ausserh

20 Nov 12:36

grst

v0.11.2

5b381ad

v0.11.2

Fixes

Excluded broken python-igraph version (#366)

18 Aug 12:56

grst

v0.11.1

831f817

v0.11.1

Fixes

Solve incompatibility with scipy v1.9.0 (#360)

Internal changes

do not autodeploy docs via CI (currently broken)
updated patched version of scikit-learn

05 Jul 09:55

grst

v0.11.0

2c23901

v0.11.0

Additions

Add data loader for BD Rhapsody single-cell immune-cell receptor data (io.read_bd_rhapsody) (#351)

Fixes

Fix type conversions in from_dandelion (#349).
Update minimal dandelion version

Documentation

Rebranding to scverse (#324, #326)
Add issue templates
Fix IMGT typos (#344 by @emjbishop)

Internal changes

Bump default CI python version to 3.9
Use patched version of scikit-bio in CI until scikit-bio/scikit-bio#1813 gets merged

Contributors

emjbishop

22 Nov 09:43

grst

v0.10.1

c8cf8e9

v0.10.1

Fixes

Fix bug in cellranger import (#310 by @ddemaeyer)
Fix that VDJDB download failed when cache dir was not present (#311)
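
The general pattern behind the cache directory fix can be sketched as follows. This is not the code from #311: write_to_cache is a hypothetical helper and the file name and payload are placeholders. It only shows the common approach of creating the directory before writing into it.

```python
import os
import tempfile


def write_to_cache(cache_dir, filename, payload):
    """Write `payload` (bytes) into `cache_dir`, creating the directory first."""
    # exist_ok=True makes this a no-op when the directory already exists.
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, filename)
    with open(path, "wb") as fh:
        fh.write(payload)
    return path


with tempfile.TemporaryDirectory() as tmp:
    # A nested directory that does not exist before the call.
    cache_dir = os.path.join(tmp, "does", "not", "exist", "yet")
    path = write_to_cache(cache_dir, "vdjdb.bin", b"placeholder")
    cache_file_written = os.path.isfile(path)
print(cache_file_written)
```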

Contributors

ddemaeyer

15 Nov 19:51

grst

v0.10.0

76233c5

v0.10.0

Additions

This release adds a new feature to query reference databases (#298) comprising

an extension of pp.ir_dist to compute distances to a reference dataset,
tl.ir_query, to match immune receptors to a reference database based on the distances computed with ir_dist,
tl.ir_query_annotate and tl.ir_query_annotate_df to annotate cells based on the result of tl.ir_query, and
datasets.vdjdb which conveniently downloads and processes the latest version of VDJDB.
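
The idea behind the steps above (compute distances to a reference, match within a cutoff, annotate cells) can be sketched in plain Python. This is a conceptual illustration, not scirpy's implementation: the Hamming distance, the cutoff of 1, the helpers hamming and query_reference, and all sequences and epitope labels are assumptions made for this sketch.

```python
def hamming(a, b):
    """Number of mismatching positions; None if the lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def query_reference(query, reference, cutoff=1):
    """Return, per query cell, the reference annotations within `cutoff`."""
    hits = {}
    for cell_id, cdr3 in query.items():
        matched = []
        for ref_cdr3, annotation in reference:
            dist = hamming(cdr3, ref_cdr3)
            if dist is not None and dist <= cutoff:
                matched.append(annotation)
        hits[cell_id] = matched
    return hits


# Toy reference database: (CDR3 sequence, annotation).
reference = [("CASSLGQAYEQYF", "epitope_1"), ("CASSPTGGYEQYF", "epitope_2")]
query = {
    "cell_a": "CASSLGQAYEQYF",  # exact match to the first reference entry
    "cell_b": "CASSLGQAYEQFF",  # one mismatch to the first reference entry
    "cell_c": "CAVRDSNYQLIW",   # different length, no match
}
hits = query_reference(query, reference)
print(hits)
```

Separating the distance computation from the matching and annotation steps mirrors the split between pp.ir_dist, tl.ir_query and tl.ir_query_annotate described above.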

Fixes

Bump minimal dependencies for networkx and tqdm (#300)
Fix issue with repertoire_overlap (Fix #302 via #305)
Fix issue with define_clonotype_clusters (Fix #303 via #305)
Suppress FutureWarnings from pandas in tutorials (#307)

Internal changes

Update sphinx to >= 4.1 (#306)
Update black version
Update the internal folder structure: tl, pp etc. are now real packages instead of aliases
