Commit
Implement scverse data structure (#356)
* Create Awkward AnnData instead of putting everything in obs
* add todo
* Get chain indices for primary and secondary chains
* WIP get module
* Implement ir.get.airr
* Clean up AirrCell
* WIP restructure IO module
* fix imports
* Add helper function for unit tests
* tl.chain_qc successfully runs on the new datastructure
* Update convert anndata
* switch to obsm-based data structure
* update get module
* Update anndata schema check and _make_adata util function.
* fix _make_adata
* update fixtures
* Fix a couple of tests
* Re-add to_airr_cells
* Fix couple more tests
* Fix more IO tests
* More IO tests [skip ci]
* Cleanup has_ir
* WIP fix clonotype neighbors [skip ci]
* WIP fix distance tests
* WIP fix clonotype cluster tests
* Fix spectratype functions [skip ci]
* Fix more tests
* Fix IR dist tests [skip ci]
* Fix tests for ir dist
* Fix spectratype test [skip ci]
* Tests for new upgrade_schema function [skip ci]
* Workaround for group_abundance plot without has_ir column
* Cleanup has_ir
* Clean multi_chain [skip ci]
* stub new index_chains function
* WIP index_chain function [skip ci]
* Add stub test for index_chains
* Stub second test for index_chains
* Complete second test for index_chains [skip ci]
* index_chains tests
* Update target version to v0.13 [skip ci]
* add isort and autoflake
* Fix circular import
* Fix multichain handling (implement get._has_ir)
* re-add fixtures
* isort on tests [skip ci]
* fix remaining IO tests
* update todo flags [skip ci]
* _is_na input sanitization already in AirrCell module [skip ci]
  By doing so, we can get rid of multiple todos.
* Fix issue with plotting; get rid of merge_with_ir [skip ci]
* Remove test for merge_with_ir [skip ci]
* Ensure consistent ordering of chains in merge_airr
* Complete unit tests for merge_airr [skip ci]
* Use pre-commit.ci for black formatting
* Bump minimum python version to 3.8
* Bump minimum python version to 3.8
* bump python version in CI tests
* update imports of Literal
* update pre-commit config [skip ci]
* fix compat
* WIP new chain_indices format
* Fix get module
* WIP fix tests
* Fix tests [skip ci]
* Fix dandelion tests
* Update workflow tests
* update min anndata version
* Deprecate include_fields parameter and pass kwargs to from_airr_cells in IO
* WIP update example datasets
* update wu dataset generation
* Update wu2020 dataset to mudata (preliminary)
* First attempt to make tutorial work with mudata
* fix issue with slicing awkward array when slice mask is empty
* Change clonotype calling behavior for missing cdr3 sequences
  Previously, cells that had a receptor but no sequence were treated differently from cells with no receptor: the former were assigned to a separate clonotype, while the latter got the clonotype `nan`. Now, cells without a sequence are also assigned the clonotype `nan`. In practice, this should not have affected many users, because IO already ensured that only chains with a sequence are imported.
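The changed clonotype calling for missing cdr3 sequences can be illustrated with a small, self-contained sketch. This is not the project's implementation: the function `assign_clonotypes`, the input layout (a list of CDR3 strings per cell), and the identity-based grouping are all assumptions made for illustration only.

```python
import math

def assign_clonotypes(cdr3_by_cell):
    """Hypothetical sketch: label cells by their set of CDR3 sequences.

    cdr3_by_cell maps a cell id to a list of CDR3 sequences, one per chain.
    An empty list stands for "no receptor"; a list holding only None or ""
    stands for "receptor without a sequence". Both cases get NaN.
    """
    labels = {}
    clonotype_ids = {}  # maps a tuple of sequences to a clonotype label
    for cell, sequences in cdr3_by_cell.items():
        # Keep only chains that actually carry a sequence.
        present = tuple(sorted(s for s in sequences if s))
        if not present:
            # No receptor, or a receptor without any sequence: clonotype NaN.
            labels[cell] = math.nan
            continue
        if present not in clonotype_ids:
            clonotype_ids[present] = str(len(clonotype_ids))
        labels[cell] = clonotype_ids[present]
    return labels

cells = {
    "cell1": ["CASSLGTDTQYF"],
    "cell2": ["CASSLGTDTQYF"],
    "cell3": [],        # no receptor
    "cell4": [None],    # receptor, but no sequence
}
result = assign_clonotypes(cells)
```

In this sketch `cell1` and `cell2` share a clonotype label, while `cell3` and `cell4` both end up with `nan`, which mirrors the new behavior described in the commit message.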
* fix awkward type conversion in index_chains
* Get rid of tqdm workaround which is not needed anymore
* Update API in tutorial to what it *should* look like in the future
* Stub parameter validation [skip ci]
* implement params check class
* update API docs
* Apply new params check to first function
* document params check
* Remove anndata version check decorators
* Restructure to fix circular import [skip ci]
* Unit tests for params check
* Fix notebook pairing
* Params check in index_chains
* update ir_dist with paramscheck [skip ci]
* Apply pre-commit hooks to all files [skip ci]
* Refactor ParamsCheck class
* Refactor chain_qc
* WIP implement param checks
* Update type hints
* Improve _ParamsCheck class [skip ci]
* Fix typing in a couple of files.
* Iterate on tutorial [skip ci]
* Iterate on tutorial
* Rename _ParamsCheck to DataHandler
* Implement get_obs in DataHandler
* WIP fix clonotype_network
* Fix clonotype_network plot [skip ci]
* Update clonal_expansion
* Fix alpha diversity
* Fix repertoire overlap and spectratype
* Fix clonotype modularity
* Fix ir_query [skip ci]
* Fix clonotype convergence
* Fix clonotype imbalance
* Fix clonotype imbalance
* Update processing scripts for Wu2020
* Update maynard loading script
* disable check for same fields in AirrCell [skip ci]
* Update maynard processing script
* WIP tests with mudata
* Update example datasets [skip ci]
  Use pooch to manage datasets.
* Fix test for clonotype convergence
* Experimental: use wrapper class for fixture
* Remove outdated TODO statements
* Revert "Experimental: use wrapper class for fixture"
  This reverts commit ddf5718.
* Implement inplace logic in DataHandler
* Parametrize fixtures to represent both AnnData and MuData [skip ci]
* Use DataHandler to write results to obs.
* WIP fix tests
* Fix _get_colors [skip ci]
* Fix tests
* Fix test_get_color
* Implement context managers in `get` module
* Fix clustermap
* Fix normalize in spectratype
* Tutorial again complete 🎉
* Fix some open TODOs
* Add tests for get context managers
* update datasets module
* Remove function cdr_convergence, which was never publicly documented anyway
* Update some docstrings
* remove erroneous import [skip ci]
* WIP update docs
* Update usage principles and data structure
* Update MuData section [skip ci]
* WIP update IO tutorial
* Update IO tutorial
* Update datastructure section with info about single AnnData object
* Update main tutorial
* Update API docs page
* Minor doc amendments
* WIP update docstrings
* Fix docstrings
* Fix TODOs
* Fix sphinx warnings
* update isort
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* constrain pandas
* Pandas workarounds
* Revert "Pandas workarounds"
  This reverts commit 6e19241.
* pandas version
* Fix problem with color by gene in clonotype_network
* fix missing import in datasets
* cancel previous CI jobs automatically
* test ci
* Concurrency should be outside 'jobs'
* test ci
* Update dependencies
* Update conda dependencies
  Will fail, because anndata 0.9rc1 is not on conda.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>