jeromekelleher changed the title from "CSI indexed VCFs have incorrect sequence names" to "partition_into_regions: CSI indexed VCFs have incorrect sequence names" on Mar 3, 2024
The sequence names for CSI-indexed VCFs must be derived from the index itself, because the sequence names in an indexed VCF refer to the sequences actually observed in the file, not to those listed in the header. The correct logic (I hope) is here:
https://github.com/jeromekelleher/bio2zarr/blob/880c3afee4465b4b94b921c815d436f3e4a78a46/bio2zarr/vcf_utils.py#L400
Some tests that should be straightforward to port to sgkit are here: https://github.com/jeromekelleher/bio2zarr/blob/880c3afee4465b4b94b921c815d436f3e4a78a46/tests/test_vcf_utils.py#L21
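
To illustrate the mismatch, here is a minimal sketch (not the linked bio2zarr code) that reads sequence names out of the auxiliary block of a CSI index. It assumes the tabix-style layout for that block: seven little-endian int32 fields (format, col_seq, col_beg, col_end, meta, skip, l_nm) followed by `l_nm` bytes of NUL-terminated names. The function name `parse_tabix_aux_names` and the sample data are hypothetical, made up for this example.

```python
import struct


def parse_tabix_aux_names(aux: bytes) -> list[str]:
    """Return the sequence names stored in a tabix-style aux block.

    Assumed layout: 7 little-endian int32 values, the last being l_nm,
    followed by l_nm bytes of NUL-terminated names.
    """
    if len(aux) < 28:
        # No tabix-style metadata present; the caller needs another source.
        return []
    *_, l_nm = struct.unpack_from("<7i", aux, 0)
    names_block = aux[28 : 28 + l_nm]
    return [name.decode("ascii") for name in names_block.split(b"\x00") if name]


# Synthetic example: the header declares three contigs, but only chr1 and
# chr3 have records, so the index lists just those two.
header_contigs = ["chr1", "chr2", "chr3"]
names_block = b"chr1\x00chr3\x00"
aux = struct.pack("<7i", 2, 1, 2, 0, ord("#"), 0, len(names_block)) + names_block

index_names = parse_tabix_aux_names(aux)
print(index_names)
# Entry 1 of the index is chr3, whereas header_contigs[1] is chr2, so
# indexing into the header list would label the second sequence wrongly.
```

In this synthetic case, looking up the second index entry in the header list would yield `chr2` rather than `chr3`, which is the kind of wrong name described above.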