Running into this again with a new dataset, and the difference in performance is wild. I ran the job with the bgzipped fasta for >2 days and only got 400 MB through a VCF file; now, with the unzipped fasta, I'm already at 1.5 GB after an hour. Maybe consider throwing a warning or error if someone tries to input a compressed ancestral fasta? It's virtually unusable when it's that slow.
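A minimal sketch of the warning suggested above. The gzip magic bytes also identify bgzipped files, since bgzip output is a gzip variant; `is_gzipped` and `check_ancestor_fasta` are hypothetical helpers, not part of mutyper's actual API.

```python
import warnings


def is_gzipped(path: str) -> bool:
    """Return True if the file starts with the gzip magic bytes
    (bgzip output is a gzip variant, so this catches it too)."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"


def check_ancestor_fasta(path: str) -> None:
    # Hypothetical check mutyper could run before scanning sites:
    # warn the user instead of silently crawling through a
    # compressed fasta.
    if is_gzipped(path):
        warnings.warn(
            f"{path} appears to be (b)gzipped; random access will be very "
            "slow. Decompress it first, e.g. with `bgzip -d`."
        )
```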
Accessing later regions of a fasta via a `mutyper.Ancestor` object (child class of `pyfaidx.Fasta`) is not performant, likely stemming from this issue in pyfaidx: mdshw5/pyfaidx#153. This is particularly problematic for the `mutyper targets` subcommand, since it scans through all sites in a fasta record, or a sequence of bed regions.

The current workaround is to work with decompressed fasta data. A bgzipped fasta, e.g. named `ancestor.fa.gz`, can be decompressed with `bgzip -d ancestor.fa.gz` to produce an uncompressed fasta `ancestor.fa`.
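The workaround can also be scripted without shelling out to `bgzip`: because bgzip output is a gzip variant, Python's stdlib `gzip` module can decompress it. A minimal sketch, where `decompress_fasta` is a hypothetical helper (not part of mutyper):

```python
import gzip
import shutil
from pathlib import Path


def decompress_fasta(path: str) -> str:
    """Decompress a bgzipped fasta to a plain fasta so that downstream
    random access (e.g. via pyfaidx) is not slowed by compressed reads.

    Hypothetical helper: assumes the input name ends in `.gz` and writes
    the output alongside it with that suffix stripped.
    """
    src = Path(path)
    dest = src.with_suffix("")  # ancestor.fa.gz -> ancestor.fa
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return str(dest)
```

One could then pass the returned path to `mutyper targets` instead of the compressed file.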