Releases · suhrig/arriba

08 Feb 10:48

suhrig

v2.4.0

786f647

Arriba v2.4.0 Latest

new utility script to annotate exon numbers
compatibility with Illumina's Dragen aligner (see notes in manual about supported aligners)
retained fraction of protein domain was often overestimated as 100%
better agreement of transcripts between Arriba's output file and the visualizations produced by draw_fusions.R by making --transcriptSelection=provided the default
better matching of structural variant breakpoints to fusion breakpoints when parameter -d is used
VCF files generated by scripts/convert_fusions_to_vcf.sh are now compatible with bcftools
mildly improved filtering

29 May 10:42

suhrig

v2.3.0

3a88da0

Arriba v2.3.0

blacklist PhiX genome, since it is often used as spike-in control
stricter filtering of read-through fusions
fix broken compilation due to outdated zlib URL (thanks to @iainrb)
updated protein domain annotation files (GFF3), now with 7-15% more annotation records
updated reference files in download_references.sh to match protein domain annotation version
download_references.sh did not properly harmonize chromosome names between assembly (FastA) and annotation (GTF) when an assembly with chr prefix was used (hg19/38, mm10/39), which had minor implications for alignment and fusion calling
coverage plots can be scaled separately and/or to a user-defined cutoff (--coverageRange=...)
scripts are now compatible with macOS (a recent version of bash must be installed, though; the preinstalled version 3.2 is too old)
minor fixes for reading frame prediction when breakpoint is close to first/last exon
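
The chromosome name mismatch mentioned above (assembly with chr prefix, annotation without) can be illustrated with a minimal sketch. This is not the logic used by download_references.sh; the function names are hypothetical, and the mapping between MT and chrM is the common Ensembl/UCSC convention, assumed here for illustration. Contigs that are not main chromosomes are left untouched.

```python
import re

# Matches main chromosomes with or without "chr" prefix, e.g. "1", "chrX", "MT".
# Hypothetical helper for illustration; not taken from download_references.sh.
_MAIN_CONTIG = re.compile(r"^(?:chr)?(\d+|X|Y|M|MT)$")


def harmonize_contig(name: str, use_chr_prefix: bool) -> str:
    """Convert a main chromosome name to the requested naming convention."""
    match = _MAIN_CONTIG.match(name)
    if match is None:
        # Unplaced/alternative contigs are returned unchanged.
        return name
    bare = match.group(1)
    # Assumed convention: mitochondrion is "MT" without prefix, "chrM" with it.
    if bare in ("M", "MT"):
        return "chrM" if use_chr_prefix else "MT"
    return "chr" + bare if use_chr_prefix else bare


def harmonize_gtf_line(line: str, use_chr_prefix: bool) -> str:
    """Rewrite the first column (contig name) of one GTF line."""
    if line.startswith("#"):
        # Comment lines carry no contig name.
        return line
    contig, sep, rest = line.partition("\t")
    return harmonize_contig(contig, use_chr_prefix) + sep + rest
```

Applying the same conversion to both the FastA headers and the first GTF column keeps the two files consistent, which is the property the fix above restores.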

Contributors

iainrb

22 Jan 15:16

suhrig

v2.2.1

6c016ca

Arriba v2.2.1

reverted a change introduced in v2.2.0: download_references.sh now uses the ENSEMBL GRCh38 assembly (FastA) again instead of the ICGC-ARGO assembly, because the latter contains ALT contigs, which the STAR user manual does not recommend for alignment; moreover, due to a scripting error, the GRCh38 assembly generated by download_references.sh contained malformed data at the end of the file, which is now fixed as well

16 Jan 13:03

suhrig

v2.2.0

bbc4f0b

Arriba v2.2.0

improved detection of internal tandem duplications
better sensitivity for the detection of viral integration sites
inclusion of additional ~4500 viruses into screening, including rare strains of cancer-associated viruses (requires rebuild of STAR index)
viral contigs were renamed to be compliant with the SAM format specification (requires rebuild of STAR index)
support for mm39/GRCm39
utility scripts (see also manual):
- quantify virus expression
- convert Arriba's custom output format to VCF
- extract fusion-supporting alignments into separate mini-BAM
- running Arriba on a prealigned BAM file and realigning only the fusion candidate reads saves ~80% of the CPU time compared to a complete realignment (useful when the alignments were generated by an old STAR version or by a different aligner such as HISAT2)
polishing of fusion visualizations created by draw_fusions.R and new features:
- all transcripts can be drawn at the same scale if desired (--fixedScale)
- circos plots have same size across all pages
- set PDF title and print as header on every page (--sampleName)
- fine-grained control over region to draw for intergenic breakpoints (--showIntergenicVicinity)
- choose a different font (--fontFamily)
- better scaling for coverage track
more fixes for prediction of reading frame
better warnings and error messages
updated STAR to version 2.7.10a, which fixes malformed chimeric alignments for paired-end reads with small insert size
updated dependencies (HTSlib, libdeflate)
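
STAR version strings such as 2.7.10a do not sort correctly as plain text, which matters when a script checks for a minimum STAR version. The sketch below is one possible way to compare them; the function names are hypothetical, and the assumed format (three numbers followed by an optional lowercase letter) is inferred from the version strings quoted in these notes.

```python
import re


def parse_star_version(version: str):
    """Split a version string such as '2.7.10a' into a sortable tuple."""
    # Assumed format: major.minor.patch plus an optional lowercase letter.
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)([a-z]?)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized STAR version: {version!r}")
    major, minor, patch, letter = match.groups()
    # Numbers compare numerically, the trailing letter alphabetically.
    return (int(major), int(minor), int(patch), letter)


def is_at_least(version: str, minimum: str) -> bool:
    """Return True if `version` is the same as or newer than `minimum`."""
    return parse_star_version(version) >= parse_star_version(minimum)
```

With plain string comparison, "2.7.10a" sorts before "2.7.6a" because the character "1" precedes "6"; comparing the parsed tuples avoids that pitfall.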

24 Jan 15:31

suhrig

v2.1.0

3492d2c

Arriba v2.1.0

Arriba can now be cited
arcs in circos plot are colored by type of rearrangement
internal tandem duplications are flagged with the keyword ITD in Arriba's output file
more effective filtering of internal tandem duplications that are germline polymorphisms
draw_fusions.R loads reference files faster
under some rare conditions, the reading frame was erroneously predicted as out-of-frame

11 Oct 13:54

suhrig

v2.0.0

156af31

Arriba v2.0.0

report viral integration sites
report fusions supported by multi-mapping reads (e.g., CIC-DUX4, NPM1-ALK)
report internal tandem duplications (e.g., FLT3, BCOR, ERBB2, NOTCH1)
improved detection of IG/TCR rearrangements
known fusions file based on the Mitelman database is now part of the download
more comprehensive annotation (gene IDs, transcript IDs, user-defined tags, retained protein domains)
support for mouse (mm10)
(optionally) report the full transcript/peptide sequence (parameter -I) rather than only what can be assembled from the supporting reads
structural variants can be supplied in VCF format (parameter -d)
macOS support
faster loading of BAM files thanks to HAT-trie map as well as other speed improvements
draw_fusions.R accepts the format of STAR-Fusion
ability to make use of external duplicate marking, e.g., for UMIs (parameter -u)
enhanced blacklist
simplified code compilation procedure
support assemblies with up to 65,000 contigs (previously 32,000)

Important compatibility notes when upgrading from version 1.x:

STAR version >= 2.7.6a is required to make use of multi-mapping chimeric reads
new columns were added to the output files and some were rearranged
the parameter -P is obsolete; the parameters -I and -T have been repurposed
parsing of input TSV files (GTF, known fusions, blacklist, structural variants) is now stricter
the order of the genes in the known fusions file (parameter -k) is now important
the reading_frame column may contain the new value stop-codon
the site1/2 columns may contain new values
the parameters of the run_arriba.sh script have changed
the download_references.sh script is now parameterized using environment variables
the chr prefix is no longer removed from the output files
the alignment parameters of run_arriba.sh are set to report up to 50 multi-mapping reads
some filters were removed/renamed, which is relevant if the parameter -f is used
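
Because columns were added and rearranged, downstream scripts that read the output file by column position may break after the upgrade. A minimal sketch of reading by column name instead is shown below. The sample data is made up and contains only a small subset of columns; the header names (#gene1, gene2, reading_frame) are assumptions about the output format and should be checked against the manual.

```python
import csv
import io

# Made-up sample with a hypothetical subset of columns, for illustration only.
SAMPLE = (
    "#gene1\tgene2\treading_frame\n"
    "BCR\tABL1\tin-frame\n"
    "GENE_A\tGENE_B\tstop-codon\n"
    "GENE_C\tGENE_D\tout-of-frame\n"
)


def read_fusions(handle):
    """Yield one dict per fusion, keyed by column name rather than position."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames is None:
        # Empty input: nothing to yield.
        return
    # Assumed: the header line starts with '#', so strip it from the names.
    reader.fieldnames = [name.lstrip("#") for name in reader.fieldnames]
    yield from reader


def count_reading_frames(rows):
    """Count fusions per reading_frame value, including values not seen before."""
    counts = {}
    for row in rows:
        frame = row["reading_frame"]
        counts[frame] = counts.get(frame, 0) + 1
    return counts


if __name__ == "__main__":
    print(count_reading_frames(read_fusions(io.StringIO(SAMPLE))))
```

Counting whatever values appear, rather than matching against a fixed list, means a new value such as stop-codon is reported instead of being silently dropped.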

04 Jan 13:28

suhrig

v1.2.0

ca1d40b

Arriba v1.2.0

better filtering of in vitro-generated artifacts
known_fusions filter is more sensitive
update dependencies (HTSlib, compression libs)
example data
under some (rare) conditions, reading frame was incorrect
documentation provides tips on how to interpret fusion predictions
better error messages for common cases of incorrect usage

25 Mar 15:35

suhrig

v1.1.0

477f92b

Arriba v1.1.0

speed improvements (BAM file loading, low_entropy filter, homologs filter, GTF parsing)
prebuilt Docker image available at Docker Hub
installation via bioconda
improved confidence scoring
better detection of intragenic rearrangements
new blacklist
fix some non-deterministic behavior
more reliable auto-detection of strandedness
protein domains were drawn in incorrect order for genes on the reverse strand
handle empty input files more reasonably

23 Oct 22:41

suhrig

v1.0.1

462f69c

Arriba v1.0.1

fix bugs in parsing of command-line arguments
do not compress intermediate (unsorted) BAM file to avoid performance bottleneck

14 Oct 11:20

suhrig

v1.0.0

d4bec52

Arriba v1.0.0

streamlined workflow (extract_read-through_fusions is obsolete)
generate publication-quality figures of fusions
predict peptide sequences
protein domain track for loading into IGV
Singularity recipe
CRAM support
simplified installation procedure
fix off-by-one error (=> new blacklists!)
improved sensitivity/specificity
