Skip to content

v0.28.3 - cosmic, invertebrates for ref and search, elm improvements

Compare
Choose a tag to compare
@lauraluebbert lauraluebbert released this 22 Jan 21:42
· 695 commits to main since this release
5cad2f3
  • gget search and gget ref now also support fungi 🍄, protists 🌝, and invertebrate metazoa 🐝 🐜 🐌 🐙 (in addition to vertebrates and plants)
  • New module: gget cosmic
  • gget enrichr: Fix duplicate scatter dots in plot when pathway names are duplicated
  • gget elm:
    • Changed ortho results column name 'Ortholog_UniProt_ID' to 'Ortholog_UniProt_Acc' to correctly reflect the column contents, which are UniProt Accessions. 'UniProt ID' was changed to 'UniProt Acc' in the documentation for all gget modules.
    • Changed ortho results column name 'motif_in_query' to 'motif_inside_subject_query_overlap'.
    • Added interaction domain information to results (new columns: "InteractionDomainId", "InteractionDomainDescription", "InteractionDomainName").
    • The regex string for regular expression matches was encapsulated as follows: "(?=(regex))" (instead of directly passing the regex string "regex") to enable capturing all occurrences of a motif when the motif length is variable and there are repeats in the sequence (https://regex101.com/r/HUWLlZ/1).
  • gget setup: Use the out argument to specify a directory the ELM database will be downloaded into. Completes this feature request.
  • gget diamond: The DIAMOND command is now run with --ignore-warnings flag, allowing niche sequences such as amino acid sequences that only contain nucleotide characters and repeated sequences. This is also true for DIAMOND alignments performed within gget elm.
  • gget ref and gget search back-end change: the current Ensembl release is fetched from the new release file on the Ensembl FTP site to avoid errors during uploads of new releases.
  • gget search:
    • FTP link results (--ftp) are saved in txt file format instead of json.
    • Fix URL links to Ensembl gene summary for species with a subspecies name and invertebrates.
  • gget ref:
    • Back-end changes to increase speed
    • New argument: list_iv_species to list all available invertebrate species (can be combined with the release argument to fetch all species available from a specific Ensembl release)