GitHub - Jong-hun-Park/adVNTR: A tool for genotyping Variable Number Tandem Repeats (VNTR) from sequence data

adVNTR - A tool for genotyping VNTRs

adVNTR is a tool for genotyping Variable Number Tandem Repeats (VNTR) from sequence data. It works with both NGS short reads (Illumina HiSeq) and SMRT reads (PacBio). It finds diploid repeat counts for VNTRs and identifies possible mutations in the VNTR sequences.

Installation

If you are using the conda package manager (e.g. miniconda or anaconda), you can install adVNTR from the bioconda channel:

conda config --add channels bioconda
conda install advntr

After installation, adVNTR can be invoked from the command line with advntr.

Alternatively, you can install the dependencies and then install adVNTR from source.

Data Requirements

To genotype VNTRs, you need to either train models for the loci of interest or use pre-trained models (recommended).

To run adVNTR on trained VNTR models:
- Download vntr_data_recommended_loci.zip and extract it inside the project directory. This includes a set of pre-trained VNTR models for Illumina (6719 loci) and PacBio (8960 loci) sequencing data.
- You can also download and use vntr_data_genic_loci.zip, which covers 158522 VNTRs but results in a much longer running time.
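
The extraction step can be sketched as follows. This is a minimal example under stated assumptions: the archive has already been downloaded into the current directory (no download URL is given here), the unzip utility is installed, and extract_models is a hypothetical helper name, not part of adVNTR.

```shell
#!/bin/sh
# Hypothetical helper: extract a downloaded model archive into a directory.
# Arguments: <archive> [destination directory, defaults to the current one]
extract_models() {
    archive=$1
    dest=${2:-.}
    if [ ! -f "$archive" ]; then
        echo "missing archive: $archive" >&2
        return 1
    fi
    # -q: quiet, -o: overwrite existing files without prompting, -d: target directory
    unzip -q -o "$archive" -d "$dest"
}

# Assumes the script is run from the project directory.
extract_models vntr_data_recommended_loci.zip . || echo "download the archive first"
```

The same helper could be pointed at vntr_data_genic_loci.zip if the larger model set is wanted.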

Alternatively, you can add a model for a custom VNTR. See Add Custom VNTR for more information about training models for custom VNTRs.

Execution:

Use the following command to see the help for running the tool:

advntr --help

The program outputs the repeat unit (RU) count genotypes of the trained VNTRs. To genotype a single VNTR, specify its ID with the --vntr_id <id> option. A list of some known VNTRs and their IDs is available on the Disease-linked-VNTRs page in the wiki.
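
As a hedged illustration of the --vntr_id option, the helper below only assembles and prints a command; it does not run adVNTR. It assumes that --vntr_id is passed to the genotype subcommand alongside the options shown in the demo. The function name genotype_cmd is hypothetical, and 12345 is a placeholder, not a real VNTR ID.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a genotyping command for one VNTR.
# Arguments: <alignment file> <working directory> <VNTR ID>
genotype_cmd() {
    printf 'advntr genotype --alignment_file %s --working_directory %s --vntr_id %s\n' \
        "$1" "$2" "$3"
}

# 12345 is a placeholder; take a real ID from the Disease-linked-VNTRs page.
genotype_cmd aligned_illumina_reads.bam ./log_dir/ 12345
```

Printing the command first makes it easy to check the arguments before starting a run.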

See the demo below or the Quickstart page for an example data set with step-by-step genotyping commands.

Demo input in BAM format

--alignment_file specifies the alignment file containing mapped and unmapped reads:

    advntr genotype --alignment_file aligned_illumina_reads.bam --working_directory ./log_dir/

With --pacbio, adVNTR assumes the alignment file contains PacBio sequencing data:

    advntr genotype --alignment_file aligned_pacbio_reads.bam --working_directory ./log_dir/ --pacbio

Use --frameshift to find possible frameshifts in a VNTR:

    advntr genotype --alignment_file aligned_illumina_reads.bam --working_directory ./log_dir/ --frameshift
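
When several BAM files need genotyping, the commands above can be generated in a loop. The sketch below is a dry run that only prints one command per file. The function name plan_genotyping and the per-sample working directory layout under ./log_dir/ are assumptions made for illustration, not adVNTR requirements.

```shell
#!/bin/sh
# Hypothetical helper: print one genotyping command per BAM file.
# First argument: one extra option (e.g. --pacbio or --frameshift), or "" for none.
# Remaining arguments: paths to BAM files.
plan_genotyping() {
    extra_opts=$1
    shift
    for bam in "$@"; do
        # Derive a sample name from the file name, e.g. reads.bam -> reads
        sample=$(basename "$bam" .bam)
        printf 'advntr genotype --alignment_file %s --working_directory ./log_dir/%s/%s\n' \
            "$bam" "$sample" "${extra_opts:+ $extra_opts}"
    done
}

plan_genotyping "--frameshift" aligned_illumina_reads.bam
```

Once the printed commands look right, they can be piped to sh to execute them.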

Documentation:

Documentation is available at advntr.readthedocs.io.

See the Quickstart page for an example data set with step-by-step genotyping commands.

Citation:

Bakhtiari, M., Shleizer-Burko, S., Gymrek, M., Bansal, V. and Bafna, V., 2018. Targeted genotyping of variable number tandem repeats with adVNTR. Genome Research, 28(11), pp.1709-1719.

