Roadmap #1

Hua-Zhou · 2017-08-29T04:24:47Z

This issue documents the implementation roadmap.

Convert GT data to PLINK format (SnpData)
Subset VCF records according to chromosome and position range
Subset VCF records according to marker index or IDs
Subset VCF samples according to individual index or names
Convert methods for VCF.Reader, e.g., convert(Matrix{T}, reader::VCF.Reader) will convert all records from the current position of reader to a matrix of type Matrix{T}
Copy methods for VCF.Reader, e.g., copy(A::AbstractMatrix{T}, reader::VCF.Reader) will fill the columns of A with the GT data from the current position of reader
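To make the subsetting and conversion items concrete, here is a minimal sketch in plain Julia. It does not use VCF.Reader at all: records are hypothetical named tuples, and in_region, gt_dosage and gt_matrix are illustrative names, not existing functions of this package. The layout (samples in rows, records in columns) and the use of missing for "./." are assumptions.

```julia
# ALT-allele count for one GT string such as "0|1" or "1/1".
# Any non-"0" allele counts as ALT; returns `missing` if an allele is ".".
function gt_dosage(gt::AbstractString)
    alleles = split(gt, r"[|/]")
    any(==("."), alleles) && return missing
    return count(!=("0"), alleles)
end

# True when a (hypothetical) record lies on `chrom` within `positions`.
in_region(rec, chrom::AbstractString, positions::UnitRange{Int}) =
    rec.chrom == chrom && rec.pos in positions

# Fill a matrix with GT dosages: samples in rows, records in columns.
function gt_matrix(::Type{T}, records) where {T<:Real}
    isempty(records) && return Matrix{Union{T,Missing}}(undef, 0, 0)
    A = Matrix{Union{T,Missing}}(undef, length(records[1].gt), length(records))
    for (j, rec) in enumerate(records), (i, gt) in enumerate(rec.gt)
        A[i, j] = gt_dosage(gt)
    end
    return A
end

# Toy records standing in for parsed VCF lines.
records = [(chrom = "1", pos = 100, id = "rs1", gt = ["0|0", "0|1"]),
           (chrom = "1", pos = 250, id = "rs2", gt = ["1/1", "./."]),
           (chrom = "2", pos = 120, id = "rs3", gt = ["0|1", "1|1"])]

selected = filter(r -> in_region(r, "1", 50:300), records)
A = gt_matrix(Float64, selected)
```

Subsetting by marker ID or index would follow the same pattern, with a different predicate passed to filter.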

biona001 · 2019-11-25T18:03:19Z

Below is a (tentative) list of features I need for MendelImpute:

filter function based on sample and record index
convert_ht function to import VCF files into a numeric matrix where columns are haplotypes
convert_ds function to read dosage into a numeric matrix
convert_vcf function to convert phased genotype matrix back to a VCF file. This is not implemented explicitly but is supported by the general write methods. See MendelImpute's impute.jl
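As a rough illustration of the convert_ht item, the sketch below turns phased GT strings into a matrix whose columns are haplotypes, two per sample. The function name haplotype_matrix and the input layout (one vector of GT strings per record) are assumptions for this example; it is not the convert_ht implementation, and it simply errors on unphased or missing genotypes.

```julia
# Build a records x (2 * samples) BitMatrix from phased diploid GT strings.
# Entry is true when the haplotype carries a non-reference allele.
function haplotype_matrix(gts)
    nrecords = length(gts)
    nsamples = nrecords == 0 ? 0 : length(gts[1])
    H = falses(nrecords, 2 * nsamples)
    for (r, rec) in enumerate(gts), (s, gt) in enumerate(rec)
        alleles = split(gt, '|')
        (length(alleles) == 2 && !any(==("."), alleles)) ||
            error("expected a phased, non-missing diploid GT, got $gt")
        H[r, 2s - 1] = alleles[1] != "0"   # first haplotype of sample s
        H[r, 2s]     = alleles[2] != "0"   # second haplotype of sample s
    end
    return H
end

H = haplotype_matrix([["0|1", "1|1"],
                      ["0|0", "1|0"]])
```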

I will try to implement them in the next 1-2 weeks. Are there any caveats I need to be aware of? In particular, I'm not sure of the best way to filter by sample index, given the data structure of VCF.Record.
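One way to picture the sample-index question without touching VCF.Record internals is to work on the raw text line: a VCF data line with genotypes has 9 fixed tab-separated columns (CHROM through FORMAT) followed by one column per sample. The sketch below keeps the fixed columns plus the requested samples. The name subset_samples is hypothetical, and a real implementation on VCF.Record would likely look different.

```julia
# Keep the 9 fixed VCF columns and only the sample columns listed in `keep`
# (1-based sample indices, returned in the order given).
function subset_samples(line::AbstractString, keep::AbstractVector{<:Integer})
    fields = split(line, '\t')
    nsamples = length(fields) - 9
    all(i -> 1 <= i <= nsamples, keep) ||
        throw(ArgumentError("sample index out of range"))
    return join(vcat(fields[1:9], fields[9 .+ keep]), '\t')
end

line = "1\t100\trs1\tA\tG\t50\tPASS\t.\tGT\t0|0\t0|1\t1|1"
sub = subset_samples(line, [1, 3])
```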

biona001 · 2020-05-20T04:16:25Z

Here are a few more desired routines typically needed for quality control:

splitting multi-allelic calls into different records
left-aligning indels
removing 'junk' variants that have a high probability of being false positives, using the QUAL score, which is the Phred-scaled quality: -10 log10 of the probability that the call is wrong.
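For the QUAL item, a small sketch of the Phred relationship and a threshold filter may help. The names phred_to_prob and passes_qual, the threshold of 20, and the choice to drop records with missing QUAL are all assumptions for illustration, not settled behavior for this package.

```julia
# Phred scale: QUAL = -10 * log10(p), so p = 10^(-QUAL / 10).
phred_to_prob(qual::Real) = 10.0^(-qual / 10)

# Keep a record only if QUAL is present and at least `minqual`.
# Treating missing QUAL as a failure is a policy choice, assumed here.
passes_qual(qual, minqual::Real) = !ismissing(qual) && qual >= minqual

quals = [50.0, 3.0, missing, 30.0]
kept = [q for q in quals if passes_qual(q, 20)]
```

With this convention, QUAL 30 corresponds to an error probability of about 1 in 1000.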

They are mentioned here

kose-y · 2020-07-30T12:24:59Z

See https://github.com/vcftools/vcftools/blob/master/src/cpp/variant_file_format_convert.cpp for VCF to PLINK conversion.

kose-y · 2020-08-02T05:01:30Z

VCF to PLINK is implemented in SnpArrays.jl.

biona001 · 2020-09-02T17:08:48Z

In the next few days, I will add:

Calculation of Hardy-Weinberg equilibrium p-values using Fisher's exact test (reference)
GRM calculation, with emphasis on treating missing data (reference)
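For reference, here is a minimal sketch of one common formulation of the exact HWE test: condition on the allele counts, compute the probability of every possible heterozygote count, and sum the probabilities that do not exceed that of the observed count. The function name hwe_exact and the small tolerance are choices made for this example; it is not necessarily what the linked reference or the eventual implementation does.

```julia
# Exact test for Hardy-Weinberg equilibrium from genotype counts.
# Returns the sum of probabilities of all heterozygote counts that are
# no more likely than the observed one, given the allele counts.
function hwe_exact(nAA::Integer, nAB::Integer, nBB::Integer)
    n = nAA + nAB + nBB
    n == 0 && return 1.0
    nA = 2 * nAA + nAB            # copies of allele A
    nB = 2 * n - nA               # copies of allele B
    lf = cumsum([0.0; log.(1:(2 * n))])   # lf[k + 1] = log(k!)
    logfact(k) = lf[k + 1]
    # log P(het heterozygotes | n, nA)
    function logprob(het)
        hom_a = (nA - het) ÷ 2
        hom_b = (nB - het) ÷ 2
        return het * log(2) + logfact(n) -
               logfact(hom_a) - logfact(het) - logfact(hom_b) +
               logfact(nA) + logfact(nB) - logfact(2 * n)
    end
    # Heterozygote counts must share the parity of nA.
    hets = (nA % 2):2:min(nA, nB)
    probs = [exp(logprob(h)) for h in hets]
    pobs = exp(logprob(nAB))
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs if p <= pobs * (1 + 1e-7)))
end

p_small = hwe_exact(1, 0, 1)     # two samples, no heterozygotes
p_skew  = hwe_exact(10, 0, 10)   # strong heterozygote deficit
```

Precomputing log-factorials keeps the sketch free of external packages; for large samples a recurrence over heterozygote counts would avoid the table.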

Hua-Zhou · 2020-09-02T17:32:29Z

👍

biona001 mentioned this issue Sep 3, 2020

WIP: grm functions and fisher's test for HWE #16

Merged
