Statistics for measuring how we do with real data #41
On average, individuals from the same demographic area should share the vast majority of their most recent coalescences with each other. Another way to measure this would be to use Schiffels & Durbin's 'cross coalescence rate'.
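To make the first idea concrete, here is a minimal sketch of one possible statistic: the fraction of samples whose most recent coalescence is exclusively with samples from their own population. It works on a single toy tree stored as plain dictionaries. The function names, the dictionary layout, and the toy tree are all illustrative assumptions, not part of tsinfer. On real inferred trees the same quantity could presumably be computed per tree with tskit's MRCA queries and averaged along the genome.

```python
def ancestors(parent, node):
    """Return the path of nodes from `node` up to the root (inclusive)."""
    path = [node]
    while parent.get(path[-1]) is not None:
        path.append(parent[path[-1]])
    return path


def mrca(parent, u, v):
    """Most recent common ancestor of u and v in a single tree."""
    seen = set(ancestors(parent, u))
    for node in ancestors(parent, v):
        if node in seen:
            return node
    raise ValueError("nodes do not share an ancestor")


def same_population_nearest_fraction(parent, time, population):
    """Fraction of samples whose nearest neighbours (smallest TMRCA)
    all belong to the sample's own population.

    parent:     dict node -> parent node (None for the root)
    time:       dict node -> node age
    population: dict sample node -> population label (hypothetical labels)
    """
    samples = sorted(population)
    hits = 0
    for u in samples:
        tmrca = {v: time[mrca(parent, u, v)] for v in samples if v != u}
        closest = min(tmrca.values())
        nearest = [v for v, t in tmrca.items() if t == closest]
        if all(population[v] == population[u] for v in nearest):
            hits += 1
    return hits / len(samples)


# Toy tree: samples 0,1 coalesce at node 4; samples 2,3 at node 5; root is 6.
toy_parent = {0: 4, 1: 4, 2: 5, 3: 5, 4: 6, 5: 6, 6: None}
toy_time = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 1.0, 5: 2.0, 6: 5.0}
toy_population = {0: "A", 1: "A", 2: "B", 3: "B"}
toy_fraction = same_population_nearest_fraction(toy_parent, toy_time, toy_population)
```

A value near 1 would mean the tree groups individuals by population as expected; values much lower on real data might indicate inference problems, or genuine recent admixture.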
Chatted to George Busby about 1000G data: his project with Ryan Christ involved chromosome painting with some of the 1000G data plus a focal population of Africans with an introgressed lactose tolerance haplotype, and looking for areas of shared ancestry within the focal African population (ancestry was estimated by splitting the 1000G data into e.g. 6 populations and basing ancestry measures on haplotype prevalence within each population). We might be able to do this sort of thing with ancestors on trees instead. Another suggestion would be to look across the Duffy locus, which we know to have been under selection. One issue is that this is on Chromosome 1, which is the largest human chromosome.
Just chatting to Wilder - I wonder if we can plot a "DensiTree" using a random subsample of the data (both haplotypes and genomic positions).
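A rough sketch of the subsampling step only, assuming we just need a reproducible random set of haplotype indices and integer genomic positions at which to extract trees for overlaying. The function name and parameters are hypothetical, and the plotting itself is not shown.

```python
import random


def subsample_for_densitree(num_haplotypes, sequence_length,
                            n_haplotypes, n_positions, seed=None):
    """Pick random haplotype indices and genomic positions without replacement.

    num_haplotypes:  total number of sample haplotypes available
    sequence_length: length of the region in base pairs (integer)
    Returns two sorted lists: haplotype indices and positions in [0, sequence_length).
    """
    rng = random.Random(seed)
    haplotypes = sorted(rng.sample(range(num_haplotypes), n_haplotypes))
    positions = sorted(rng.sample(range(sequence_length), n_positions))
    return haplotypes, positions


# Example: 20 of 5000 haplotypes, 50 positions along a 1 Mb region.
haps, sites = subsample_for_densitree(5000, 1_000_000, 20, 50, seed=42)
```

The tree at each chosen position, restricted to the chosen haplotypes, would then be drawn semi-transparently on a shared set of axes.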
Also, if there are any individuals in 1000G who have admixed parents (e.g. one maternal grandparent African, the other European), then we might be able to see large chunks where the genome shows a closer relationship with Africans and other chunks where it is closer to Europeans.
We want to run tsinfer on real data and see how we do, compared to what we might expect. This issue collects some ideas for how to do that.