GENE30001 Lecture Notes - Lecture 7: Linkage Disequilibrium, Selective Sweep, Haplotype

Here we can see that each highlighted column is a segregating site:
at least one allele differs from the others at that site.
Because it is too hard to compare all of the alleles at once,
we take the average pairwise divergence instead.
This is one of the quantities that goes into Tajima's D.
On this slide we can count the differences column by column.
Take two alleles at a time (each allele here is 26 nucleotides long) and count how many
nucleotide sites differ between them. Once every pair has been compared, take the average.
Pi = nucleotide diversity (the observed average number of pairwise differences)
Theta = expected nucleotide diversity
Comparing the two gives positive (+) and negative (-) values of Tajima's D.
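The pairwise counting described above can be sketched in Python. This is a minimal illustration, not the lecturer's method: the function names and the toy alignment are made up for the example, and pi is reported per sequence (divide by the sequence length to get a per-site value).

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count the sites at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


def nucleotide_diversity(seqs):
    """Average number of pairwise differences (pi) over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


def segregating_sites(seqs):
    """Number of alignment columns that show more than one nucleotide (S)."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


# Hypothetical toy alignment: 4 alleles, 8 sites each (not the slide's data)
alleles = ["ATGCATGC", "ATGCATGA", "ATGAATGC", "TTGCATGC"]
pi = nucleotide_diversity(alleles)
S = segregating_sites(alleles)
print(pi, S)
```

With 4 alleles there are 6 pairs to compare; the number of pairs grows as n(n-1)/2, which is why the comparison is summarised as an average.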
Tajima's D is the (scaled) difference between the
average number of pairwise differences and
the number of segregating sites
The purpose of the test is to distinguish between a DNA
sequence evolving randomly ("neutrally") and one evolving under
a non-random process, including directional selection or balancing
selection, demographic expansion or contraction, genetic
hitchhiking, or introgression.
Tajima's D is computed as the difference between two measures
of genetic diversity: the mean number of pairwise differences and
the number of segregating sites, each scaled so that they are
expected to be the same in a neutrally evolving population of
constant size.
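The scaling mentioned above can be made concrete with a short Python sketch. The notes do not spell out the constants; the ones used here (a1, a2, b1, b2, c1, c2, e1, e2) follow the standard formula from Tajima (1989). The function name and the choice to return NaN when there are no segregating sites are assumptions made for this example.

```python
import math


def tajimas_d(n, S, pi):
    """Tajima's D from sample size n, segregating sites S and mean pairwise differences pi."""
    if n < 4:
        # The variance term is zero for n = 2 and n = 3, so D cannot be computed.
        raise ValueError("need at least 4 sequences")
    if S == 0:
        return float("nan")  # assumption: D is treated as undefined without variation

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # S / a1 puts the number of segregating sites on the same scale as pi.
    difference = pi - S / a1
    variance = e1 * S + e2 * S * (S - 1)
    return difference / math.sqrt(variance)


# Hypothetical input: 10 sequences, 16 segregating sites, pi = 3.888889
print(tajimas_d(10, 16, 3.888889))
```

The sign of the result depends only on whether pi is larger or smaller than S / a1; the square-root term just standardises the difference.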
Interpreting Tajima's D
A negative Tajima's D signifies an excess of low frequency polymorphisms relative to expectation, indicating population size expansion (e.g., after a bottleneck or a selective sweep)
and/or purifying selection. A positive Tajima's D signifies low levels of both low and high frequency polymorphisms, indicating a decrease in population size and/or balancing selection.
However, calculating a conventional "p-value" associated with any Tajima's D value that is obtained from a sample is impossible. Briefly, this is because there is no way to describe the
distribution of the statistic that is independent of the true, and unknown, theta parameter (no pivot quantity exists). To circumvent this issue, several options have been proposed.
Value of Tajima's D: D = 0
Mathematical reason: Theta-Pi = Theta-k (Observed = Expected). Average heterozygosity = # of segregating sites.
Biological interpretation 1: Observed variation similar to expected variation
Biological interpretation 2: Population evolving as per mutation-drift equilibrium. No evidence of selection

Value of Tajima's D: D < 0 (negative)
Mathematical reason: Theta-Pi < Theta-k (Observed < Expected). Fewer haplotypes (lower average heterozygosity) than # of segregating sites.
Biological interpretation 1: Rare alleles present at low frequencies
Biological interpretation 2: Recent selective sweep, population expansion after a recent bottleneck, linkage to a swept gene

Value of Tajima's D: D > 0 (positive)
Mathematical reason: Theta-Pi > Theta-k (Observed > Expected). More haplotypes (more average heterozygosity) than # of segregating sites.
Biological interpretation 1: Multiple alleles present, some at low, others at high frequencies
Biological interpretation 2: Balancing selection, or sudden population contraction due to a bottleneck
However, this interpretation should be made only if the D-value is deemed statistically significant.
From <https://en.wikipedia.org/wiki/Tajima%27s_D#Interpreting_Tajima.27s_D>
Linkage Disequilibrium
Definition: The non-random association of alleles at different loci in a population
(due to drift, mutation and recombination)
Haplotype: one combination of allelic states inherited together
Recombination erodes LD over time
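A minimal Python sketch of how LD between two loci can be quantified and how recombination erodes it. The notes do not give these formulas; the sketch uses the standard two-locus definitions (D = p_AB - p_A * p_B and r squared) and the textbook decay under random mating, D_t = D_0 * (1 - c)^t. Here D is the LD coefficient, not Tajima's D, and all function names and frequencies are hypothetical.

```python
def ld_coefficient(p_ab, p_a, p_b):
    """LD coefficient D: observed AB haplotype frequency minus the frequency expected under independence."""
    return p_ab - p_a * p_b


def r_squared(p_ab, p_a, p_b):
    """Squared correlation between alleles at the two loci (ranges from 0 to 1)."""
    d = ld_coefficient(p_ab, p_a, p_b)
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def ld_after(d0, c, generations):
    """LD remaining after some generations with recombination rate c (random mating assumed)."""
    return d0 * (1 - c) ** generations


# Hypothetical frequencies: allele A at 0.5, allele B at 0.5, AB haplotype at 0.5
d0 = ld_coefficient(0.5, 0.5, 0.5)
print(d0, r_squared(0.5, 0.5, 0.5))
print(ld_after(d0, 0.5, 1))
```

If the alleles were associated at random, the AB haplotype frequency would equal p_A * p_B and D would be 0. With c = 0.5 (unlinked loci) LD halves every generation, while tightly linked sites keep their LD for much longer.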
Factors affecting LD
- Mutation, drift and limited recombination
- Demographic effects
- Selective sweeps
The analysis of association between SNPs and complex diseases that
have a genetic component is known as linkage disequilibrium mapping.
DEFINITIONS:
Genetic hitch-hiking: The process by which selectively neutral alleles increase or decrease in
frequency due to their association with
alleles that are under the influence of natural selection.
Selective sweep: The reduction or elimination of polymorphism in a region of DNA sequence
surrounding a site where a beneficial mutation has increased in frequency due to positive
natural selection. The reduction of polymorphism is a result of gametic disequilibrium
between a beneficial mutation and neighboring neutral sites that has not been broken down
by recombination.
Scenario: CCR5-Δ32
Estimate age of a variant by the length of the haplotype it occurs on
Probability that a haplotype remains intact (P)
P = (1-c)^G
Where c = recombination rate between the 2 polymorphisms (per generation)
G = number of generations
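The formula above can be rearranged to estimate the age of a variant: G = ln(P) / ln(1 - c). The Python sketch below shows both directions. The function names and the numbers are hypothetical and are not estimates for the CCR5 variant.

```python
import math


def prob_haplotype_intact(c, generations):
    """P = (1 - c)^G: chance that no recombination has split the haplotype in G generations."""
    return (1 - c) ** generations


def generations_from_prob(p, c):
    """Rearranged formula: G = ln(P) / ln(1 - c)."""
    return math.log(p) / math.log(1 - c)


# Hypothetical values: recombination rate 0.005 between the two polymorphisms
p = prob_haplotype_intact(0.005, 100)
print(p)
print(generations_from_prob(p, 0.005))
```

A long intact haplotype around a variant means little time has passed for recombination to break it up, so the variant is likely to be young.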
Lecture 7 and 8 Measures of Nucleotide Diversity
Wednesday, 22 June 2016
4:23 PM
Lecture Notes 1- 12 Molecular Evolution and Population Genetics Page 1