Lecture 6 + 7.docx

Ecology & Evolutionary Biology
Stephen Wright

Lecture 6:

- Human-chimp divergence: 4-7 mya
- Human-chimp comparison:
  - Human chromosome 2 resulted from a fusion of two ancestral chromosomes that remained separate in the chimp lineage
  - Humans and chimps differ by at least 9 pericentric inversions
  - Single-nucleotide substitutions occur at a rate of 1.23% between copies of the human and chimp genomes
  - Fixed divergence is estimated to be less than 1.06%
- Why is fixed divergence less than the SNP rate (1.23%)?
  - Because of within-species polymorphism, 2 different copies of the human genome (or 2 copies of the chimp genome) are not identical, so part of the 1.23% reflects variation within species rather than fixed differences
  - To estimate the true difference between humans and chimps, we must account for diversity within each species
- INDELs: insertion or deletion events
  - There are many fewer INDEL differences than nucleotide differences: 10 million INDELs
  - Most are small (45% cover one bp, 96% are < 20 bp), but the largest few contain most of the sequence
  - The largest 1.5% of INDELs contain around 73% of the affected bp
  - These INDELs result in each species containing 45 Mb of species-specific euchromatic sequence
  - This amounts to around 3% sequence difference, much larger than the roughly 1% nucleotide divergence
  - Large INDELs account for the bulk of lineage-specific sequence in the two genomes
- Nucleotide divergence:
  - Divergence varies considerably among regions within chromosomes
    - Junk DNA, where mutations can accumulate
    - Genetic drift: different regions have different coalescence times
    - Selection increases divergence
    - Large difference between the sexes
- Sex differences in mutation:
  - The number of germline cell divisions is 5- to 6-fold greater in males than in females
  - The Y chromosome has higher human-chimp divergence than the autosomes
  - The X chromosome has lower human-chimp divergence than the autosomes
  - X chromosome: there are 3 X chromosomes in total between a female and a male
    - Female: 2 out of 3 of the X chromosomes
    - Male: 1 out of 3 of the X chromosomes
    - So an X chromosome spends 2/3 of its time in females and only 1/3 in males
  - Y chromosome: 100% in males
  - Comparing the sex chromosomes, the male mutation rate is about 5 times higher than the female mutation rate
  - Mutations arise from replication errors and also from DNA damage (and repair)
  - CpG sites experience a high rate of damage (deamination of methylated CpG to TpG)
    - CpG: a C immediately followed by a G
    - TpG: a T immediately followed by a G
    - The CpG-to-TpG mutation rate is 10-50 times higher than that of other transitions
    - Because CpG mutation is driven by damage rather than by cell division (replication), it should show less difference between the sexes
    - For CpG mutations, α (the male-to-female mutation rate ratio) is only about 2
    - This supports the view that the higher overall male mutation rate is due to the greater number of cell divisions
- Divergence in protein-coding genes:
  - 13,454 pairs of chimp and human genes could be matched one-to-one for comparison
  - There are 20 amino acids
  - Codons: triplets of nucleotide bases that spell the code for amino acids
  - 4^3 = 64 possible triplets code for 20 amino acids, so the code is degenerate: different codons can specify the same amino acid
  - The standard genetic code has 3 STOP codons
  - Point mutation within an exon: synonymous, non-synonymous (missense) or nonsense
  - Point mutation in non-coding DNA: silent mutation
- Synonymous site divergence (Ks) is thought to reflect the neutral substitution rate
- Replacement site divergence (KA) reflects the rate of protein evolution
  - A change in peptide sequence can affect protein function
  - It may therefore be influenced by natural selection
- For a given gene, Ks acts as an "internal neutral control"
  - Different genes might experience slightly different mutation rates
  - Ks will quantify those differences
- Protein evolution relative to the "neutral" standard: KA/Ks
  - Neutrality: KA = Ks, so KA/Ks = 1
  - Purifying (negative) selection results in a slower substitution rate at selected sites than at neutral sites: KA/Ks < 1
  - Positive selection results in a faster substitution rate at selected sites than at neutral sites: KA/Ks > 1 is possible due to elevation of KA
- Fourfold degenerate site: any nucleotide at this position specifies the same amino acid. The third position of the glycine codons (GGA, GGG, GGC, GGU) is a fourfold degenerate site, because all nucleotide substitutions at this site are synonymous. Only the third positions of some codons can be fourfold degenerate
- Twofold degenerate site: two of the four possible nucleotides at this position specify the same amino acid. The third position of the glutamic acid codons (GAA, GAG) is a twofold degenerate site
- Counting sites, where n0, n2 and n4 are the numbers of non-degenerate, twofold and fourfold degenerate sites, n is the total number of sites and ns is the number of synonymous sites:
  - Number of synonymous sites = 1/3 n2 + n4
  - Number of replacement sites = n0 + 2/3 n2 = n - ns
  - In practice: to find Ks, count the sequence differences between humans and chimps at synonymous sites and divide by the number of synonymous sites
  - To find KA (the observed rate), count the sequence differences between humans and chimps at non-synonymous sites and divide by the number of replacement sites
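The site-counting formulas above can be sketched in a few lines of Python. This is only an illustration, not the method used in the lecture or in the genome analysis: the function names `degeneracy` and `count_sites` are made up for this example, the codon table is the standard genetic code, and positions where two of the three alternative bases are synonymous (for example the third position of isoleucine codons) are treated as twofold, which is a simplifying assumption.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' marks a STOP codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def degeneracy(codon, pos):
    """Return 0, 2 or 4: the degeneracy class of position pos (0-2) in codon."""
    # Count how many of the 3 alternative bases leave the amino acid unchanged.
    same = sum(
        CODE[codon[:pos] + b + codon[pos + 1:]] == CODE[codon]
        for b in BASES
        if b != codon[pos]
    )
    # 0 -> non-degenerate, 3 -> fourfold; 1 or 2 -> treated as twofold (assumption).
    return {0: 0, 1: 2, 2: 2, 3: 4}[same]


def count_sites(seq):
    """Count n0, n2, n4 in a coding sequence and derive the site totals."""
    seq = seq.upper().replace("U", "T")
    n = {0: 0, 2: 0, 4: 0}
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        for pos in range(3):
            n[degeneracy(codon, pos)] += 1
    synonymous = n[2] / 3 + n[4]        # 1/3 n2 + n4
    replacement = n[0] + 2 * n[2] / 3   # n0 + 2/3 n2
    return n, synonymous, replacement


# Glycine (GGA) followed by glutamic acid (GAA), the two examples from the notes.
counts, syn_sites, rep_sites = count_sites("GGAGAA")
print(counts, syn_sites, rep_sites)
```

For the two example codons, the third position of GGA is counted as fourfold and the third position of GAA as twofold, so the synonymous and replacement site totals add up to the 6 positions in the sequence.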
- Human-chimp protein divergence:
  - 29% of genes are identical at the protein level
  - On average there are around 2 non-synonymous and around 3 synonymous substitutions per gene
  - Around 5% of proteins show in-frame INDELs (average size is 1 codon), usually in repetitive regions
  - KA/Ks = 0.23
    - Non-synonymous sites evolve only 23% as fast as synonymous sites: relatively more synonymous substitutions
    - If synonymous sites evolve at the neutral rate, we might infer that 23% of non-synonymous mutations are (effectively) neutral and are substituted at the same rate as synonymous ones, and the remaining 77% are constrained by selection
- Constraint (by selection) = 1 - (observed rate)/(neutral rate)
  - Assumptions:
    1. Ks reflects the neutral rate
    2. All non-synonymous substitutions are effectively neutral (negligible adaptive substitutions)
  - We normally expect the neutral rate to be high and the observed rate to be low
- If synonymous sites are also constrained, we will under-estimate the level of constraint on non-synonymous sites
  - Ks is then lower than the true neutral rate, because some synonymous mutations are removed by selection, so using Ks under-estimates the neutral rate
  - With a higher true neutral rate, (observed rate)/(neutral rate) is smaller, thus the actual constraint on non-synonymous sites is higher than our estimate
- If 10% of non-synonymous substitutions are actually adaptive substitutions, we will also under-estimate the level of selective constraint on non-synonymous sites
  - We assumed that all non-synonymous substitutions were fixed by chance; if 10% of them were fixed by adaptation, the constraint is actually higher
  - The observed rate of effectively neutral substitutions is actually lower than KA, thus the actual constraint is higher
  - The lower the observe
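The constraint formula and the effect of its two assumptions can be checked with a small numerical sketch. The function name `constraint` and the 80% figure for synonymous sites are hypothetical values chosen for illustration; only KA/Ks = 0.23 and the 10% adaptive fraction come from the notes. Rates are expressed in units of Ks.

```python
def constraint(observed_rate, neutral_rate):
    """Constraint = 1 - (observed rate) / (neutral rate)."""
    return 1 - observed_rate / neutral_rate


KA_OVER_KS = 0.23  # human-chimp average from the notes

# Baseline: Ks is the neutral rate and every non-synonymous substitution is neutral.
baseline = constraint(KA_OVER_KS, 1.0)

# Assumption 1 violated: suppose (hypothetically) Ks is only 80% of the true
# neutral rate, so the true neutral rate is 1 / 0.8 in units of Ks.
syn_constrained = constraint(KA_OVER_KS, 1.0 / 0.8)

# Assumption 2 violated: 10% of non-synonymous substitutions are adaptive,
# so only 90% of the observed rate reflects effectively neutral changes.
adaptive = constraint(KA_OVER_KS * (1 - 0.10), 1.0)

print(f"baseline constraint:        {baseline:.3f}")
print(f"synonymous sites constrained: {syn_constrained:.3f}")
print(f"10% adaptive substitutions:   {adaptive:.3f}")
```

In both alternative scenarios the computed constraint is higher than the baseline of 0.77, which is the sense in which the simple estimate under-estimates constraint.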

Related notes for EHJ352H1
