Genetics Lecture 11 Notes
Uniqueness of DNA sequence:
In a short space, DNA can store a lot of information
o How can we see this properly?
o How long does a DNA sequence have to be for us to expect that it is
unique in a genome?
o That is, how long before it has a high probability of occurring only once?
o Basis for computer searches.
If A = T = G = C (all four bases equally frequent), how many sequence permutations are there for a sequence 1 base long?
o It has equal probability of being A, T, G, or C, therefore 4 permutations
o ¼ chance that the base is A
o Chance of particular sequence = 1/sequence permutations
If the sequence is 2 bases long, what is the chance that it has the sequence GT?
o (1/4)^2 = 1/16
o In the case of A = T = G = C
o 1/sequence permutations = (1/4)^n (n = length)
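The rule above can be checked with a short Python sketch. The function name sequence_probability is made up for illustration, and the calculation assumes A = T = G = C, as in the notes.

```python
from fractions import Fraction


def sequence_probability(length: int) -> Fraction:
    """Chance that a random stretch of `length` bases equals one
    particular sequence, assuming all four bases are equally likely."""
    if length < 0:
        raise ValueError("length must be non-negative")
    # 4**length possible sequences, each equally likely
    return Fraction(1, 4) ** length


print(sequence_probability(1))  # 1/4  (one base, e.g. A)
print(sequence_probability(2))  # 1/16 (two bases, e.g. GT)
```

Exact fractions are used so the result reads the same way as the hand calculation.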
How many total positions are there of that length within
the genome? How many positions are there within genomes?
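A rough sketch of the counting step, in Python. The genome length of 3,000,000,000 bases is an assumed round figure (roughly human-sized) and only one strand is counted; the function names are hypothetical. It counts the start positions for a sequence of length n, multiplies by (1/4)^n to get the expected number of chance occurrences, and finds the shortest length at which that expectation drops to 1 or below.

```python
def positions_in_genome(genome_length: int, seq_length: int) -> int:
    """Number of start positions where a sequence of seq_length fits."""
    return max(genome_length - seq_length + 1, 0)


def expected_occurrences(genome_length: int, seq_length: int) -> float:
    """Expected chance occurrences of one particular sequence,
    assuming A = T = G = C and counting a single strand."""
    return positions_in_genome(genome_length, seq_length) * 0.25 ** seq_length


def shortest_expected_unique_length(genome_length: int) -> int:
    """Smallest length whose expected number of occurrences is <= 1."""
    n = 1
    while expected_occurrences(genome_length, n) > 1:
        n += 1
    return n


GENOME_LENGTH = 3_000_000_000  # assumed value, for illustration only
print(positions_in_genome(GENOME_LENGTH, 16))
print(shortest_expected_unique_length(GENOME_LENGTH))  # 16
```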
e^-381 is the chance that this sequence does not occur in the genome.
If a sequence is long enough, there is a good chance that you wouldn’t
find it within a genome: the longer the sequence, the smaller the chance
you will find it in the genome.
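The e^-x form suggests a Poisson-style approximation, in which the chance of zero occurrences is e raised to minus the expected number of occurrences. That reading is an assumption, as are the function names in this Python sketch; the values 1 and 0.01 are illustrative, and only 381 comes from the notes.

```python
import math


def chance_absent(expected_count: float) -> float:
    """Poisson chance of zero occurrences: e^(-expected_count)."""
    return math.exp(-expected_count)


def chance_present(expected_count: float) -> float:
    """Chance of at least one occurrence: 1 - e^(-expected_count).
    expm1 keeps precision when expected_count is very small."""
    return -math.expm1(-expected_count)


for expected in (381.0, 1.0, 0.01):
    print(expected, chance_absent(expected), chance_present(expected))
```

The smaller the expected count (that is, the longer the sequence), the closer the chance of absence gets to 1.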
Independent events, so just multiply the probabilities
of each event occurring (0.37^2)
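The multiplication can be written out as a small Python sketch. The notes do not say which two events the 0.37 values belong to; 0.37 is close to e^-1, but treating it that way is an assumption. The helper name joint_probability is hypothetical.

```python
import math


def joint_probability(probabilities):
    """Chance that all of several independent events occur:
    the product of their individual probabilities."""
    return math.prod(probabilities)


p_event = 0.37  # value taken from the notes; roughly e**-1
print(round(joint_probability([p_event, p_event]), 4))  # 0.1369
```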
Short sequences of 50 bases are not expected to occur at random in the genome
Therefore, if you take a 50 base sequence from a mouse and find that it
exists in humans, this is a significant, non-random find
o We would not expect the same sequence to be in both a human & a mouse; if
it is, this is a non-random find and the 2 sequences are related
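A minimal sketch of how a computer search for shared sequences might work, in Python. This is an illustrative exact-match approach, not a method named in the lecture; the function name shared_kmers and the toy sequences are made up, and k is kept small so the example is readable (the notes' case would use k = 50).

```python
def shared_kmers(seq_a: str, seq_b: str, k: int) -> list:
    """Return the sorted k-base stretches that occur in both sequences."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Index every k-base window of the first sequence
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    # Keep each window of the second sequence that is also in the index
    found = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)
             if seq_b[i:i + k] in kmers_a}
    return sorted(found)


# Toy sequences, for illustration only
print(shared_kmers("ACGTACGT", "TTACGTAA", 4))  # ['ACGT', 'CGTA', 'TACG']
```

With k = 50 and real genomes, any stretch this finds would be very unlikely to be a chance match, which is the point the notes make about related sequences.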