Biology 2581B Lecture 19: Lecture 19 genetics 2581

Western University
Biology 2581B
David R Smith

Lecture 19

• Personalized genomics in healthcare is going to be hugely influenced by bioinformatics
• It is really easy to get sequence reads, but each read is only a small piece of the genome
• We have all this data, but how do we put it together?
• You start by looking for overlaps between reads to put them together
• If 25 nucleotides match, it is a pretty good bet that the two reads belong together
  o There are 4^n possible sequences of length n, so a match that long is very unlikely to happen by chance
• But now we have repeats in there
• Reads from different parts of the genome can get put together because they share identical repeats
  (Figure: sequencing reads)
• So what do you need?
  o You need another read that spans the whole repeat and anchors one of the original reads
  o But in many genomes the repeats are so long that you wouldn't get a read that spans them
  o Which is why there are sections that just haven't been assembled
• To put the reads together, the computer looks for overlaps
• As the sequencing reads come off the machine, some are good and some are bad
• The key is to find sequences that span the repeat
• Algorithms are used to evaluate how good the reads are
• BLAST - the tool you use to figure out what you're looking at
  o It searches your sequence against a database of known DNA
• BLASTN - compares nucleotides with nucleotides
• TBLASTX - takes an unknown nucleotide sequence
  o Translates it in all 6 reading frames, then searches those against a database that has been translated the same way
  o I.e. every nucleotide sequence in the database is also converted to its 6 frames
• BLASTP - protein vs protein (amino acid sequences)
• Example: we have an unknown sequence and we are going to TBLASTX it, since there's a greater chance that it is protein coding
• Looking at the hits from the database, it hits the COX1 genes
• The coverage is good, meaning our search query is covered almost completely by the hits
• The types of hits look consistent - they're all the same thing, and the score is high
  o When you BLAST you're given scores like E-values, etc.
  o How similar is your unknown to the hits?
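The overlap step in the assembly notes above can be sketched in a few lines of Python. This is only an illustration, not the algorithm a real assembler uses: the function names (`overlap_length`, `merge_reads`, `chance_match_probability`) are made up for this example, it only accepts exact matches, and the chance-match estimate assumes all four bases are equally likely and independent.

```python
from typing import Optional


def overlap_length(left: str, right: str, min_len: int = 25) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when the longest exact overlap is shorter than `min_len`.
    """
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first and shrink until one matches.
    for n in range(max_len, min_len - 1, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


def merge_reads(left: str, right: str, min_len: int = 25) -> Optional[str]:
    """Join two reads at their overlap, or return None if they do not overlap."""
    n = overlap_length(left, right, min_len)
    if n == 0:
        return None
    # Keep the shared region once, then append the rest of the right read.
    return left + right[n:]


def chance_match_probability(n: int) -> float:
    """Probability that two random n-mers are identical: 1 / 4**n.

    Assumption: bases are equally likely and independent (not true of real genomes).
    """
    return 0.25 ** n


if __name__ == "__main__":
    # Toy reads with a 4-base overlap (min_len lowered so the example stays short).
    print(merge_reads("ACGTTGCA", "TGCAGGTC", min_len=4))
    print(chance_match_probability(25))
```

Under the random-sequence assumption a 25-nucleotide match is about a 1 in 4^25 event, which is why it is "a pretty good bet". Repeats break that assumption: two reads that both end inside the same repeat overlap perfectly even when they come from different places in the genome.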
• If the types of hits are all over the place and not consistent, then the result doesn't make sense
  o If the score is low, then the percent identity between the unknown and the hits isn't very good
• But maybe it was never a protein-coding sequence, so we shouldn't use its translation to search - use BLASTN instead
  o You can find consistent types of hits
• Then you
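To make the "translate all 6 frames" idea behind TBLASTX concrete, here is a minimal Python sketch of six-frame translation. It is not BLAST's own code: the function names are invented for this example, and it assumes the standard nuclear genetic code. Mitochondrial genes such as COX1 are translated with variant codes in real searches, so treat the table below as a simplification.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
# Assumption: standard nuclear code; mitochondrial codes differ at a few codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(seq: str) -> dict:
    """Translate three frames on the given strand and three on its reverse complement."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translations("ATGGCCTAA").items():
        print(name, protein)
```

TBLASTX compares protein translations like these for both the query and the database sequences, which is why it only makes sense when the unknown is likely to be protein coding. For a non-coding sequence the translations are meaningless and a nucleotide search (BLASTN) is the better choice.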