
BIO314H5 Lecture 10: BIO314 Lab 10 BLAST


University of Toronto Mississauga
George S Espie

BIO314 Lab 10 – Introductory Bioinformatics (Part 1)

1 – Databases & other tools
2 – Genetic screening using BLAST queries and alignments

Section 1 – Some important resources – databases, tools, etc.

Nucleotide & Genomic Databases:
NCBI Nucleotide (GenBank, RefSeq, etc.)
Ensembl Genome Browser (EMBL)
Online Mendelian Inheritance in Man (genes & genetic disorders)
SNPedia (human genetic variations)
FlyBase (Drosophila vinegar fly model genome)
XenBase (Xenopus frog model genome)
WormBase (Caenorhabditis nematode model genome)

Protein & Proteomics Databases:
UniProt (sequences, alignment, supporting data, etc.)
World-wide Protein Databank (structures)
ProteomeScout (processed proteomic datasets)
GelMap (proteins ID'd on 2D gels)

Expression & Microarray Databases:
ArrayExpress
Gene Expression Omnibus (NCBI GEO)
Genevestigator Expression Search Engine

Pathways and Metabolic Databases:
Kyoto Encyclopedia of Genes and Genomes (KEGG)
Comparative Toxicogenomics Database (interactions between toxins and genes)
Reactome (reactions, pathways, bioprocesses)
Enzyme Portal (small molecule chem, pathways, etc.)

Taxonomic Databases:
BOLD Systems (DNA Barcoding)
USDA PLANTS Database (taxonomy & species ranges in NA, native vs. non-native, benign vs. invasive, identification keys, etc.)

Meta Databases:
NCBI Entrez

When beginning work on a particular research question, in addition to a survey of the literature it is often helpful to examine what other relevant information and pre-existing experimental data are available from various databases. Generally speaking, it is easiest to begin your search in a meta database, i.e. a database that itself calls up information from many other, more subject-specific, databases. As an example of how one might do this, we will do a simple query using the meta database Entrez from the American National Center for Biotechnology Information, or NCBI.

1) Navigate to the NCBI Entrez main page using the link on the previous page.
Below the search bar, you will see the names of all the databases that Entrez searches and a short description of what each contains.
2) Search the term Marfan Syndrome. In the space below the search bar, the number of results will appear next to each of the databases that Entrez draws from.
3) Click on the OMIM link. Note how only the second search result here is actually about Marfan Syndrome (the first concerns another disorder with some similarities in phenotypic outcome). Using these databases is similar to using internet search engines: you will often sweep up a fair amount of irrelevant information along with what you are interested in.
4) Click on the Marfan Syndrome OMIM result, and have a quick glance at the report on this heritable connective tissue disorder. Note its well-developed summary of the primary literature concerning this disorder. Navigate back to the Entrez search results page and spend some time examining other pages.

Section 2 - BLAST queries and alignments & basic genetic testing

2.1 Introduction

Comparing nucleotide or protein sequences from the same or different organisms is a very powerful tool in molecular biology. By finding similarities in sequences, we can infer the function of newly sequenced genes, predict new members of gene families, and explore evolutionary relationships. Now that whole genomes are being sequenced, sequence similarity searching can even be used to predict the location and function of protein-coding and transcription-regulation regions in genomic DNA.

The Basic Local Alignment Search Tool, or BLAST, is perhaps the tool used most frequently for basic calculations of sequence similarity. There are a number of variations of BLAST for use with different query sequences against different databases. Most people use BLAST by entering a nucleotide or protein sequence into the textbox on one of the BLAST web interfaces hosted by the NCBI and submitting it as a query against all (or a subset of) public sequence databases.
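
One of the BLAST variations described in this section takes a DNA query, translates it in all six reading frames, and compares the results against a protein database. The following is a minimal sketch of what that translation step involves, assuming the standard genetic code and an unambiguous A/C/G/T sequence. It is an illustration only, not BLAST's own implementation, and the function names (reverse_complement, translate, six_frame_translations) are hypothetical helpers chosen for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (hypothetical helper)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon.

    Codons containing characters other than A, C, G, T become 'X'.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels (+1, +2, +3, -1, -2, -3) to translated sequences."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    # Toy sequence for illustration; not taken from any real gene.
    for label, protein in six_frame_translations("ATGGCCTAA").items():
        print(label, protein)
```

Three frames come from the query strand as written and three from its reverse complement, which is why a single DNA query yields six candidate protein sequences to compare.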
The search is performed on the NCBI databases and servers, and after a processing delay, the results appear in the user's browser in the chosen display format. Many biotechnology companies, genome scientists, and bioinformatics personnel also use a “stand-alone” version of BLAST to query their own local databases. They may also customize BLAST in some way to make it better suit their needs.

There are a number of BLAST variations for different kinds of sequence comparisons, e.g., a DNA query to a DNA database, a protein query to a protein database, and a DNA query, translated in all six reading frames, to a protein sequence database. There are even more advanced versions of BLAST for special queries, such as PSI-BLAST (for iterative protein sequence similarity searches using a position-specific score matrix) and RPS-BLAST (for searching for protein domains in the Conserved Domains Database).

2.2 BRCA1 Mutations, Breast Cancer, and Genetic Testing

There are a number of mechanisms in the body to repair cellular DNA damage, each involving sets of proteins that repair a different type of damage, such as single-strand insertion errors, single- and double-strand breaks, and nucleotide dimerization. If such damage is not repaired, one potential outcome is the development of the cell into pre-cancerous tissue as it divides, and ultimately into cancerous tissue as more mutations accumulate. Consequently, proteins involved in these processes are categorized as tumor suppressors, because their functioning helps prevent the formation of tumors. BRCA1, “breast cancer associated, early onset 1”, is one such tumor suppressor.

The lifetime risk of developing breast cancer for women is 12%, meaning 120 women in every 1000 will develop the disease at some point in their lives. Inherited mutations to the BRCA genes are found in 5% of all breast cancer cases.
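
The two percentages just quoted can be combined into expected counts. The sketch below is a worked arithmetic check only; it assumes the 5% figure applies uniformly to the 120 expected cases, and the function name expected_cases is a hypothetical helper. Exact fractions are used so that the result is not affected by floating-point rounding.

```python
from fractions import Fraction


def expected_cases(cohort_size, lifetime_risk_pct, inherited_brca_pct):
    """Return (all expected cases, cases with an inherited BRCA mutation).

    Percentages are given as whole numbers, e.g. 12 for 12%.
    """
    total = cohort_size * Fraction(lifetime_risk_pct, 100)
    inherited = total * Fraction(inherited_brca_pct, 100)
    return total, inherited


if __name__ == "__main__":
    total, inherited = expected_cases(1000, 12, 5)
    print(total, inherited)  # 120 6
```

Under these assumptions, about 6 of every 1000 women would be expected to develop breast cancer associated with an inherited BRCA mutation.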
When a woman inherits a cancer-causing BRCA mutation, her lifetime risk of developing breast cancer rises to as much as 85%, depending on the exact mutation. Furthermore, mutations to BRCA genes also increase the risk of ovarian cancer from the population average of 2% to between 16 and 60%.

The BRCA1 gene has 22 exons and spans ~110 kb of DNA at the autosomal locus 17q21. It encodes a nuclear phosphoprotein that combines with other tumor suppressors, DNA damage sensors, and signal transducers to form a large multi-subunit protein complex known as the BRCA1-associated genome surveillance complex (BASC). In particular, the BRCA1 protein associates with RNA polymerase II and, through its C-terminal domain, also interacts with histone deacetylase complexes. This protein therefore plays a role in transcription as well as in the repair of double-strand DNA breaks and in recombination. Alternative splicing plays an important role in the subcellular localization and physiological function of this gene. Many alternatively spliced variants have been described for BRCA1, some