BINF 511 Lecture Notes - Lecture 6: Mollusca, Puffy Amiyumi, Pilot Experiment

Lecture 6C: Bioinformatics in microbiological food quality and safety
Guest lecturer: Jennifer Ronholm
February 14, 2018
Foodborne bacteria genetics problems/questions
Whole genome sequencing (WGS) is now being used in outbreak delineation
o Outbreak delineation requires an answer to a single question: are these two strains the
same?
o How do we define if two strains are the same using WGS?
o How different can two genomes be before they're not the same anymore?
16S rRNA targeted amplicon sequencing is commonly used to characterize the microbiome of
environmental samples
o How does the microbiome of our food affect its quality and safety?
o Can we manipulate the microbiome of food/food animals to improve quality and safety?
Metagenome assembly
o Culture-independent diagnostics is turning to the metagenome
o When there are closely related phyla in a sample, how can we assign the right genes to
the right genomes? It's hard to assemble perfectly.
Outbreak delineation -> need to answer the question "are these two isolates the same?"
Subtyping
o Outbreak delineation requires that the etiological agent be subtyped at a higher
resolution than species to determine if the isolates from infected people, food, and food
source are the same
Who got sick first and why?
o Current/historical ways of doing this:
Serotyping
Pulsed-field gel electrophoresis (PFGE)
Problem: there is only a finite number of banding patterns that can appear on the
gel and be told apart by eye
Multilocus sequence typing (MLST)
o None of these techniques uses the whole genome; therefore, none of them
has the same resolution as WGS
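A toy sketch of why typing only a few loci gives lower resolution than comparing whole genomes. The two sequences and the locus coordinates are made up for illustration (they are not real MLST loci), and the genomes are assumed to be already aligned to the same length.

```python
# Toy illustration: locus-based typing vs whole-genome comparison.
# Sequences and locus coordinates are hypothetical, chosen only for the example.

def locus_profile(genome, loci):
    """Return the sequence at each typed locus (a stand-in for an allele profile)."""
    return tuple(genome[start:end] for start, end in loci)

def whole_genome_differences(a, b):
    """Count differing positions between two equal-length, already aligned genomes."""
    if len(a) != len(b):
        raise ValueError("genomes must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)

isolate_a = "ATGCCGTAAGCTTGACCGTA"
isolate_b = "ATGCCGTAAGCTAGACCGTA"  # differs from isolate_a at index 12 only
typed_loci = [(0, 4), (15, 20)]      # the only regions the locus-based method looks at

same_profile = locus_profile(isolate_a, typed_loci) == locus_profile(isolate_b, typed_loci)
n_diff = whole_genome_differences(isolate_a, isolate_b)
print("same locus profile:", same_profile)
print("whole-genome differences:", n_diff)
```

The difference sits outside the typed loci, so the locus-based comparison calls the isolates identical while the whole-genome comparison still sees it.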
Example: cholera outbreak in Haiti in 2010
o Two were tested by PFGE, but there were many matches (7 countries got matched)
o This was not a conclusive test
Whole genome sequencing
How to solve an outbreak using WGS?
o 1. Sequence
Foodborne pathogens have genomes which average about 5 Mb each
The MiSeq platform is sufficient to sequence an isolate to draft status
This will leave some base pairs un-sequenced
o 2. Assemble (there are many available software platforms, but there are 2 types of
assemblies)
Reference-guided assembly
Maps short sequence reads by assessing the placement of each read
against the reference genome and calculating the probability of its match with
the reference; these alignments are then used to construct a novel consensus
sequence for the sequence data
Putting puzzle pieces together roughly with the image in mind
Problem/major con: inherent bias because you're referring to a
picture, even though it may be wrong
Other major cons: data is missing (i.e. reads that don't have an analog in
the reference), repetitive regions are hard to place, cannot make synteny conclusions
Major pros: fast (not memory intensive), readily calls SNPs, provides
more data with less coverage
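A minimal sketch of the reference-guided idea, under simplifying assumptions: each read is placed at the position with the fewest mismatches (a plain Hamming-distance scan, not the probabilistic scoring that real read mappers use), and the consensus is a majority vote per position. The function names, the `max_mismatches` parameter, and the toy sequences are all hypothetical.

```python
from collections import Counter

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(1 for x, y in zip(a, b) if x != y)

def best_placement(read, reference, max_mismatches=1):
    """Return the start position where the read fits the reference best, or None."""
    best_pos, best_mm = None, max_mismatches + 1
    for pos in range(len(reference) - len(read) + 1):
        mm = hamming(read, reference[pos:pos + len(read)])
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos

def consensus(reads, reference, max_mismatches=1):
    """Pile reads onto the reference and take the majority base at each position."""
    pileup = [Counter() for _ in reference]
    unmapped = []
    for read in reads:
        pos = best_placement(read, reference, max_mismatches)
        if pos is None:
            unmapped.append(read)  # no analog in the reference: this data is lost
            continue
        for offset, base in enumerate(read):
            pileup[pos + offset][base] += 1
    # Positions with no read support are reported as "N"
    seq = "".join(col.most_common(1)[0][0] if col else "N" for col in pileup)
    return seq, unmapped

reference = "ATGGCGTACCTGAAGT"
# Reads from a sample that carries A instead of T at index 6, plus one foreign read
reads = ["ATGGCGAA", "CGAACCTG", "CCTGAAGT", "TTTTTTTT"]
seq, unmapped = consensus(reads, reference)
print(seq)
print(unmapped)
```

The consensus picks up the sample's base at index 6, while the read with no analog in the reference is simply dropped, which is the "data is missing" con listed above.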
De novo assembly
Assembles genomes based on sequence overlap without the use of a
reference genome
You don't know what it's supposed to look like, so it's hard to put the
puzzle pieces together
Major pros: unbiased, maps all reads
Uses graphs (algorithms) to assemble the data
String graphs: representation of all overlaps between reads, fast
to construct for exact matches
De Bruijn graphs: extremely popular model of assembly, fast to
construct
Major cons: harder to call SNPs, more memory intensive, takes much
longer than reference-guided assembly, cannot map repetitive sequences, more
contigs (on average)
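A small sketch of the De Bruijn graph idea, assuming error-free reads and a toy genome with no repeated (k-1)-mers. Nodes are (k-1)-mers and every k-mer seen in a read adds an edge from its prefix to its suffix; a contig is then read off by walking while there is exactly one way forward. The function names, reads, and choice of `k` are illustrative only, and real assemblers do much more (error correction, coverage tracking, branch resolution).

```python
from collections import defaultdict

def de_bruijn_graph(reads, k):
    """Nodes are (k-1)-mers; each distinct k-mer adds one edge prefix -> suffix."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def start_nodes(graph):
    """Nodes that have outgoing edges but no incoming edge."""
    targets = {t for succ in graph.values() for t in succ}
    return sorted(node for node in graph if node not in targets)

def walk_unambiguous_path(graph, start):
    """Extend a contig from `start` while exactly one successor exists."""
    contig, node, visited = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in visited:  # stop instead of looping around a cycle
            break
        contig += nxt[-1]
        visited.add(nxt)
        node = nxt
    return contig

# Overlapping toy reads; no reference genome is used anywhere
reads = ["ATGGCGTA", "GCGTACCT", "TACCTGA"]
graph = de_bruijn_graph(reads, k=4)
starts = start_nodes(graph)
contig = walk_unambiguous_path(graph, starts[0])
print(starts, contig)
```

In this sketch a repeated (k-1)-mer would give a node more than one successor, and the walk would stop there. That is one way to see why repetitive sequences break a de novo assembly into more contigs.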
Things that make short read assembly difficult
Repetitive sequences
High heterozygosity
Low coverage - coverage is the number of short reads that sequence each bp; you want 30-50x
(below 30x you may not be able to see errors)
Biased sequencing
High error rate
Chimeric reads - too much loaded on your chip, so one read joins two parts of the genome
Sequencing adapters in the reads
Sample contamination
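The 30-50x coverage target can be turned into a rough read count with the usual back-of-the-envelope relation: mean coverage = (number of reads x read length) / genome size. The sketch below uses the ~5 Mb genome size from these notes; the 250 bp read length is an assumption for illustration and was not given in the lecture.

```python
import math

def mean_coverage(n_reads, read_length, genome_size):
    """Average number of reads covering each base."""
    return n_reads * read_length / genome_size

def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)

GENOME_SIZE = 5_000_000  # ~5 Mb, as stated in the notes
READ_LENGTH = 250        # assumed read length, not from the lecture

low = reads_needed(30, READ_LENGTH, GENOME_SIZE)
high = reads_needed(50, READ_LENGTH, GENOME_SIZE)
print(f"30x needs about {low:,} reads; 50x needs about {high:,} reads")
```

This is only the mean: coverage is uneven in practice (biased sequencing, repeats), so some positions will sit well below the average.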
o 3. Archive (PulseNet, NCBI, EMBL)
o 4. Analyze
Identify virulence factors/antimicrobial resistance factors
Generate in silico results for most of the traditional tests: serotyping, MLST,
PFGE
Determine if two strains are the same or not using SNPs (reference-guided, de
novo)
Apps are now available to help analyze WGS for outbreak delineation and strain
characterization
To answer the question "are two strains the same or not?", we need to call SNPs
SNP = single nucleotide polymorphism
How do you call SNPs?
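One simple way to picture SNP calling, as a sketch rather than the method covered in the lecture: stack the read bases observed at each reference position (a pileup), call a SNP where a well-supported majority base differs from the reference, then count the SNP differences between two isolates. The thresholds `min_depth` and `min_fraction`, the function names, and the toy pileups are assumptions for illustration. No cutoff for "same strain" is implied here.

```python
from collections import Counter

def call_snps(reference, pileup, min_depth=3, min_fraction=0.9):
    """pileup[i] is a string of the read bases observed at reference position i."""
    snps = {}
    for pos, (ref_base, bases) in enumerate(zip(reference, pileup)):
        if len(bases) < min_depth:
            continue  # too little coverage to trust a call
        base, count = Counter(bases).most_common(1)[0]
        if base != ref_base and count / len(bases) >= min_fraction:
            snps[pos] = base
    return snps

def snp_distance(snps_a, snps_b):
    """Number of positions where two isolates' SNP calls disagree.

    Simplification: a position without a call is treated as matching the reference.
    """
    positions = set(snps_a) | set(snps_b)
    return sum(1 for p in positions if snps_a.get(p) != snps_b.get(p))

reference = "ACGTAC"
pileup_a = ["AAAA", "CCCC", "TTTT", "TTTT", "AAAA", "CC"]    # last position has low depth
pileup_b = ["AAAA", "CCCC", "TTTT", "TTTT", "GGGA", "CCCC"]  # index 4: G support too weak
pileup_c = ["AAAA", "GGGG", "GGGG", "TTTT", "AAAA", "CCCC"]

snps_a = call_snps(reference, pileup_a)
snps_b = call_snps(reference, pileup_b)
snps_c = call_snps(reference, pileup_c)
print(snps_a, snps_b, snps_c)
print(snp_distance(snps_a, snps_b), snp_distance(snps_a, snps_c))
```

Isolates a and b end up with the same SNP call, so their distance is 0, while a and c differ at two positions. Low depth and mixed bases suppress calls, which is why coverage and error rate matter for this step.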