PSY274H5 Lecture Notes - Bigram, Information Extraction, Dual-Tone Multi-Frequency Signaling

2 Oct 2012
Natural language processing
Two areas:
1. Comprehension: extracting information from language
2. Generation: process of conveying information using language
2 motivations:
Automated systems that perform language-related tasks like phone-operated
Understanding how language comprehension and generation occur in humans
3 methods used in natural language processing:
Statistical methods: use large corpora of data to compute statistical properties such
as word occurrence and sequence formation.
A bigram statistic captures the probability of a word with certain properties
following another word with other properties
Bigram statistics can be used to identify part-of-speech labels like nouns and verbs.
Statistical methods are also used for semantic disambiguation and structural disambiguation.
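A minimal sketch of a bigram statistic, assuming the simplest case where the "property" of each word is just the word itself and probabilities are estimated by relative frequency. The function name bigram_probabilities and the toy corpus are made up for illustration and are not from the lecture.

```python
from collections import Counter, defaultdict

def bigram_probabilities(tokens):
    """Estimate P(next word | previous word) by relative frequency."""
    # Count each adjacent pair (previous, next) in the token sequence.
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Count how often each word appears as the first element of a pair.
    first_counts = Counter(tokens[:-1])
    probs = defaultdict(dict)
    for (prev, nxt), count in pair_counts.items():
        probs[prev][nxt] = count / first_counts[prev]
    return probs

# Toy corpus, invented for illustration only.
corpus = "the dog barks the dog sleeps the cat sleeps".split()
model = bigram_probabilities(corpus)

# "the" is followed by "dog" twice and by "cat" once.
print(model["the"])
```

The same counting works over part-of-speech tags instead of words, which is closer to the "word with certain properties" wording above. A real system would need a much larger corpus.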
Structural/pattern-based methods: define structural properties of language, for
example with formal grammars.
They use lexical, syntactic and semantic information to match sentence fragments.
They provide structural models that help with detailed analysis of linguistic
phenomena. But they can't always rely on automatic training, so they often have
to rely on hand-constructed rules.
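A small sketch of a hand-constructed pattern matched against sentence fragments. The lexicon, the tag names and the determiner-noun-verb pattern are all assumptions chosen for illustration. They are not a grammar from the lecture.

```python
# Hypothetical hand-written lexicon mapping words to syntactic categories.
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN",
    "barks": "VERB", "sleeps": "VERB",
}

# Hypothetical hand-written pattern: determiner, then noun, then verb.
PATTERN = ["DET", "NOUN", "VERB"]

def match_fragments(tokens):
    """Return every run of tokens whose categories match PATTERN."""
    tags = [LEXICON.get(token, "UNK") for token in tokens]
    n = len(PATTERN)
    return [
        tokens[i:i + n]
        for i in range(len(tokens) - n + 1)
        if tags[i:i + n] == PATTERN
    ]

fragments = match_fragments("the dog barks and a cat sleeps".split())
print(fragments)
```

Every rule here is written by hand, which illustrates the limitation noted above: nothing is learned from data.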
Reasoning-based approach: encode knowledge and reasoning processes and use
these to interpret language.
Interpretation of language highly depends on the context in which language
The advantage is that it provides a mechanism for contextual interpretation of language.
The disadvantage is the complexity of the models required to define a conversational agent.
Information extraction and retrieval: analyze information automatically and develop
methods of retrieval.
An example is web pages that help us find information.
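One common way to make retrieval automatic is an inverted index, sketched below. This is an assumed technique for illustration, not one named in the lecture, and the document names and texts are invented.

```python
from collections import defaultdict

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Invented toy documents.
docs = {
    "page1": "bigram statistics for language",
    "page2": "machine translation of language",
    "page3": "touch tone interfaces",
}
index = build_index(docs)
print(search(index, "machine language"))
```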
Machine translation: automatic translation of text and speech.
Such systems provide automated dictionary and translation aids.
Human-machine interfaces: a prime area for commercial application, where
touch-tone interfaces are replaced with speech-driven language interfaces.