POLI 210 Lecture Notes - Lecture 20: Named-Entity Recognition

Document Summary

POLI 210, Lecture 20: Content Analysis II, Prof. Aaron Erlich. Often, we want to discover meaning from texts. The internet and other sources have produced an astronomical amount of text, and we want to turn that text into data. Most techniques now use a combination of approaches (supervised, unsupervised, dictionary, regression); which one to use depends on the question you are asking. We are going to look at an application to get a better sense of what this looks like. We want to identify elements of Donald Trump's speech. The problem is there is also a campaign. We need to pre-process the data to strip out unhelpful information and standardize the text. Many dictionary approaches also have to do some of these steps. Bag of words: get rid of structure and just use the words. N-grams (bi-grams, tri-grams): use combinations of words. Named entity recognition: use powerful machine learning algorithms to extract named entities. Term frequency / term frequency-inverse document frequency.
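
The pre-processing, bag-of-words, n-gram, and TF-IDF steps named in the summary can be sketched in a few lines of standard-library Python. This is a minimal illustration and not the method used in the lecture: the stop-word list, the function names, and the two toy sentences are all invented for the example, and the TF-IDF weighting shown (relative term frequency times the natural log of inverse document frequency) is only one of several common variants. Named entity recognition is left out because it needs a trained model.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "to", "of", "and"}


def preprocess(text):
    """Standardize text: lowercase, drop punctuation, remove stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens):
    """Discard word order and keep only word counts."""
    return Counter(tokens)


def ngrams(tokens, n):
    """Return consecutive n-word combinations (n=2 gives bi-grams)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """docs is a list of token lists; returns one {term: weight} dict per doc."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Invented toy documents, not taken from the lecture.
raw_docs = [
    "The senator gave a speech.",
    "The senator won the campaign.",
]
tokenized = [preprocess(d) for d in raw_docs]
counts = bag_of_words(tokenized[0])
bigrams = ngrams(tokenized[0], 2)
weights = tf_idf(tokenized)
```

In this toy corpus, "senator" appears in both documents, so its TF-IDF weight is zero, while words unique to one document ("speech", "campaign") get positive weights. That is the point of the inverse document frequency term: it down-weights words that do not distinguish one text from another.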
