MIS 373 Lecture Notes - Lecture 7: Recommender System, Unstructured Data, Crowdsourcing
Document Summary
Example documents, each labeled with its topic(s):

1. I like to eat broccoli and bananas. (food)
2. I ate a banana and spinach smoothie for breakfast. (food)
3. Chinchillas and kittens are cute. (cute pets)
4. My sister adopted a kitten yesterday. (cute pets)
5. Look at this cute hamster munching on a piece of broccoli. (cute pets & food)

Topics are latent (hidden): they have to be discovered from the words the docs contain.

Topic A: 30% broccoli, 15% bananas, 10% breakfast, 10% munching, ...
Topic B: 20% chinchillas, 20% kittens, 20% cute, 15% hamster, ...

Documents 1 and 2: 100% topic A.
Documents 3 and 4: 100% topic B.
Document 5: 60% topic A, 40% topic B.

For each document, the author decided on ... (e.g., a document may have 1/3 ...). The author then generated each word in the document by:

1. Choosing a topic (according to the distribution above).
2. Generating a word (according to the topic's distribution). E.g., for topic A, generate the word broccoli with 30% probability, ...
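The word-by-word generative process described above can be sketched in Python. This is only an illustration, not code from the lecture. It uses the topic A and topic B percentages and the 60%/40% mixture of document 5 from the notes. Because the word lists in the notes are truncated, the unlisted remainder of each topic is lumped into a placeholder token `<other>`, which is an assumption. The names `TOPICS`, `DOC5_MIXTURE`, `sample`, and `generate_document` are hypothetical.

```python
import random

# Word probabilities as given in the notes. The lists there are truncated,
# so the unlisted remainder is lumped into a placeholder token (assumption).
OTHER = "<other>"
TOPICS = {
    "A": {"broccoli": 0.30, "bananas": 0.15, "breakfast": 0.10, "munching": 0.10},
    "B": {"chinchillas": 0.20, "kittens": 0.20, "cute": 0.20, "hamster": 0.15},
}
for dist in TOPICS.values():
    dist[OTHER] = round(1.0 - sum(dist.values()), 10)

# Topic mixture of document 5 in the notes: 60% topic A, 40% topic B.
DOC5_MIXTURE = {"A": 0.60, "B": 0.40}


def sample(dist, rng):
    """Draw one key from a {key: probability} dictionary."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[k] for k in keys], k=1)[0]


def generate_document(mixture, n_words, rng):
    """Return (topic, word) pairs: choose a topic, then a word from it."""
    pairs = []
    for _ in range(n_words):
        topic = sample(mixture, rng)        # step 1: choose a topic
        word = sample(TOPICS[topic], rng)   # step 2: generate a word
        pairs.append((topic, word))
    return pairs


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    for topic, word in generate_document(DOC5_MIXTURE, 10, rng):
        print(topic, word)
```

Passing a mixture such as `{"A": 1.0}` reproduces the case of documents 1 and 2, where every word is drawn from topic A. Discovering the topics from real text runs this process in reverse, which the sketch does not attempt.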