CS100 Lecture Notes - Lecture 10: Polysemy, Financial Institution
Document Summary
What is a graph: some nodes connected by edges. A graph can be used to represent physical and virtual systems.

Indexing, behind the scenes. List of occurrences: build a list of every page that has a specific word and where it is. Punctuation & hyphens (ex. email vs e-mail): the index has to know that these mean the same thing. Stop words: the index doesn't have to create an entry for words like "the". Word variants (ex. sell, sells, selling): treat as one thing. Spelling variants (ex. color vs. colour): treat as the same thing.

Advanced indexing, semantics. Synonymy (ex. big & large): treat as the same thing. Polysemy.

Evil spiders: scraping & stealing content, stealing email addresses.

Phrases with more than 1 word: the search engine looks for all the words separately, sees which page they occur on the most, and gives you a result that is meaningful to you.

Page ranking:
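
The indexing steps in these notes (list of occurrences, hyphens, stop words, word variants, spelling variants) and the separate-word lookup for multi-word queries can be sketched in a few lines of Python. This is only a minimal illustration, not how any real search engine is implemented: the names `normalize`, `build_index`, and `search`, the three sample pages, and the small lookup tables are all assumptions made up for the example, and the tables stand in for the much larger dictionaries a real index would need.

```python
import re
from collections import defaultdict

# Tiny illustrative tables (assumptions for this sketch, not real data).
STOP_WORDS = {"the", "a", "an", "of"}
SPELLING_VARIANTS = {"colour": "color"}
WORD_VARIANTS = {"sells": "sell", "selling": "sell"}


def tokenize(text):
    """Split text into words, keeping hyphenated words such as e-mail whole."""
    return re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", text)


def normalize(token):
    """Map a raw word to its index term, or None if it is a stop word."""
    term = token.lower().replace("-", "")      # E-mail -> email
    if term in STOP_WORDS:
        return None                            # no index entry for "the"
    term = SPELLING_VARIANTS.get(term, term)   # colour -> color
    return WORD_VARIANTS.get(term, term)       # sells, selling -> sell


def build_index(pages):
    """List of occurrences: {term: {page name: [word positions]}}."""
    index = defaultdict(lambda: defaultdict(list))
    for name, text in pages.items():
        for position, token in enumerate(tokenize(text)):
            term = normalize(token)
            if term is not None:
                index[term][name].append(position)
    return index


def search(index, query):
    """Look up each query word separately; pages with the most hits come first."""
    counts = defaultdict(int)
    for token in tokenize(query):
        term = normalize(token)
        if term is None:
            continue
        for name, positions in index.get(term, {}).items():
            counts[name] += len(positions)
    # Sort by descending hit count, then by page name for a stable order.
    return sorted(counts, key=lambda name: (-counts[name], name))


# Hypothetical sample pages.
pages = {
    "page1": "The shop sells e-mail software in every colour.",
    "page2": "Selling email lists is how evil spiders profit.",
    "page3": "Email, email, email: the color of the inbox.",
}
index = build_index(pages)

print(dict(index["email"]))             # {'page1': [3], 'page2': [1], 'page3': [0, 1, 2]}
print(search(index, "selling e-mail"))  # ['page3', 'page1', 'page2']
```

Counting raw occurrences is the simplest possible ordering and matches the "which page it occurs on the most" idea above; it does not attempt synonymy, polysemy, or page ranking.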