CISC483 Lecture Notes - Lecture 5: Data Mining, Dimensionality Reduction


Document Summary

Different departments use different conventions, time periods, primary keys, etc.
Overlay data: data from outside an organization that is needed for data mining and must be integrated with the organization's data.
Aggregation within data: one has to determine how multiple potential entries should be aggregated to provide instances for analysis, e.g. individual long-distance calls vs. total customer minutes to different states vs. total customer minutes.
Dimensionality reduction: when there are more attributes than can be effectively dealt with, the number of attributes considered needs to be reduced.
Denormalization: the process of taking two or more relations and joining them together into a single relation (today's notes).
ARFF files: a standard way of representing machine learning data sets as flat files.
Decision tree learning: a form of inductive learning. Given a collection of examples of the form (x, f(x)), return a function h that approximates f; h is the hypothesis. Trees are built with a greedy, divide-and-conquer approach. Each internal node corresponds to a test of the values of an attribute.
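
The aggregation and denormalization ideas in the summary can be illustrated with a small sketch. Everything here is an assumption made for illustration: the call records, the customer relation, the field names (customer_id, state, minutes, plan), and the helper functions are hypothetical and do not come from the lecture. The sketch aggregates individual long-distance calls at two levels and then joins the customer relation with the totals into one flat relation.

```python
from collections import defaultdict

# Hypothetical call records: one row per individual long-distance call.
calls = [
    {"customer_id": 1, "state": "NY", "minutes": 12},
    {"customer_id": 1, "state": "NY", "minutes": 8},
    {"customer_id": 1, "state": "CA", "minutes": 30},
    {"customer_id": 2, "state": "CA", "minutes": 5},
]

# Hypothetical customer relation, keyed by its primary key customer_id.
customers = {
    1: {"name": "Ann", "plan": "basic"},
    2: {"name": "Bob", "plan": "premium"},
}


def total_minutes_by_state(records):
    """Aggregate calls to one instance per (customer, state) pair."""
    totals = defaultdict(int)
    for r in records:
        totals[(r["customer_id"], r["state"])] += r["minutes"]
    return dict(totals)


def total_minutes(records):
    """Aggregate calls to one instance per customer."""
    totals = defaultdict(int)
    for r in records:
        totals[r["customer_id"]] += r["minutes"]
    return dict(totals)


def denormalize(customer_relation, totals):
    """Join the customer relation with the totals into a single flat relation."""
    return [
        {"customer_id": cid, **attrs, "total_minutes": totals.get(cid, 0)}
        for cid, attrs in sorted(customer_relation.items())
    ]


by_state = total_minutes_by_state(calls)
per_customer = total_minutes(calls)
flat = denormalize(customers, per_customer)
```

Which aggregation level to use depends on what one instance should represent in the analysis: a single call, a customer's minutes to one state, or a customer overall. The flat relation produced at the end is the kind of single table that a flat-file data set format expects.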
