CISC483 Lecture 8: Discretizing Methods (Entropy & Chi-Squared)


Document Summary

Entropy-based discretization uses entropy to split numeric-valued attributes into intervals. It works top-down: start with the whole interval, identify a split point, and then recursively decide whether to split each resulting interval further.

To choose a split, find the weighted entropy for every possible split point and pick the division where the change in entropy is greatest, that is, the one with the smallest weighted entropy. The smallest weighted entropy will never fall between two like values. The split is placed at the midpoint between the two values on either side of it. Keep doing this until you reach a stopping criterion, such as the minimum description length (MDL) principle (we won't be discussing or tested on this).

Entropy-based discretization is likely to produce a discretization that is useful for learning, since it uses class information, and it is one of the best supervised discretization techniques. Its drawback is that it is time consuming.

Chi-squared method: start with n intervals, where every value is its own individual interval. Find chi-squared for every possible pair of intervals (at the start, all values).
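The search for a single entropy-based split point described above can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the function names `entropy` and `best_split` and the sample data are hypothetical. It assumes base-2 entropy and covers only one level of the top-down procedure, with no stopping criterion.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    # p * log2(1/p), summed over the classes that occur
    return sum((c / total) * math.log2(total / c) for c in counts)


def best_split(values, labels):
    """Return (split_point, weighted_entropy) for the best binary split.

    Returns None when no split is possible (all values identical).
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best = None
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # a cut cannot be placed between identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        if best is None or weighted < best[1]:
            # candidate cut is the midpoint between the two neighbouring values
            best = ((lo + hi) / 2, weighted)
    return best


# Hypothetical data: three low values of class "a", three high values of class "b"
print(best_split([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"]))
# prints (6.5, 0.0)
```

To discretize fully, the same function would be applied recursively to the values on each side of the chosen split until a stopping criterion is met.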
