INFO 154 Lecture Notes - Lecture 5: Signal-To-Noise Ratio, Overfitting, Linear Regression
Document Summary
Combining several models is generally better than relying on any one model alone. This rests on the assumption that error is roughly uniformly distributed across the models, even though that is almost never the case for real ML classifiers.

Method 1: Combine predictions through a simple average or another non-trainable combiner; compute the RMSE of each candidate and choose the model (or combination) with the smallest RMSE.

Method 2: Bagging diversifies a model by bootstrapping the training set. Each model in the ensemble is trained on a bootstrapped sample of the data, and the predictions are averaged; when the label is a class, a majority vote is used instead of averaging. With decision trees, each tree is trained on a bootstrapped sample of the data and, additionally, on a random subset of the features from that sample (the random forest idea). Accuracy depends on the strength of the individual classifiers and the diversity of errors among them. Benefit: training is fast, since each tree can be learned in parallel.
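The bagging procedure described above can be sketched in plain Python. This is a minimal illustration, not the exact classifier from lecture: it uses a toy 1-D dataset and simple threshold "stump" classifiers (both assumptions for brevity), but the ensemble logic (bootstrap sampling with replacement, then majority vote because the label is a class) follows the notes.

```python
import random
from collections import Counter

random.seed(0)

# Toy 1-D dataset (assumed for illustration): true label is 1 when x > 5,
# and the observed feature is x plus Gaussian noise.
data = [(x + random.gauss(0, 1.0), int(x > 5)) for x in range(10) for _ in range(5)]

def train_stump(sample):
    """Learn a threshold classifier: predict 1 if feature > t, picking the
    threshold with the fewest training errors."""
    best_t, best_err = None, float("inf")
    for t in (x for x, _ in sample):
        err = sum((x > t) != bool(y) for x, y in sample)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def bootstrap(sample):
    """The bagging step: draw n points from the sample WITH replacement."""
    return [random.choice(sample) for _ in range(len(sample))]

# Each model in the ensemble is trained on its own bootstrapped sample,
# so the stumps can be trained independently (and in parallel).
ensemble = [train_stump(bootstrap(data)) for _ in range(25)]

def predict(x):
    """Majority vote across the ensemble (the label is a class, so we vote
    rather than average)."""
    votes = Counter(int(x > t) for t in ensemble)
    return votes.most_common(1)[0][0]

print(predict(9.0), predict(1.0))
```

Points far from the decision boundary (x = 9 vs x = 1) are classified correctly even though each individual stump is weak; this is the "strength plus diversity" effect the notes mention. A random forest would additionally restrict each tree to a random subset of features, which has no effect here since the toy data has only one feature.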