STAT C100 Lecture Notes - Lecture 13: Feature Engineering, Overfitting, Invertible Matrix

13 Oct 2018
Document Summary

Topics: fitting linear models; regularization and cross-validation; feature engineering (domain knowledge) for linear regression.

Feature engineering turns the raw data into a feature matrix with entirely quantitative values (a one-hot encoding sketch follows this summary). Note: for the inverse in the least-squares solution to exist, the feature matrix needs to be full column rank.

Scikit-learn has a wide range of models, and many of them follow a common pattern:

from sklearn import linear_model
f = linear_model.LinearRegression(fit_intercept=True)
f.fit(train_data[["x"]], train_data["y"])

How can we control overfitting? Through regularization. One proposal: set some weights to 0 to remove features. An L2 (ridge) penalty does not encourage sparsity: it yields small but non-zero weights. An L1 (lasso) penalty does not have an analytic solution, so numerical methods are needed.

Regularization can be stated as a constrained problem: minimize the loss subject to complexity(f) <= beta, where beta is the regularization parameter, so that f(x) is not too complicated. This constrained optimization problem can be non-convex and hard to solve, but there is an equivalent unconstrained formulation (obtained via Lagrangian duality); both forms are written out below.

Larger values of the regularization parameter mean more regularization, hence more bias and less variance. Smaller values mean less regularization, hence greater complexity and a risk of overfitting: the training error might be small but the test error large, a failure to generalize. A larger training set supports more complex models.
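A minimal sketch of the feature-engineering step, assuming a toy pandas DataFrame (the column names and data here are illustrative, not from the lecture):

import pandas as pd

# Toy raw data with one categorical and one quantitative column
# (hypothetical names, for illustration only).
raw = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "size": [1.0, 2.5, 3.0, 0.5],
})

# One-hot encoding replaces the categorical column with 0/1 indicator
# columns, producing a feature matrix with entirely quantitative values.
X = pd.get_dummies(raw, columns=["color"])
print(X)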
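The full-column-rank note refers to the least-squares solution. As a sketch in standard notation (the general normal-equation form, not anything specific to these notes):

\hat{\theta} = (X^\top X)^{-1} X^\top y

X^\top X is invertible exactly when X has full column rank. For example, one-hot encoding every category alongside an intercept column breaks this, since the indicator columns sum to the intercept column.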
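The two formulations of regularization mentioned above, written out as a sketch (squared loss is assumed here; the lecture's loss may differ):

\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} \big( y_i - f_\theta(x_i) \big)^2 \quad \text{subject to} \quad \operatorname{complexity}(f_\theta) \le \beta

\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} \big( y_i - f_\theta(x_i) \big)^2 + \lambda \, \operatorname{complexity}(f_\theta)

A larger \lambda in the unconstrained form plays the role of a smaller \beta in the constrained form: both push toward a simpler f.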
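A hedged sketch of the bias-variance effect of the regularization parameter, using ridge regression on synthetic data (scikit-learn calls the parameter alpha; the degree-15 polynomial features and the data are illustrative assumptions, not from the lecture):

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(42)
x = np.sort(rng.uniform(0, 1, 60)).reshape(-1, 1)
y = np.sin(2 * np.pi * x).ravel() + rng.normal(0, 0.3, 60)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, random_state=0)

# Small alpha: little regularization, low training error but high test
# error (overfitting). Large alpha: more bias, less variance.
for alpha in [1e-6, 1e-2, 1e2]:
    model = make_pipeline(PolynomialFeatures(degree=15), Ridge(alpha=alpha))
    model.fit(x_tr, y_tr)
    train_mse = mean_squared_error(y_tr, model.predict(x_tr))
    test_mse = mean_squared_error(y_te, model.predict(x_te))
    print(f"alpha={alpha:g}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")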
