RSM412H1 Lecture Notes - Lecture 4: Linear Regression, Feature Engineering, Step Function
Document Summary
Linear regression: predictors need to be scaled, or else the results may be overfitted. Many test statistics (t-tests, confidence intervals) are available to help interpretation. Qualitative variables must be converted to dummy variables. If a qualitative variable has many classes, it produces a lot of dummy variables, which makes the model very large.

MARS (multivariate adaptive regression splines): automatically creates a piecewise linear model that captures nonlinear relationships in the data. It assesses cutpoints (knots), similar to step functions. Each segment is fitted to a different line: knots occur where two different linear relationships produce the smallest amount of error. As the number of knots increases, the model may not generalize well to new, unseen data. Once the full set of knots has been identified, knots that do not contribute significantly to predictive accuracy can be removed sequentially. Advantages of MARS: it can naturally handle a mix of quantitative and qualitative predictors; highly correlated predictors are not as much of an issue as in OLS; no feature scaling is needed. Select the predictor with the minimum MSE based on all available data.
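The knot search for one predictor can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the full MARS algorithm: it tries a grid of candidate knots, fits a pair of hinge functions by least squares at each candidate, and keeps the knot with the smallest sum of squared errors. The function names hinge_design and best_knot, the candidate grid, and the data are assumptions made up for this example.

```python
import numpy as np

def hinge_design(x, knot):
    """Design matrix with columns: intercept, max(0, x - knot), max(0, knot - x)."""
    return np.column_stack([
        np.ones_like(x),
        np.maximum(0.0, x - knot),
        np.maximum(0.0, knot - x),
    ])

def best_knot(x, y, candidates):
    """Return (knot, sse) for the candidate knot with the smallest SSE."""
    best = None
    for knot in candidates:
        X = hinge_design(x, knot)
        # Ordinary least squares fit of the two line segments joined at the knot.
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = float(np.sum((y - X @ coef) ** 2))
        if best is None or sse < best[1]:
            best = (float(knot), sse)
    return best

# Synthetic data (hypothetical): the slope changes from +2 to -1 at x = 4.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = np.where(x < 4.0, 2.0 * x, 8.0 - (x - 4.0)) + rng.normal(0.0, 0.1, x.size)

candidates = np.linspace(1.0, 9.0, 81)  # assumed grid of candidate knots
knot, sse = best_knot(x, y, candidates)
print(f"chosen knot: {knot:.2f}, SSE: {sse:.3f}")
```

With this data the chosen knot should land close to the true change point at x = 4. A real MARS fit repeats this kind of search across all predictors, adds knots in a forward pass, and then prunes the ones that do not help predictive accuracy.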