MIS 0855 Lecture Notes - Lecture 2: George E. P. Box, Data Science, Decision Tree Learning

Science and Data Science: What is data science?
Compare it to the definition of science: knowledge about or study of the natural world
based on facts learned through experiments and observation
What makes knowledge actionable? Why is that a goal? How does big data facilitate
this?
o Actionable knowledge needs to project into the future; it needs to be
generalizable and robust
First: Statistics
o What is statistics? Statistics studies data in terms of collection, analysis,
interpretation, presentation, and organization
o It helps us to answer these questions:
What patterns are there in my data?
What is the chance that an event will occur?
Which patterns are significant?
What is a high level summary of my data?
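Two of the questions above (a high-level summary, and the chance that an event occurs) can be sketched with Python's standard `statistics` module. The `tickets` data and the helper names `summarize` and `empirical_chance` are made up for illustration and are not from the lecture.

```python
import statistics

# Hypothetical sample: daily support tickets over eight days (made-up numbers).
tickets = [2, 4, 4, 4, 5, 5, 7, 9]


def summarize(values):
    """Return a high-level summary of a numeric sample."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "pstdev": statistics.pstdev(values),  # population standard deviation
    }


def empirical_chance(values, event):
    """Estimate the chance of an event as the fraction of observations where it holds."""
    return sum(1 for v in values if event(v)) / len(values)


summary = summarize(tickets)
chance_busy = empirical_chance(tickets, lambda v: v > 5)  # "more than 5 tickets"

print(summary)
print(chance_busy)
```

The empirical fraction is only an estimate from eight observations; deciding whether a pattern is significant would need a proper test, which this sketch does not attempt.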
Now: Big Data & Machine Learning
o What is machine learning (ML)? ML gives computers the ability to learn
without being explicitly programmed
o A computer program is said to learn from experience E with respect to some task
T and some performance measure P, if its performance on T, as measured by P,
improves with experience E.
T: playing checkers
P: percentage of games won against an arbitrary opponent
E: playing practice games against itself
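The T/P/E definition can be made concrete with a much smaller task than checkers. The sketch below is a toy assumption, not the lecture's example: T is classifying a number as class 0 or class 1, P is accuracy on a fixed test set, and E is the number of labeled examples seen. The function names and the synthetic data are hypothetical.

```python
import random


def sample(rng, n):
    """Draw n labeled points: class 0 is centered at 0.0, class 1 at 4.0."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(4.0 * label, 1.0), label))
    return data


def learn_threshold(examples):
    """Experience E -> model: the midpoint between the two class means."""
    zeros = [x for x, label in examples if label == 0]
    ones = [x for x, label in examples if label == 1]
    if not zeros or not ones:
        return 0.0  # one class not seen yet; arbitrary default threshold
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def accuracy(threshold, examples):
    """Performance P: fraction of examples classified correctly."""
    hits = sum(1 for x, label in examples if (x > threshold) == (label == 1))
    return hits / len(examples)


rng = random.Random(0)       # fixed seed so the run is repeatable
test_set = sample(rng, 2000)  # task T is evaluated on this held-out set
results = {}
for n in (2, 20, 2000):       # increasing amounts of experience E
    results[n] = accuracy(learn_threshold(sample(rng, n)), test_set)
    print(n, results[n])
```

With more examples the learned threshold tends to settle near 2.0 and accuracy tends to rise, though a lucky small sample can occasionally do as well as a large one.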
Statistics vs. ML (Breiman 2001)
o Input x → Nature → Output y
o Why analyze data? To predict or extract information
o Statistics: input x → linear regression, logistic regression, Cox model → output y
o ML: input x → unknowns → output y
Figure out unknowns with Decision Trees or Neural Nets
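The contrast above can be sketched on a toy data set. In this sketch "nature" is assumed to be a step function, `fit_line` stands in for the statistics approach (assume a linear form, estimate its parameters), and `fit_stump` is a one-split decision tree standing in for the ML approach. All names and data are hypothetical, and the data was chosen to favor the tree; it is not a general ranking of the two approaches.

```python
def mse(predict, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys):
    """Data-model approach: assume y = a + b*x and estimate a, b by least squares."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def fit_stump(xs, ys):
    """Algorithmic approach: pick the single split with the lowest squared error."""
    best = None
    cuts = sorted(set(xs))
    for lo, hi in zip(cuts, cuts[1:]):
        t = (lo + hi) / 2  # candidate threshold between neighboring x values
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x < t else rm


# "Nature" here is a step function that neither method is told about.
xs = list(range(10))
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

line_mse = mse(fit_line(xs, ys), xs, ys)
stump_mse = mse(fit_stump(xs, ys), xs, ys)
print(line_mse, stump_mse)
```

The line cannot follow the jump, so it keeps some error, while the stump recovers the split at 4.5. On data that really is linear, the comparison would go the other way.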
The dangers of (big data) analytics
o It’s easy to fid hat’s ot really there
o The direction of causality can be tricky
o Dirty data is eeryhere
“o…“tart ith a hypothesis
o The testale preditios fro a idea ith a uderlyig ratioale that akes
sense
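The first danger in the list above can be demonstrated with pure noise. The sketch below assumes a made-up setting: 200 random "features" and one random "target", each with only 20 rows, all drawn independently. Nothing is related to anything else, yet searching across enough features usually turns up one that looks noticeably correlated. The names and sizes are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


rng = random.Random(42)  # fixed seed so the run is repeatable
n_rows, n_features = 20, 200

# Every column is independent noise: there is no real pattern to find.
target = [rng.gauss(0, 1) for _ in range(n_rows)]
features = [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_features)]

correlations = [pearson(f, target) for f in features]
strongest = max(correlations, key=abs)
print(strongest)
```

Stating a hypothesis first means testing one feature chosen in advance, instead of reporting whichever of the 200 happened to score highest.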
find more resources at oneclass.com