STAT C100 Lecture Notes - Lecture 3: Data Analysis, Granularity, Temporality


Document Summary

Grouping a series by multiple series: df["%"].groupby([df["party"], df["result"]]). The index of the result contains every combination of party and result that occurs in the data.

Grouping a data frame by a series (the result is itself a data frame): an aggregation such as the mean is computed for every column. The mean of the year column is not meaningful, and string columns are dropped automatically (in pandas 2.0 and later this requires passing numeric_only=True).

Caution: with the max function, the values in a result row may not come from the same original row, because the maximum is taken for each column separately.

Data cleaning: the process of transforming raw data to facilitate subsequent analysis.

Exploratory data analysis: the process of transforming, visualizing, and summarizing data to build and confirm understanding of the data and its provenance, and to identify and address potential issues in the data.

Temporality: how the data is situated in time.

Faithfulness: how well the data captures reality.

Often data will reference other pieces of data. Primary key: the column or set of columns in a table that determines the values of the remaining columns.
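The grouping operations described above can be sketched as follows. This is only an illustration, not the lecture's own code: the table, its candidate names, and its percentages are invented, and only the column names "%", "party", and "result" come from the notes. The last step (idxmax) is one possible way around the max caution and is an addition, not something the notes mention.

```python
import pandas as pd

# Hypothetical election-style table; every value here is invented.
df = pd.DataFrame({
    "year": [2000, 2000, 2004, 2004, 2004, 2008, 2008],
    "candidate": ["Zed", "Cal", "Amy", "Dee", "Ian", "Bob", "Eve"],
    "party": ["Democratic", "Republican", "Democratic", "Republican",
              "Independent", "Democratic", "Republican"],
    "result": ["win", "loss", "loss", "win", "loss", "loss", "win"],
    "%": [55.0, 45.0, 45.0, 54.0, 1.0, 49.0, 51.0],
})

# Grouping a series by multiple series: the result is indexed by the
# (party, result) combinations that actually occur in the data.
mean_pct = df["%"].groupby([df["party"], df["result"]]).mean()

# Grouping a data frame by a series: the mean is computed per column.
# numeric_only=True drops the string columns explicitly; the mean of
# "year" is still computed even though it is not meaningful.
party_means = df.groupby(df["party"]).mean(numeric_only=True)

# Caution with max: each column is maximized separately, so a result
# row can mix values taken from different original rows.
party_max = df.groupby("party").max()

# One possible workaround (an assumption, not from the notes): locate
# the row holding the largest "%" in each group and keep it whole.
top_rows = df.loc[df.groupby("party")["%"].idxmax()]

print(mean_pct)
print(party_means)
print(party_max)
print(top_rows)
```

In this invented table there is no ("Independent", "win") entry in mean_pct because that combination never occurs. The Democratic row of party_max pairs the year 2008 with the candidate "Zed", a pairing that appears in no original row, whereas top_rows keeps each winning row intact.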
