STAT 100 Lecture Notes - Lecture 4: Xml, Json, Linear Algebra
Data cleaning and exploratory data analysis
Data frame:
Series: a named column of data with an index
Index: mapping from keys to rows
DataFrame: collection of Series with a common index
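The definitions above can be sketched in pandas. This is a minimal illustration, not taken from the lecture: the names `heights`, `ages`, and `df` and all values are made up.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
heights = pd.Series([170, 182, 165], index=["ana", "ben", "cy"], name="height_cm")
ages = pd.Series([21, 25, 19], index=["ana", "ben", "cy"], name="age")

# A DataFrame is a collection of Series that share one index.
df = pd.DataFrame({"height_cm": heights, "age": ages})

print(heights.name)          # the name of the Series
print(list(df.index))        # the shared index keys
print(df.loc["ben", "age"])  # the index key "ben" maps to one row
```

Because both Series use the same index keys, pandas lines the values up row by row when building the DataFrame.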
Methods:
Filtering on predicates and slicing
df.loc: location by index
df.iloc: location by integer address
Group by and pivot
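A short sketch of each method listed above, assuming a small made-up table (the column names `city`, `year`, `sales` and the values are hypothetical):

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Rome", "Rome"],
        "year": [2020, 2021, 2020, 2021],
        "sales": [10, 12, 7, 9],
    },
    index=["a", "b", "c", "d"],
)

# Filtering on a predicate: keep rows where the boolean mask is True.
big = df[df["sales"] > 8]

# Slicing by index label with .loc (the end label is included).
by_label = df.loc["b":"c", ["city", "sales"]]

# Slicing by integer position with .iloc (the end position is excluded).
by_position = df.iloc[1:3, [0, 2]]

# Group by: total sales per city.
totals = df.groupby("city")["sales"].sum()

# Pivot: one row per city, one column per year.
wide = df.pivot(index="city", columns="year", values="sales")
```

Here `by_label` and `by_position` select the same rows and columns, which shows the difference between the two: `.loc` addresses rows by index key, `.iloc` by integer position.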
Data cleaning: the process of transforming raw data to facilitate subsequent analysis
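As one possible illustration of data cleaning, the sketch below tidies a small made-up table. The specific steps (trimming whitespace, converting text to numbers, dropping duplicates) are assumptions chosen for the example, not steps named in the lecture.

```python
import pandas as pd

# Hypothetical raw records, invented for illustration only.
raw = pd.DataFrame(
    {
        "name": [" Ana", "ben ", "Cy", "Cy"],
        "score": ["90", "n/a", "75", "75"],
    }
)

clean = raw.copy()
# Normalize text: strip stray whitespace and unify capitalization.
clean["name"] = clean["name"].str.strip().str.title()
# Convert scores to numbers; entries that cannot be parsed become NaN.
clean["score"] = pd.to_numeric(clean["score"], errors="coerce")
# Drop exact duplicate rows, then renumber the index.
clean = clean.drop_duplicates().reset_index(drop=True)

print(clean)
```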
Exploratory Data Analysis (EDA)
The process of transforming, visualizing, and summarizing data to:
- Build or confirm understanding of the data
- Identify potential issues
- Inform subsequent analysis
✓ Weird patterns (even if they don't exist)
Conclusions, subsets
- Discover potential hypotheses
Key data properties to consider in EDA
- Structure: shape
- Granularity: how fine
- Scope: how (in)complete is the data
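The three properties above could be checked along these lines. This is a sketch under assumptions: the table and its columns (`student`, `exam`, `score`) are hypothetical, and counting missing values is only one way to look at scope.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame(
    {
        "student": ["ana", "ana", "ben", "cy"],
        "exam": ["midterm", "final", "midterm", "midterm"],
        "score": [88.0, 92.0, None, 75.0],
    }
)

# Structure: the shape (rows, columns) and the column types.
n_rows, n_cols = df.shape
print(df.dtypes)

# Granularity: what does one row represent?
# If no (student, exam) pair repeats, each row is one student's result on one exam.
one_row_per_student_exam = not df.duplicated(subset=["student", "exam"]).any()

# Scope: how (in)complete is the data? Count missing values per column.
missing = df.isna().sum()
print(missing)
```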