SOCI 217 Lecture Notes - Lecture 21: Explained Variation, Null Hypothesis, Digital Footprint
Week 11: Big data (TA lecture)
What is big data?
• Big data (McKinsey, 2011)
• Data so large (volume), complex (variety), and/or variable that conventional tools struggle to handle it
• Digitalization of social life and digital traces of human activities
• Metadata example
o Automated data collection
o API (application programming interface)
▪ Code to get the data
▪ Streamlined way to get huge amounts of data
o Tweeting
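A minimal sketch of what "code to get the data" from an API can look like. Many APIs hand out results one page at a time, with a cursor pointing to the next page. Everything here is an assumption for illustration: `fake_api`, `collect_all`, the `data` and `next_cursor` field names, and the made-up posts stand in for a real service, which would be reached over the network and would have its own field names and rate limits.

```python
from typing import Callable, Iterator, Optional

# Hypothetical stand-in for a real API endpoint: it returns one "page" of
# records plus a cursor pointing at the next page (None when finished).
FAKE_POSTS = [{"id": i, "text": f"post {i}"} for i in range(7)]


def fake_api(cursor: Optional[int] = None, page_size: int = 3) -> dict:
    start = cursor or 0
    end = start + page_size
    next_cursor = end if end < len(FAKE_POSTS) else None
    return {"data": FAKE_POSTS[start:end], "next_cursor": next_cursor}


def collect_all(fetch_page: Callable[..., dict]) -> Iterator[dict]:
    """Request page after page until the API reports no further cursor."""
    cursor = None
    while True:
        page = fetch_page(cursor=cursor)
        yield from page["data"]
        cursor = page["next_cursor"]
        if cursor is None:
            break


posts = list(collect_all(fake_api))
print(len(posts))
```

The same loop works for any page-fetching function with this shape, which is why an API is a streamlined way to collect large amounts of data automatically.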
Areas of computational social science (CSS) and big data (BD)
• 1. Automating data collection (scraping, harvesting, online extraction)
o Need a server to connect to data
o So much data that we need new ways to collect it
• 2. Automating analysis and pattern discovery
o Social network analysis
▪ Using Twitter to study political mobilization
▪ Studying patterns of social inequality on Airbnb and Kickstarter
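A small sketch of the kind of pattern discovery social network analysis automates. It counts how often each account is retweeted (in-degree) and how often it retweets others (out-degree). The accounts and edges are invented, and degree counts are only one simple measure, not necessarily the one used in the studies mentioned above.

```python
from collections import Counter

# Hypothetical retweet edges: (retweeter, original_author). Made-up accounts.
retweets = [
    ("ana", "org_a"),
    ("ben", "org_a"),
    ("cat", "org_a"),
    ("ben", "dev"),
    ("dev", "ana"),
]

# In-degree: how often each account is retweeted (a rough proxy for reach).
in_degree = Counter(author for _, author in retweets)

# Out-degree: how often each account retweets others.
out_degree = Counter(user for user, _ in retweets)

most_retweeted, count = in_degree.most_common(1)[0]
print(most_retweeted, count)
```

With millions of edges the counting logic stays the same, which is the point of automating the analysis.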
Is bigger always better?
• 1. Characteristics
o Found data vs. data derived under the strict rules of a statistically designed experiment
▪ Pro
• No biases related to design, researcher, and/or respondents
• Big data easily meets sample size requirements
• 2. Big data = big problems?
o Con
▪ Big data sets are treated like population (census) data, but they are not representative
• E.g. bots that leave comments
• E.g. over-representation of certain user groups
• E.g. only people with access to technological devices are included
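A worked sketch of why a large sample size does not fix over-representation. All numbers are invented: a population split evenly between people with and without a device, and a found data set of one million records in which device owners make up 90%. The reweighting step is post-stratification, a standard survey technique brought in here for illustration and not taken from the lecture. It only helps when each record's group and the true population shares are known, and it does nothing about bots.

```python
# All numbers are invented for illustration.
population_share = {"has_device": 0.5, "no_device": 0.5}

# A "found" data set of one million records, skewed toward device owners.
# For each group: (number of records, number expressing some opinion).
found_data = {
    "has_device": (900_000, 720_000),  # 80% hold the opinion
    "no_device": (100_000, 40_000),    # 40% hold the opinion
}

total_records = sum(n for n, _ in found_data.values())
naive_estimate = sum(k for _, k in found_data.values()) / total_records

# Post-stratification: reweight each group's rate by its population share.
weighted_estimate = sum(
    population_share[group] * (k / n) for group, (n, k) in found_data.items()
)

print(round(naive_estimate, 2), round(weighted_estimate, 2))
```

Under these assumptions the naive estimate is 0.76 while the reweighted one is 0.6, so a million records still miss the population value by 16 percentage points.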