CS235 Lecture Notes - Headache, Central Tendency, Multivariate Analysis
Formulating a Hypothesis
- Seale (p. 344-347)
- A hypothesis is made up of a:
- Dependent variable
- Independent variable
- Intervening variable
- Variables are the product of the concept-indicator link
- Values of an indicator vary according to case
- Univariate analysis – study one variable
- Ex. Gender: how many people are male? (the analysis is only of one variable)
- We usually have more than one variable.
- Bivariate analysis: two variables
- Multivariate analysis: more than two variables (like your content analyses)
- Ex. Gender relating to laptop use.
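The difference between univariate and bivariate analysis can be sketched with simple counts. This is only an illustration: the six cases below are invented, and the category labels ("course content", "other sites") are assumptions, not data from the lecture.

```python
from collections import Counter

# Hypothetical cases, invented for illustration: (gender, laptop use in class).
cases = [
    ("female", "course content"),
    ("male", "other sites"),
    ("female", "course content"),
    ("male", "course content"),
    ("female", "other sites"),
    ("male", "other sites"),
]

# Univariate analysis: count the values of one variable (gender only).
gender_counts = Counter(gender for gender, _ in cases)

# Bivariate analysis: count each combination of the two variables.
crosstab = Counter(cases)

print(gender_counts["male"])                   # how many cases are male
print(crosstab[("female", "course content")])  # females on course content
```

The univariate count answers "how many people are male?", while the cross-tabulation is the starting point for asking whether gender relates to laptop use.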
Gender and laptop
- Proper use of the laptop during class time is determined by the gender of the
student
- You need to make a choice and then make a statement as to which variable is
independent and which is dependent.
- Our statement says that gender is the independent variable and laptop use is
the dependent variable
- Concept: 1. Gender 2. Laptop use
- Indicators: 1. Male or female 2. What sites are you on during class?
How often are you on the WebCT site? How much of your use is for course content?
- You may have an intervening variable such as the colour of the laptop.
- You may find that there is a positive relationship between gender and laptop
use
- Females use their laptops more appropriately than males
- The higher the female population the more appropriate the laptop use
- Don’t worry about the findings, just make sure you know that dependent,
independent, and intervening variables make up a hypothesis.
Measures of central tendency
Mean – average of the distribution of the variable
Median – value positioned in the middle of the ordered distribution
Mode – most frequently occurring value in a distribution
Standard deviation – how far the data typically deviate from the mean (a measure
of spread rather than of central tendency)
Normal distribution - mean, median, and mode are all the same
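As a minimal sketch of the definitions above, Python's standard `statistics` module can compute each measure. The scores are invented for illustration, and `pstdev` is the population standard deviation (the notes do not say whether the population or sample version is meant; `statistics.stdev` gives the sample version).

```python
import statistics

# Hypothetical scores, invented for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)      # sum of the values divided by their count
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequently occurring value
spread = statistics.pstdev(scores)  # population standard deviation

print(mean, median, mode, spread)
```

With an even number of values, the median is the average of the two middle values, which is why it need not be a value that appears in the data.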
Shape of Distribution
- Distribution can be either symmetrical or skewed, depending on whether
there are more frequencies at one end of the distribution than the other.
- Positively skewed distributions: distributions which have a few extremely high
values (mean > median; the average is greater than the value positioned in the
middle)
- Negatively skewed distributions: distributions which have a few extremely low
values (mean < median; the average is less than the value positioned in the
middle)
- Personal income is frequently positively skewed
- There are fewer people with higher income
- Therefore studies on earnings often report the median
- The mean tends to overestimate the earnings of the most typical earner
- In a negatively skewed distribution, the mass of the distribution is to the
right
- There the mean is lower than the median, which is lower than the mode
- Ex. Most students do well on an exam and a few tank it. This brings the
average down, but it is only a few students, so the results are negatively
skewed for the class.
- Median is more suited to variables measured at ordinal level.
- Mode is used for nominal variables
- Just know the definitions of the measures of central tendency and be able to
explain positively and negatively skewed distributions
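The mean-versus-median rule for skew can be checked on small made-up data sets. The sketch below assumes invented incomes and exam marks, and the helper `skew_direction` is a hypothetical name. Comparing the mean with the median is a rule of thumb, not a formal skewness statistic.

```python
import statistics

# Hypothetical annual incomes (in thousands): one very high earner.
incomes = [30, 32, 35, 38, 40, 45, 200]

# Hypothetical exam marks: most students do well, one tanks the exam.
marks = [20, 78, 80, 82, 85, 88, 90]

def skew_direction(values):
    """Rule of thumb: compare the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positive"
    if mean < median:
        return "negative"
    return "symmetrical"

print(skew_direction(incomes))  # mean 60 is pulled above the median 38
print(skew_direction(marks))    # mean is pulled below the median 82
```

In the income example the single high value pulls the mean well above what a typical earner makes, which is the reason earnings studies often report the median.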
Correlation vs. causation
- Just because you find a correlation/association between two variables, it
doesn’t mean that one causes the other
- Ex. Sleeping with one’s shoes on is strongly correlated/associated with
waking up with a headache.
- Tempting conclusion: therefore, sleeping with one’s shoes on causes a headache
- We have found a correlation, but in all likelihood sleeping with your shoes
on doesn’t cause a headache
- The more plausible explanation is that both are caused by a third factor,
- Every hypothesis sets out to make a causal statement, but the nature of your
study determines whether you can show correlation or causation
- Correlation examines how strong the relationship between variables is
- Correlation can be determined between numerical variables
- Association can be determined between categorical variables
- (between a categorical and a numerical variable you can have only association)
- as the value of one variable increases the value of another variable also
increases, positive correlation or association
- as the value of one variable decreases the value of another variable also
decreases, still a positive correlation or association
- people with higher incomes also tend to have more years of education,
positive correlation
- as people’s happiness level increases so does their helpfulness, positive
correlation
- as the value of one variable increases the other decreases, negative
correlation or association
- there is a negative correlation between television viewing hours and school
- as people’s happiness level decreases their helpfulness increases, negative
correlation
- what about causation?
- Causality cannot be established by reasoning alone; we only ever observe
correlation and perceive it as causation
- Counterfactual dependence – rewind history and change a variable
- Fundamental problem of causal inference: it is impossible to directly observe
a causal effect
- Causation is inferred
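The strength and direction of a correlation between two numerical variables is commonly summarised with Pearson's correlation coefficient, which runs from -1 to +1. The notes do not name a specific coefficient, so treat this as one possible sketch. The function name `pearson_r` and all of the numbers are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables vary together around their means.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(variation_x * variation_y)

# Hypothetical data: years of education and income (in thousands).
education_years = [10, 12, 14, 16, 18]
income = [30, 35, 50, 55, 70]

# Hypothetical data: happiness and helpfulness scores moving in opposite directions.
happiness = [1, 2, 3, 4, 5]
helpfulness = [9, 8, 6, 5, 3]

print(pearson_r(education_years, income))  # positive: both rise together
print(pearson_r(happiness, helpfulness))   # negative: one rises, the other falls
```

A coefficient close to +1 or -1 only describes how strongly the variables move together. It says nothing about whether one causes the other, so causation still has to be inferred from the design of the study.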
Something we already discussed
Longer version of the passage
You will be given an article and asked to interpret it
You will be told how to focus your answer