# CS235 Lecture Notes - Headache, Central Tendency, Multivariate Analysis

Formulating a Hypothesis

- Seale (p. 344-347)

- Hypothesis is made up of a:

- Dependent variable

- Independent variable

- Intervening variable

Variables

- Are the product of the concept-indicator link

- Values of an indicator vary according to case

- Univariate analysis – study one variable

- Ex. Gender: how many people are male? (the analysis is only of one variable)

- We usually have more than one variable.

- Bivariate analysis: two variables

- Multivariate analysis: more than two variables (like your content analyses)

- Ex. Gender relating to laptop use.
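The difference between univariate and bivariate analysis can be sketched in a few lines of Python. This is only an illustration: the `responses` records and their values are invented here and do not come from the lecture.

```python
from collections import Counter

# Hypothetical survey records: (gender, main laptop use during class).
# These values are invented for illustration only.
responses = [
    ("female", "course content"),
    ("female", "course content"),
    ("female", "social media"),
    ("male", "course content"),
    ("male", "social media"),
    ("male", "social media"),
]

# Univariate analysis: look at one variable (gender) on its own.
gender_counts = Counter(gender for gender, _ in responses)

# Bivariate analysis: look at two variables together (gender by laptop use).
crosstab = Counter(responses)

print(gender_counts)
for (gender, use), n in sorted(crosstab.items()):
    print(f"{gender:6} | {use:14} | {n}")
```

The first count answers "how many people are male?"; the cross-tabulation is the starting point for asking whether laptop use differs by gender.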

Gender and laptop

- Proper use of the laptop during class time is determined by the gender of the user.

- You need to make a choice and then make a statement as to which variable is independent and which is dependent.

- Our statement says that gender is the independent variable and laptop use is the dependent variable

- Concept: 1. Gender 2. Laptop use

- Indicators: 1. Male and female and 2. What sites are you on during class? How often are you on the WebCT site? How much of your use is for course content?

- You may have an intervening variable such as the colour of the laptop.

- You may find that there is a positive relationship between gender and laptop use

- Females use their laptops more appropriately than males

- The higher the female population the more appropriate the laptop use

- Don’t worry about the findings, just make sure to know that dependent, independent, and intervening variables make up a hypothesis.

Measures of central tendency

Mean – average of the distribution of the variable

Median – number positioned in the middle of the distribution

Mode – most frequent occurring value in a distribution

Standard deviation – how far the data deviate from the mean (a measure of spread rather than of central tendency)

Normal distribution – mean, median, and mode are all the same
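As a minimal sketch, the definitions above can be checked with Python's standard `statistics` module. The `scores` list is a made-up example, and the population standard deviation (`pstdev`) is used here as an assumption; the notes do not say whether the population or sample version is intended.

```python
import statistics

# Hypothetical quiz scores, invented for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)       # average of the values
median = statistics.median(scores)   # middle value of the sorted data
mode = statistics.mode(scores)       # most frequently occurring value
spread = statistics.pstdev(scores)   # population standard deviation

print(f"mean={mean}, median={median}, mode={mode}, std dev={spread}")
```

With an even number of values, the median is the average of the two middle values, which is why it can be a number that does not appear in the data.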

Shape of Distribution

- A distribution can be either symmetrical or skewed, depending on whether there are more frequencies at one end of the distribution than the other.

Skewed Distribution

- Positively skewed distributions: distributions which have a few extremely high values (mean > median) (the average is greater than the number positioned in the middle)

- Negatively skewed distributions: distributions which have a few extremely low values (mean < median) (the average is less than the number positioned in the middle)

Positively Skewed

- Personal income is frequently positively skewed

- Relatively few people have very high incomes

- Therefore studies on earnings often report the median

- The mean tends to overestimate the earnings of the most typical earner

Negatively Skewed

- Mass of the distribution is to the right

- Mean is lower than the median which is lower than the mode

- Ex. Most students do well on an exam and a few tank it. The few low marks bring the average down, so the results are negatively skewed for the class.
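The two examples above (income and exam marks) can be sketched as follows. The numbers are invented, and comparing the mean with the median is only the rule of thumb used in these notes, not a formal test of skewness.

```python
import statistics

def skew_direction(values):
    """Classify skew by comparing the mean with the median (a rough heuristic)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positive"
    if mean < median:
        return "negative"
    return "symmetrical"

# Hypothetical annual incomes (in thousands): one very high earner.
incomes = [30, 32, 35, 38, 40, 42, 250]

# Hypothetical exam marks: most students do well, a few do badly.
marks = [20, 35, 80, 82, 85, 88, 90]

print("incomes:", skew_direction(incomes))
print("marks:", skew_direction(marks))
```

In the income list the single high earner pulls the mean well above the median, which is why earnings studies often report the median instead.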

Picking One

- Median is more suited to variables measured at ordinal level.

- Mode is used for nominal variables

- Just know the definitions of the measures of central tendency and be able to explain positively and negatively skewed
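A small sketch of why the level of measurement matters, using invented data. The ordinal scale `levels`, the `answers`, and the `laptop_colours` list are all hypothetical. `statistics.median_low` is used so that the median is always one of the actual categories.

```python
import statistics

# Hypothetical ordinal scale, ordered from lowest to highest.
levels = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
answers = ["agree", "neutral", "agree", "strongly agree", "disagree"]

# Ordinal data can be ranked, so a median makes sense:
# convert each answer to its rank, take the middle rank, convert back.
ranks = [levels.index(a) for a in answers]
median_answer = levels[statistics.median_low(ranks)]

# Nominal data has no order, so only the mode applies.
laptop_colours = ["silver", "black", "silver", "white", "silver", "black"]
most_common_colour = statistics.mode(laptop_colours)

print(median_answer)
print(most_common_colour)
```

A mean of the colour categories would be meaningless, and a mean of the ranks would assume equal distances between categories, which an ordinal scale does not guarantee.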

Correlation vs. causation

- Just because you find a correlation/association between two variables, it doesn’t mean that one causes the other

- Ex. Sleeping with one’s shoes on is strongly correlated/associated with waking up with a headache.

- It is tempting to conclude that sleeping with one’s shoes on therefore causes a headache

- We have found a correlation, but in all likelihood sleeping with your shoes on doesn’t cause a headache

- The more plausible explanation is that both are caused by a third factor, alcohol intoxication

- Every hypothesis sets out to make a causal statement, but the nature of your study determines whether you can claim causation or only correlation

- Correlation examines how strong the relationship between variables is

- Correlation can be determined between numerical variables

- Association can be determined between categorical variables

- (Between a categorical and a numerical variable you can only have association)

- As the value of one variable increases, the value of another variable also increases: positive correlation or association

- Equally, as the value of one variable decreases, the value of another variable also decreases (still a positive relationship)

- People with higher incomes also tend to have more years of education: positive correlation

- As people’s happiness level increases so does their helpfulness: positive association

- As the value of one variable increases, the other decreases: negative correlation or association

- There is a negative correlation between television viewing hours and school grades

- As people’s happiness level decreases their helpfulness increases: negative association

- What about causation?

- Causality cannot be established by reasoning alone; what we actually perceive is correlation, which we then interpret as causation

- Counterfactual dependence – rewind history and change a variable

- Fundamental problem of causal inference: it is impossible to directly observe causal effects

- Causation is inferred
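The points in this section can be illustrated with a short sketch. All data below are invented. `pearson_r` is a hand-written helper for the standard Pearson correlation coefficient, and the shoes/headache records are constructed on purpose so that alcohol is the only real driver; none of this comes from an actual study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical numerical variables.
years_education = [10, 12, 12, 14, 16, 18]
income = [28, 35, 33, 45, 52, 60]
tv_hours = [1, 2, 3, 4, 5, 6]
grades = [90, 85, 80, 70, 65, 60]

r_positive = pearson_r(years_education, income)  # both rise together
r_negative = pearson_r(tv_hours, grades)         # one rises, the other falls

def make(drank, shoes_on, headache, count):
    """Build `count` identical hypothetical records."""
    return [{"drank": drank, "shoes_on": shoes_on, "headache": headache}] * count

def headache_rate(records, shoes_on):
    """Share of people with a headache among those with the given shoes_on value."""
    group = [r for r in records if r["shoes_on"] == shoes_on]
    return sum(r["headache"] for r in group) / len(group)

# Constructed so that, within each drinking group, shoes make no difference.
records = (
    make(True, True, True, 9) + make(True, True, False, 3)
    + make(True, False, True, 3) + make(True, False, False, 1)
    + make(False, True, True, 1) + make(False, True, False, 3)
    + make(False, False, True, 3) + make(False, False, False, 9)
)
drinkers = [r for r in records if r["drank"]]
sober = [r for r in records if not r["drank"]]

print("r (education, income):", round(r_positive, 2))
print("r (tv hours, grades):", round(r_negative, 2))
print("everyone:", headache_rate(records, True), headache_rate(records, False))
print("drinkers:", headache_rate(drinkers, True), headache_rate(drinkers, False))
print("sober:", headache_rate(sober, True), headache_rate(sober, False))
```

Across everyone, people who slept with their shoes on have a higher headache rate. Once the records are split by drinking, the rate is the same with or without shoes, which is what we would expect if alcohol is the third factor behind both.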

Midterm

Cohesive essay

Something we already discussed

Longer version of the passage

Given an article and asked to interpret the article

You will be told how to focus your answer

Cohesive essays