Simon Fraser University

Statistics

STAT 101

English Lady

Final

Fall

Description

1. Categorical variable: places an individual into one of several categories. Ex: male or female
Quantitative variable: takes numerical values for which arithmetic operations such as adding and averaging make sense. Ex: height in cm, money in $
Distribution: tells us what values a variable takes and how often it takes these values. Pie charts, bar graphs – distribution of a categorical variable. Histograms, stemplots – distribution of a quantitative variable.
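Overall pattern: described by shape, centre, and spread; also look for deviations from the pattern (outliers). Shape: whether the distribution has a single peak and whether it is symmetric or skewed to the right or left. Centre: the midpoint of the distribution. Spread: the range from the minimum to the maximum.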
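Variance (s^2): find the mean, find the deviation of each observation from the mean, square the deviations, add them up, and divide by (n-1). Standard deviation (s): the square root of the variance.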
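
The variance steps can be followed one at a time in a short Python sketch. The four data values are made up for illustration; the result is the sample variance (divide by n-1), with the standard deviation as its square root.

```python
import math

# Hypothetical sample of n = 4 observations (made-up numbers).
data = [2, 4, 4, 6]

n = len(data)
mean = sum(data) / n                      # step 1: the mean
deviations = [x - mean for x in data]     # step 2: deviation of each observation
squared = [d ** 2 for d in deviations]    # step 3: squared deviations
variance = sum(squared) / (n - 1)         # step 4: add them up, divide by n - 1
std_dev = math.sqrt(variance)             # standard deviation

print("mean:", mean)
print("deviations:", deviations)
print("variance:", variance)
print("standard deviation:", std_dev)
```

The deviations always add up to 0, which is a quick check on the arithmetic.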
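2. Mean: add all values and then divide by the number of observations. Mean of population = μ, mean of sample = x̄
Median: midpoint of the distribution. Arrange the n observations in order; the median is at position (n+1)/2. If n is even, it is the average of the two middle observations.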
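Quartiles: Q1 at the ¼ mark; Q3 at the ¾ mark. Five-number summary: 1 = min; 2 = Q1; 3 = M (median); 4 = Q3; 5 = max
Interquartile range (IQR): distance between Q1 and Q3 (IQR = Q3 − Q1). 1.5 × IQR rule for OUTLIERS: an observation is a suspected outlier if it falls more than 1.5 × IQR below Q1 or above Q3.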
A resistant measure: relatively unaffected by changes in the numerical value of a few extreme observations. Ex: median, quartiles. Mean and standard deviation are NOT resistant.
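
A small Python sketch of the five-number summary, the 1.5 × IQR rule, and resistance. The data are made up. The quartile convention used here is an assumption: Q1 and Q3 are the medians of the lower and upper halves, leaving out the overall median when n is odd. Calculators and software may use slightly different conventions.

```python
from statistics import mean, median

def five_number_summary(values):
    """Return (min, Q1, M, Q3, max); quartiles are medians of each half."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]                                # below the median position
    upper = xs[half + 1:] if n % 2 else xs[half:]    # above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 50]

minimum, q1, m, q3, maximum = five_number_summary(data)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

# Resistance: replace the extreme value 50 by 9 and compare.
tamer = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print("five-number summary:", minimum, q1, m, q3, maximum)
print("IQR:", iqr, "suspected outliers:", outliers)
print("median with / without extreme value:", median(data), median(tamer))
print("mean with / without extreme value:", mean(data), mean(tamer))
```

The median is the same for both data sets, while the mean moves a lot. That is what "resistant" means.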
3. Density curve: the total area under it must be exactly 1; it is always on or above the horizontal axis; it describes the overall pattern of a distribution. Z-scores: z = (x − μ)/σ
μ = mean, σ = standard deviation. Table A gives the area to the left of z; area to the right = 1 − (area to the left)
x = μ + zσ – if we do not know x, use Table A to find z (percentages), then solve for x – in between two numbers
68-95-99.7 RULE (normal distributions): 68% of observations fall within one standard deviation, σ, of the mean, μ; 95% within 2σ of μ; 99.7% within 3σ of μ
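
The z-score formulas and the 68-95-99.7 rule can be checked with Python's standard-library `statistics.NormalDist`, which plays the role of Table A here. The values μ = 100, σ = 15, and x = 130 are made up for illustration.

```python
from statistics import NormalDist

# Hypothetical normal distribution (made-up numbers).
mu, sigma = 100, 15
x = 130

z = (x - mu) / sigma                  # standardize: z = (x - mu) / sigma
std_normal = NormalDist()             # standard normal: mean 0, sd 1

area_left = std_normal.cdf(z)         # what a table of left areas would give
area_right = 1 - area_left            # area to the right of z

# Going backwards: which x has 90% of the distribution below it?
z90 = std_normal.inv_cdf(0.90)
x90 = mu + z90 * sigma                # x = mu + z * sigma

# Area within 1, 2, and 3 standard deviations of the mean.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

print("z =", z)
print("area left:", round(area_left, 4), "area right:", round(area_right, 4))
print("90th percentile x:", round(x90, 2))
print("within 1, 2, 3 sd:", {k: round(v, 4) for k, v in within.items()})
```

The rounded areas within 1, 2, and 3 standard deviations are where the 68, 95, and 99.7 figures come from.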
4. Response variable: measures an outcome of a study (y). Explanatory variable: explains or influences changes in a response variable (x).
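Scatterplot: shows the relationship between two quantitative variables.
Direction: Positively associated: above-average values of one variable tend to occur together with above-average values of the other, and below-average values also tend to occur together.
Negatively associated: above-average values of one variable tend to occur with below-average values of the other (one low, one high).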
Correlation (r): measures the direction and strength of the linear relationship between two quantitative variables. r is always between −1 and +1. Perfect correlation: r = +1 or r = −1, an exact LINEAR RELATIONSHIP. There are two components to a correlation: the sign and the number. The sign indicates the direction (positive or negative) of the relationship and the number indicates the strength of the relationship. r² measures only strength, not direction. Form: Linear relationships: points show a straight-line pattern. Also, curved relationships and clusters.
Strength: how close the points in the scatterplot lie to a simple form such as a line.
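
The sign and size of r can be seen in a short Python sketch. The notes do not give a formula for r; the one used here is the standard textbook formula (the average product of standardized x and y values, dividing by n − 1), and the data sets are made up.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of products of standardized values, over n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / sx) * ((y - y_bar) / sy)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data sets (made-up numbers).
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]      # exact straight line, rising
y_down = [10, 8, 6, 4, 2]    # exact straight line, falling
y_noisy = [2, 1, 4, 3, 5]    # rising overall, but scattered

r_up = correlation(x, y_up)
r_down = correlation(x, y_down)
r_noisy = correlation(x, y_noisy)

print("perfect positive:", round(r_up, 4))
print("perfect negative:", round(r_down, 4))
print("scattered positive:", round(r_noisy, 4), "r squared:", round(r_noisy ** 2, 4))
```

The sign of r follows the direction of the pattern, and r moves away from ±1 as the points scatter away from a line.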
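5. Regression line: a straight line that describes how a response variable, y, changes as an explanatory variable, x, changes. y = a + bx. b = slope, a = intercept. Use it to predict y from x. b = r (sy/sx), a = ȳ − b x̄. Calculator (linreg) form: y = b + a(x), where a is the slope and b is the intercept, the reverse of the lettering above.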
Least-squares regression line: the line of y on x that makes the sum of the squared vertical distances of the data points from the line as small as possible. Residual = observed y − predicted y; the residuals have mean 0 for the least-squares regression line.
Extrapolation: the use of a regression line to predict for values of x outside the range of x values used to obtain the line.
Lurking variable: a variable that is neither the explanatory nor the response variable but may influence the interpretation of the relationship between them.
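
A sketch of the least-squares line in Python, using the slope formula b = r (sy/sx). The intercept formula a = ȳ − b x̄ is a standard fact that is not stated in the original notes. The data and the helper names `predict` and `is_extrapolation` are made up for illustration.

```python
from statistics import mean, stdev

# Hypothetical (x, y) data, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
sx, sy = stdev(xs), stdev(ys)

# Correlation r: sum of products of standardized values, over n - 1.
r = sum(((x - x_bar) / sx) * ((y - y_bar) / sy)
        for x, y in zip(xs, ys)) / (n - 1)

b = r * sy / sx          # slope
a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)

def predict(x):
    """Predicted y from the fitted line y = a + b*x."""
    return a + b * x

def is_extrapolation(x):
    """True when x lies outside the range of x values used to fit the line."""
    return x < min(xs) or x > max(xs)

# Residual = observed y - predicted y.
residuals = [y - predict(x) for x, y in zip(xs, ys)]

print(f"fitted line: y = {a:.2f} + {b:.2f}x (r = {r:.2f})")
print("sum of residuals is essentially zero:", abs(sum(residuals)) < 1e-9)
print("predicting at x = 20 is extrapolation:", is_extrapolation(20))
```

The residuals add up to 0 apart from rounding error, which is the "mean = 0" property of the least-squares line.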
6. Marginal distribution: the marginal distribution of one of the categorical variables in a two-way table of counts is the distribution of values of that variable among all individuals described by the table. The row totals and column totals in a two-way table give the marginal distributions of the two variables.
Conditional distribution: will show how one variable behaves for different values of the second. It is this distribution that lets us see
relationships in categorical variables.
Joint distribution: the joint distribution is what is given in the body of a two-way table. It may appear to show a relationship, but if the group totals are different across categories, comparing the raw counts can be misleading.
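Simpson’s paradox: an association that holds within every group can reverse direction when the groups are combined into one table. Breaking the individuals into groups reveals it. Ex: victims are classified as injured in a serious accident or a less serious accident.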
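
Simpson's paradox is easiest to see with numbers. The Python sketch below uses illustrative counts that are not taken from these notes: accident victims are split by accident severity and by a hypothetical transport method (helicopter or road). The death rate is a conditional distribution within each group.

```python
# counts[severity][transport] = (died, survived); illustrative numbers only.
counts = {
    "serious": {"helicopter": (48, 52), "road": (60, 40)},
    "less serious": {"helicopter": (16, 84), "road": (200, 800)},
}

def death_rate(died, survived):
    """Proportion who died among all victims in one cell of the table."""
    return died / (died + survived)

# Conditional death rates within each severity group.
within_group = {
    severity: {transport: death_rate(*cell) for transport, cell in row.items()}
    for severity, row in counts.items()
}

# Combine the two groups into a single table and recompute the rates.
combined = {}
for transport in ("helicopter", "road"):
    died = sum(counts[sev][transport][0] for sev in counts)
    survived = sum(counts[sev][transport][1] for sev in counts)
    combined[transport] = death_rate(died, survived)

for severity, rates in within_group.items():
    print(severity, rates)
print("combined", combined)
```

Within each severity group the helicopter has the lower death rate, yet in the combined table it has the higher one. In these made-up counts the helicopter carries mostly serious cases, so severity acts as a lurking variable.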
8. Sampling design: describes exactly how to choose a sample from the population.
Voluntary response sample: people who choose themselves by responding to a general appeal – biased.
Simple random sample (SRS): n individuals chosen from the population in such a way that every set of n individuals has an equal chance of being the sample actually selected.
Stratified random sample: first classify the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form the full sample.
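
The two sampling designs can be sketched with Python's `random` module. The population, the strata, and the sample sizes are all made up, and the fixed seed is there only so the sketch gives the same result each time it runs.

```python
import random

rng = random.Random(101)   # fixed seed, only to make the sketch repeatable

# Hypothetical population of 20 labelled individuals.
population = [f"student{i:02d}" for i in range(1, 21)]

# Simple random sample: every set of 5 individuals is equally likely.
srs = rng.sample(population, 5)

# Stratified random sample: a separate SRS in each stratum, then combine.
strata = {
    "first-year": population[:12],   # made-up split into two strata
    "upper-year": population[12:],
}
per_stratum = {"first-year": 3, "upper-year": 2}

stratified = []
for name, members in strata.items():
    stratified.extend(rng.sample(members, per_stratum[name]))

print("SRS:", srs)
print("stratified sample:", stratified)
```

`random.sample` draws without replacement, so no individual appears twice in a sample.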
9. Observational study: observes individuals and measures variables of interest, but does not attempt to influence the responses. Used to describe some group or situation. Experiment: imposes a treatment on individuals in order to observe their response. Matched pairs: compares exactly two treatments, using pairs of subjects that are as closely matched as possible. Confounding: two variables are confounded when their effects on a response variable cannot be distinguished from each other.
Factors: the explanatory variables in an experiment. Treatment: experimental condition applied to subjects. Sample: the part of the population from which we actually collect information.
Randomized comparative experiment: an experiment that compares two or more treatments, with subjects assigned to the treatments at random. Completely randomized experiment: all subjects are allocated at random among all the treatments. Double blind: neither the subjects nor the people who work with them know which treatment each subject is receiving. Block: a group of individuals known before the experiment to be similar in some way.
Block design: random assignment of individuals to treatments is carried out separately within each block.
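
The difference between a completely randomized design and a block design can be sketched in Python. The subject labels, the two treatments, and the blocks are hypothetical, and the function names are made up for this illustration.

```python
import random

def completely_randomized(subjects, treatments, rng):
    """Shuffle all subjects, then deal them out to the treatments in turn."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    return {s: treatments[i % len(treatments)] for i, s in enumerate(shuffled)}

def block_design(blocks, treatments, rng):
    """Carry out the random assignment separately within each block."""
    assignment = {}
    for members in blocks.values():
        assignment.update(completely_randomized(members, treatments, rng))
    return assignment

rng = random.Random(2024)   # fixed seed, only to make the sketch repeatable
treatments = ["drug", "placebo"]

# Hypothetical blocks of similar individuals.
blocks = {
    "women": ["W1", "W2", "W3", "W4"],
    "men": ["M1", "M2", "M3", "M4"],
}
all_subjects = blocks["women"] + blocks["men"]

cr = completely_randomized(all_subjects, treatments, rng)
bd = block_design(blocks, treatments, rng)

print("completely randomized:", cr)
print("block design:", bd)
```

In the completely randomized design, the four subjects who get the drug could by chance all come from one block. The block design guarantees two per treatment inside each block.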
10. Sample Space S: The sample space is the set of all possible outcomes. S = {apple, orange, banana, pear, peach, plum, mango}
Probability model: mathematical description of a random phenomenon.
Discrete probability model: a probability model with a finite sample space.
Continuous probability model: assigns probabilities as areas under a density curve.
Random variable: a variable whose value is a numerical outcome of a random phenomenon.
Probability distribution: the probability distribution of a random variable X tells us what values X can take and how to assign probabilities to those values.
Personal probability: a number between 0 and 1 that expresses an individual’s judgment of how likely the outcome is.
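
A discrete probability model can be written out in full for the fruit sample space above. Two things in this Python sketch are assumptions made for illustration and do not come from the notes: each fruit is taken to be equally likely, and the random variable X is defined as the number of letters in the fruit's name.

```python
from fractions import Fraction

# Sample space from the notes.
S = ["apple", "orange", "banana", "pear", "peach", "plum", "mango"]

# Assumption: every outcome is equally likely.
model = {outcome: Fraction(1, len(S)) for outcome in S}

def X(outcome):
    """Hypothetical random variable: number of letters in the fruit's name."""
    return len(outcome)

# Probability distribution of X: its possible values and their probabilities.
distribution = {}
for outcome, p in model.items():
    distribution[X(outcome)] = distribution.get(X(outcome), 0) + p

# Probability of an event, here "X is at least 5".
p_at_least_5 = sum(p for value, p in distribution.items() if value >= 5)

print("total probability:", sum(model.values()))
for value in sorted(distribution):
    print(f"P(X = {value}) = {distribution[value]}")
print("P(X >= 5) =", p_at_least_5)
```

The probabilities of all outcomes add up to 1, as any probability model requires.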
11. Parameter: a number that describes the population. In statistical practice, the value of the parameter is not known because we cannot examine the whole population. Statistic: a number that describes a sample; it can be computed from the sample data without making use of any unknown parameters. In practice, we often use a statistic to estimate an unknown parameter.
Population distribution: the population distribution of a variable is the distribution of the values of the variable among all the individuals in the population.
Sampling distribution: the sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population. Mean
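
The definition of a sampling distribution can be made concrete by listing every possible sample from a tiny population. In this Python sketch the population of four values is made up, the samples have size 2 and are drawn without replacement, and the statistic is the sample mean x̄.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hypothetical population of four values (made-up numbers).
population = [2, 4, 6, 8]
mu = Fraction(sum(population), len(population))    # parameter: population mean

# All possible samples of size 2, and the statistic x-bar for each one.
samples = list(combinations(population, 2))
sample_means = [Fraction(sum(s), len(s)) for s in samples]

# Sampling distribution of x-bar: how often each value occurs over all samples.
sampling_distribution = Counter(sample_means)
mean_of_sample_means = sum(sample_means) / len(sample_means)

print("population mean:", mu)
for value in sorted(sampling_distribution):
    print(f"x-bar = {value}: {sampling_distribution[value]} of {len(samples)} samples")
print("mean of the sample means:", mean_of_sample_means)
```

In this example the sample means average out to the population mean, even though individual samples miss it.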
