School: University of Alberta

Department: Biology (Biological Sciences)

Course Code: BIOL208

Professor: Jessamyn Manson

Study Guide: Final

Preview: pages 1-3 of 9

LAB 1: Statistical analyses of sampling data

Intro: The Scientific Method

•Systematic method of inquiry

•Involves observations, development of hypotheses, collection of empirical data, and the testing of hypotheses.

Hypothesis development

•Hypothesis: supposition to explain an observed phenomenon.

•Must be falsifiable (can be refuted).

Different types of hypotheses

•Working hypothesis: used as part of the scientific method to execute experiments and then examine the results. It is provisionally accepted and used as a basis for further research.

•Null hypothesis (AKA statistical hypothesis): the statement that the phenomena you are examining are NOT related; results may be the product of random chance events.

Predictions

•Make a prediction and then test it.

•Your degree of confidence about the difference between control and treated groups depends on the:

1. Magnitude of difference between groups

2. Variability of the measurements within each group

Experimental and statistical design

•Samples: subsets of populations (statistical, not necessarily biological populations) for which a generalization about some attribute is desired.

•Catch a fraction of the population, selected at random from all available individuals, and extrapolate the measurements to the population as a whole.

Measures of central tendency and spread

•Frequency distribution can be graphed into a histogram.

•Mean (average)

•Mode (most common observation)

•Median (middle observation)

* If data fit a perfect normal distribution, mean = mode = median

•Range (largest value – smallest value)

•Variance ($s^2$)

* Samples from different populations may have the same central tendencies, but different variances.

•Standard Deviation (s)
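As a minimal sketch of the measures listed above, Python's standard `statistics` module can compute each one. The sample values are made up for illustration and are not lab data.

```python
import statistics

# Hypothetical sample: lengths (mm) of 9 specimens (made-up values).
lengths = [12, 15, 15, 16, 18, 19, 15, 21, 13]

mean = statistics.mean(lengths)            # arithmetic average
median = statistics.median(lengths)        # middle observation once sorted
mode = statistics.mode(lengths)            # most common observation
value_range = max(lengths) - min(lengths)  # largest value minus smallest value
variance = statistics.variance(lengths)    # sample variance s^2 (divides by n - 1)
std_dev = statistics.stdev(lengths)        # sample standard deviation s = sqrt(s^2)

print(mean, median, mode, value_range, variance, round(std_dev, 2))
```

Note that `statistics.variance` and `statistics.stdev` are the sample versions (dividing by n - 1), which is the usual choice when the data are a sample rather than the whole population.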

Parametric statistics and the normal distribution

•Parametric statistics assume that the frequency distribution of the population/sample conforms to a bell-shaped (Gaussian) distribution.

•68.3% of observations are within 1 SD of the mean, 95.4% are within 2 SD, and 99.7% are within 3 SD of the mean.

•A distribution is assumed to be normal if it has a single mode, is relatively symmetrical, and has mean ≈ median.

•A distribution is skewed if it has one tail longer than the other; the mean moves away from the median towards the long tail.
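The 68.3%, 95.4% and 99.7% figures quoted for the normal distribution can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `fraction_within` is only illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal population lying within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")
```

Because every normal distribution is a shifted and rescaled copy of the standard one, the same fractions apply whatever the mean and SD of the data.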

Standard error of the mean and confidence limits

•Confidence interval: the interval in which the true population mean is expected to lie, at the stated degree of certainty

$\mathrm{CI} = \bar{x} \pm t \cdot s_{\bar{x}}$

•Standard error of the mean: $s_{\bar{x}} = s / \sqrt{n}$

•Degrees of freedom = n – 1

•t comes from the t-table

•α = 1 – degree of certainty (usually 0.95), so α is usually 0.05.
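A minimal sketch of the confidence-interval calculation, assuming a made-up sample of n = 10 measurements. The standard error is computed as the sample standard deviation divided by the square root of n, and the critical t value is typed in by hand from a t-table (two-tailed, α = 0.05, 9 degrees of freedom) because the Python standard library has no t distribution.

```python
import math
import statistics

# Hypothetical sample of n = 10 measurements (made-up values).
sample = [4.1, 3.8, 4.4, 4.0, 3.9, 4.3, 4.2, 3.7, 4.1, 4.5]

n = len(sample)
mean = statistics.mean(sample)
s = statistics.stdev(sample)   # sample standard deviation
sem = s / math.sqrt(n)         # standard error of the mean
df = n - 1                     # degrees of freedom

# Two-tailed critical t for alpha = 0.05 and df = 9, read from a t-table.
T_CRITICAL = 2.262

lower = mean - T_CRITICAL * sem
upper = mean + T_CRITICAL * sem
print(f"mean = {mean:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

With a different sample size, the t value must be looked up again for the new degrees of freedom.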

The t-test:

Used to test whether or not the difference between two populations is real (statistically significant) or due to sampling (sampling error)

•Null hypothesis: there is no real difference between two populations

•P-value: the probability of obtaining results at least as extreme as yours due to chance alone if H0 is true.

•Display results using a bar graph (error bars represent the standard error of the mean)
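The notes do not say which form of the t-test is used, so the sketch below assumes the classic two-sample (Student's) t-test with equal variances in both groups. The data are made up, the function name `pooled_t` is illustrative, and the critical value is typed in from a t-table (two-tailed, α = 0.05, 8 degrees of freedom).

```python
import math
import statistics

def pooled_t(a, b):
    """Student's two-sample t statistic, assuming equal variances in both groups."""
    n1, n2 = len(a), len(b)
    var1, var2 = statistics.variance(a), statistics.variance(b)
    # Pooled variance: average of the two sample variances, weighted by df.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t, n1 + n2 - 2  # t statistic and degrees of freedom

# Hypothetical control and treated groups (made-up values).
control = [10, 12, 11, 13, 14]
treated = [15, 17, 16, 18, 19]

t_stat, df = pooled_t(control, treated)
T_CRITICAL = 2.306  # two-tailed, alpha = 0.05, df = 8, from a t-table
reject_h0 = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```

The statistic grows with the magnitude of the difference between the group means and shrinks with the variability within each group.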

ANOVA (single factor) – Analysis of variance

Used to test the null hypothesis that two or more samples are drawn from the same population (the sample means would be equal).

•F statistic: ratio of the variation between the group means to the variation within the groups
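The F ratio described above can be sketched by hand for a single-factor ANOVA. The three groups are made-up data, `one_way_f` is an illustrative name, and the critical value is typed in from an F-table (α = 0.05 with 2 and 6 degrees of freedom).

```python
import statistics

def one_way_f(groups):
    """F statistic for a single-factor ANOVA, with its two degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation between the group means (weighted by group size).
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k

# Hypothetical measurements from three samples (made-up values).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

f_stat, df_between, df_within = one_way_f(groups)
F_CRITICAL = 5.14  # alpha = 0.05, df = (2, 6), from an F-table
reject_h0 = f_stat > F_CRITICAL
print(f"F = {f_stat:.2f}, df = ({df_between}, {df_within}), reject H0: {reject_h0}")
```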

Sampling data types

•Measurement data: quantifies a characteristic that individual members of a population possess. Can be continuous or discrete.

•Enumeration data: involves classifying the state of individuals in a population; often requires analysis with different statistical techniques.

Chi-square goodness of fit test

Used to compare a given distribution of enumeration data with a theoretical/expected distribution.

•If the calculated chi-square value is larger than the critical value, reject H0.
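A minimal sketch of the goodness-of-fit calculation, assuming made-up counts for two classes and an expected 1:1 ratio. The statistic is the sum over classes of (observed - expected)² / expected, and the critical value is typed in from a chi-square table (α = 0.05, 1 degree of freedom).

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum of (observed - expected)^2 / expected over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical enumeration data: 100 individuals classified into two states.
observed = [60, 40]
expected = [50, 50]  # expected counts under a 1:1 ratio

chi2 = chi_square(observed, expected)
df = len(observed) - 1   # number of classes minus 1
CHI2_CRITICAL = 3.841    # alpha = 0.05, df = 1, from a chi-square table
reject_h0 = chi2 > CHI2_CRITICAL
print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {reject_h0}")
```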

Correlation

Used to see if two factors are related to each other (whether or not they are correlated).

•Relationships may be cause-and-effect, but a correlation alone cannot show this, so we should never assume it.

•Both variables may be responding to a common cause.

•The product-moment correlation coefficient (r) shows the strength of a linear association. -1 ≤ r ≤ +1

r = 0, no correlation

r = -1, perfect negative correlation

r = +1, perfect positive correlation
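The product-moment correlation coefficient can be sketched directly from its definition: the sum of cross-products of deviations divided by the square root of the product of the two sums of squares. The data and the function name `pearson_r` are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data showing the three reference cases from the notes.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))        # points on a rising line
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))        # points on a falling line
print(pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))  # positive but imperfect
```

The first two calls give the extreme values +1 and -1 (up to floating-point rounding); the third falls between 0 and +1.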

Linear regression

Used to establish the form and significance of a functional relationship between two variables. Objectively gives us a line that best fits the data.

•The coefficient of determination ($r^2$) measures the strength of the relationship (whether the data points fit closely to the line or deviate from it). 0 ≤ $r^2$ ≤ 1; values near 1 = strong linear relationship.

•The reliability of the equation is expressed by the probability of obtaining this linear relationship if H0 (no relationship between the two variables) were true.
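A minimal sketch of fitting the best-fit line by ordinary least squares, which is assumed here to be the method meant by "line that best fits the data". The data and the function name `least_squares` are illustrative, and the sketch stops at the slope, intercept and coefficient of determination; it does not compute the probability under H0.

```python
def least_squares(xs, ys):
    """Slope, intercept and coefficient of determination of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x   # the line passes through (mean_x, mean_y)
    r_squared = sxy ** 2 / (sxx * syy)    # fraction of variation in y explained by x
    return slope, intercept, r_squared

# Hypothetical paired observations (made-up values).
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 4, 8, 9]

slope, intercept, r_squared = least_squares(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}, r^2 = {r_squared:.3f}")
```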

Collecting data

•Data loggers for light and temperature

•Vernier calipers

LAB 2: Sampling, density estimation, and spatial relations

Introduction

•In order to generalize from a sample to a population, the sample must be representative:

1. Must be unbiased

2. Must be adequate in size

Choosing samples

•Random sample: every member of the population (every individual organism or every point of ground) has an EQUAL and INDEPENDENT probability of being included.

* Ensure randomness (avoid subconscious bias) – let chance determine samples. Can use random numbers generated on a calculator, computer, or a random number table.

•Systematic sample: samples are taken in some sort of systematic/regular arrangement. Usually simpler than random. Bias is only present if there is some sort of pattern in the population that lines up with the sampling arrangement.

•Selected samples: if you try to select only “representative” samples – may leave out extreme conditions

•Random sampling > systematic sampling.
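The difference between random and systematic sampling can be sketched with Python's standard `random` module. The 10 x 10 grid of quadrats, the sample size, and the seed are all assumptions made for illustration.

```python
import random

GRID_SIZE = 10   # hypothetical study area divided into a 10 x 10 grid of quadrats
N_SAMPLES = 10   # number of quadrats to sample

all_quadrats = [(row, col) for row in range(GRID_SIZE) for col in range(GRID_SIZE)]

rng = random.Random(208)  # fixed seed only so the sketch is repeatable

# Random sample: chance alone picks the quadrats, each equally likely,
# and no quadrat is picked twice.
random_sample = rng.sample(all_quadrats, N_SAMPLES)

# Systematic sample: a random starting point, then every 10th quadrat.
step = len(all_quadrats) // N_SAMPLES
start = rng.randrange(step)
systematic_sample = all_quadrats[start::step]

print("random:    ", sorted(random_sample))
print("systematic:", systematic_sample)
```

In this layout the systematic sample always falls in a single column of the grid, which illustrates how a regular arrangement can become biased if the population has a pattern running the same way.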

Adequacy of sampling

•Number and size of samples => accuracy and precision of the estimates.

•The larger the sample size, the more likely it is to be adequate.

•A large number of small or medium sized samples > a small number of large size samples.

•A simple, homogeneous area will require less sampling than a complex, heterogeneous one.

Judging sampling adequacy:

1. Performance curves

•Plots the cumulative mean value of some trait against the number of samples.

•Cumulative mean: the total number of objects encountered up to a given plot divided by the number of plots sampled so far.

•Once the change in the mean becomes very small with the addition of another sample, we assume that our sample mean ≈ the true population mean.
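A performance curve can be sketched by computing the cumulative mean after each added plot: the running total of objects divided by the number of plots sampled so far. The counts below are made up for illustration.

```python
from itertools import accumulate

# Hypothetical number of objects counted in each successive plot.
counts = [3, 7, 4, 6, 5, 5, 4, 6, 5, 5]

# Running total after each plot, divided by the number of plots so far.
cumulative_means = [total / (i + 1) for i, total in enumerate(accumulate(counts))]

# Change in the cumulative mean caused by adding each new plot.
changes = [abs(b - a) for a, b in zip(cumulative_means, cumulative_means[1:])]

for plot_number, value in enumerate(cumulative_means, start=1):
    print(f"plots = {plot_number:2d}  cumulative mean = {value:.2f}")
```

Plotting `cumulative_means` against the number of plots gives the performance curve; sampling is judged adequate once the values in `changes` stay very small.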

2. Two-step sampling
