Statistical Sciences
Statistical Sciences 2244A/B
Jennifer Waugh

1-1 Overview
● Data are observations (such as measurements, genders, survey responses) that have been collected
● Statistics is a collection of methods for planning experiments, obtaining data, and then organizing, summarizing, analyzing, interpreting, presenting, and drawing conclusions based on the data
● A population is the complete collection of all elements (scores, people, measurements, and so on) to be studied; the collection is complete in the sense that it includes all subjects to be studied
● A census is the collection of data from every member of the population
● A sample is a subcollection of members selected from the population
  ○ must be collected in an appropriate way; if it isn’t, the data may be completely useless

1-2 Types of Data
● A parameter is a measurement describing some characteristic of a population
● A statistic is a measurement describing some characteristic of a sample
● Quantitative data consist of numbers representing counts or measurements
  ○ Discrete data: the number of possible values is either finite or countable
  ○ Continuous data: infinitely many possible values that correspond to some continuous scale covering a range of values without gaps
● Qualitative (or categorical or attribute) data can be separated into different categories that are distinguished by some non-numeric characteristic
● A common way of classifying data is to use four levels of measurement: nominal, ordinal, interval, and ratio
  ○ Nominal: data consist of names, labels, or categories
    ■ should not be used for calculations, as they lack any ordering or numerical significance
  ○ Ordinal: data can be arranged in some order, but differences between values are meaningless or cannot be determined
    ■ provides information about relative comparisons but not magnitudes of differences
  ○ Interval: like ordinal, but the difference between data values is meaningful; however, data at this level don’t have a natural zero starting point (where none of the quantity is present)
  ○ Ratio: same as interval, but there is also a natural zero starting point; differences and ratios are both meaningful
    ■ called ratio because the zero starting point makes ratios meaningful

1-3 Design of Experiments
● Voluntary response sample (or self-selected sample): respondents themselves decide whether to be included
  ○ people with strong interests or opinions are more likely to participate, so the responses are often not representative
  ○ valid conclusions can only be made about the specific group of people who chose to participate; should not be used for making general statements about a population
● Statistical methods are driven by data, which are obtained from two sources: observational studies and experiments
● Observational study: observe and measure specific characteristics but don’t attempt to modify the subjects being studied
  ○ cross-sectional study: data are measured at one point in time
  ○ retrospective (case-control) study: go back in time to collect data
  ○ prospective (longitudinal or cohort) study: go forward in time and observe groups sharing common factors, such as smokers and nonsmokers
● Experiment: apply some treatment and then observe its effects on the subjects
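
The ideas in 1-1 through 1-3 (population vs. sample, parameter vs. statistic, and why a voluntary response sample can mislead) can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the population of 10,000 opinion scores on a 1-5 scale, the sample size of 200, and the response probabilities (0.5 for people with strong opinions, 0.05 for everyone else) are all hypothetical values chosen for illustration, not figures from the course.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 opinion scores on a 1-5 scale.
population = [random.randint(1, 5) for _ in range(10_000)]

# Parameter: a measurement describing the whole population
# (what a census of every member would give).
population_mean = statistics.mean(population)

# Statistic: the same measurement computed from a sample,
# here a simple random sample of 200 members.
random_sample = random.sample(population, 200)
sample_mean = statistics.mean(random_sample)


def responds(score):
    """Assumed self-selection rule: strong opinions (1 or 5) respond more often."""
    p = 0.5 if score in (1, 5) else 0.05
    return random.random() < p


# Voluntary response sample: each member decides whether to be included.
voluntary_sample = [s for s in population if responds(s)]


def share_extreme(scores):
    """Fraction of scores that are a strong opinion (1 or 5)."""
    return sum(1 for s in scores if s in (1, 5)) / len(scores)


print(f"population mean (parameter):      {population_mean:.2f}")
print(f"random sample mean (statistic):   {sample_mean:.2f}")
print(f"strong opinions in population:    {share_extreme(population):.0%}")
print(f"strong opinions in random sample: {share_extreme(random_sample):.0%}")
print(f"strong opinions in voluntary:     {share_extreme(voluntary_sample):.0%}")
```

Under these assumptions, the random sample should track the population reasonably well, while the voluntary response sample over-represents strong opinions. That is why conclusions from a voluntary response sample apply only to the people who chose to respond.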
