# BIOS 1500 Lecture 1: lecture_1_Introduction_2017

School: East Carolina University
Department: Biostatistics
Course: BIOS 1500
Professor: Kevin O'brien
Semester: Spring

Description
BIOS 1500: An Introduction to Biostatistics for the Health Sciences. Lecture 1, Spring 2017.

## Text

The text for this course is The Practice of Statistics in the Life Sciences, 3rd Edition, by Brigitte Baldi and David S. Moore, published by W. H. Freeman and Company, New York, NY, 2013.

## Alternative Text

The text is not required, and previous editions are acceptable. Another text you might read is Introduction to the Practice of Statistics, 6th Edition, by David Moore, George McCabe, and Bruce Craig, published by W. H. Freeman and Company, New York, 2009.

## Introduction

Definition: Statistics is the branch of mathematical science concerned with the collection, organization, and interpretation of numerical or alphabetically coded information. Within this definition, the term interpretation includes making inferences from the observed data.

The term data is a synonym for numerical or alphabetically coded information. Another term that implies numerical information is measurement. In general, data refers to a collection of measurements that often share a common relational component (for example, being made on the same persons or objects for a specific health reason or research project).

## Statistics and Uncertainty

Typically one has a set of observations, a sample, that is believed to represent a larger group of observations, the population. The goal is to infer from the observed data to the unobserved larger group. There is some uncertainty in the inference, since we have only a sample and not all possible data. Uncertainty is measured using the ideas of probability and variability.

Another source of uncertainty is the measurement process used to obtain the data or observations. There is always some form of error in the measurements that adds to the uncertainty of the values. We will label this Errors of Measurement. It is often a small component and something that can be controlled in the research design.
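The idea that a sample only approximately represents its population can be illustrated with a small simulation. The sketch below is not part of the lecture. The population of heights, its mean and spread, the sample size of 25, and the random seed are all made-up assumptions chosen only for illustration.

```python
import random
import statistics

# Hypothetical population: 10,000 heights in cm (made-up values,
# assumed for illustration only; not data from the course).
rng = random.Random(2017)
population = [rng.gauss(170, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)


def sample_mean(n):
    """Mean of a simple random sample of n heights, drawn without replacement."""
    return statistics.mean(rng.sample(population, n))


# Draw several samples of the same size. Each one gives a slightly
# different estimate of the population mean; that spread is the
# uncertainty that comes from observing a sample instead of everyone.
estimates = [sample_mean(25) for _ in range(5)]

print(f"population mean: {population_mean:.1f}")
for i, est in enumerate(estimates, start=1):
    print(f"sample {i} mean: {est:.1f} (off by {est - population_mean:+.1f})")
```

In a real study the population mean is unknown and only one sample is available, so the gap between the estimate and the truth cannot be observed directly. It has to be described using probability and variability.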
Another way statisticians represent uncertainty is through the concept of variability. Consider the heights of all persons in the class: not everyone is the same height. The heights vary among persons, and this variation is a source of uncertainty in our observations. This form of uncertainty occurs naturally and is referred to as natural variation or innate variation, as opposed to errors of measurement.

Statisticians use the Greek letter sigma (σ) to represent variability (uncertainty). The overall uncertainty or variability is due to natural variation, sampling, and measurement. Though we often do not break out the components of variability, it is good to keep them in mind.

The number of items in the sample is often denoted by a lowercase letter n. This number, n, is a measure of the raw information we have in a study. An adjusted form of the statistical information in the sample is the ratio of the number of observations to the uncertainty in the data. The inverse of the information is also useful, as it indicates how we can control the level of uncertainty by increasing the sample size.

## Biostatistics

Biostatistics is a specialized discipline of statistics that deals with statistical applications in the biological and health sciences. The design of health surveys, clinical trials, vital statistics, cancer survivorship studies, and biological field studies are some specific biostatistical applications.

## Collection of Information

Collection deals with exactly what you would think: how the data are obtained. Methods of sampling and research designs are aspects of collection. These often involve randomization, as in ‘randomized clinical trials’, or random sampling (also known as scientific sampling).

## Organization and Presentation of Data

Organization: this aspect may refer to how the data are stored in a database.
However, it mostly refers to presentation of the data, so as to efficiently and effectively bring out the information content in the data. This area of statistics is often referred to as descriptive statistics. It also includes aspects of data management and coding.

## Inference from Data

Inference is making a generalization from a few specific measurements (sample) to a larger set of measurements (population). It is a statement that goes beyond the given data, hence the uncertainty. For example, you observe 10 brown cows and infer that all cows are brown, or at least that the majority of them are brown. This goes beyond the 10 cows of the sample to a statement about all cows.

It is the nature of most problems that not every item or person in the world can be measured for a study. The solution is typically to observe a sample, or subset, of all possible persons or items. From this subset an inference is made to the entire collection, or population, of persons or items.

## Inference tied to Collection

How the sample is selected is one of the key aspects regarding the statistical coll
