Chapter 2

Chapter 2- Statistics 1131.docx


York University
Mathematics and Statistics
MATH 1131
Cindy Fu

Chapter 2

2.1

Definitions 1:
-Univariate: a data set consisting of observations of only a single characteristic of the individuals/objects.
-Bivariate: a data set consisting of observations of two characteristics of the individuals/objects.
-Multivariate: a data set consisting of observations of more than two characteristics of the individuals/objects.

Definitions 2:
-Categorical/qualitative data set: consists of non-numerical observations that may be placed in categories.
-Numerical/quantitative data set: consists of observations that are numbers.

Examples:
1-Sneaker preference: A sample of 12 people is asked what their favorite brand of sneakers is. This is a qualitative data set, since the responses are brand names (Nike, Adidas, etc.). It is a non-numerical, univariate data set.
2-Egg weights: Suppose a new enzyme is tested and 20 eggs are randomly selected and weighed to test the nutritional benefits of the enzyme. The resulting weights are recorded in a table. Since the observations are numerical, this is a univariate quantitative data set.

Definitions 3:
-Discrete data: data whose possible values are finite or countable, so they can be listed one by one. It is recognized by the word "counting". Ex: The number of lightning strikes in Ontario in one day can be 5 but not 5.5, and you can count them.
-Continuous data: data whose values can fall anywhere in an interval. It is recognized by the word "measuring". Ex: Barometric pressure can be any value between 960 and 1070 mmHg. It can be 970.67.

Classification of univariate data:
Univariate data
  -Categorical data
  -Numerical data
    -Discrete data
    -Continuous data

Questions: classify the following as categorical or numerical. If numerical, then classify as discrete or continuous.
1-The number of books read by middle-school students during the academic year. You are counting the number of books. Number = numerical, and counting = discrete.
2-The length of time (in minutes) it takes to get a haircut. Time is numerical, and it is measured, since a haircut can take 15.4 minutes. Therefore, it is continuous.
3-The type of candy received at a house on Halloween. Type = quality = categorical. Therefore, this is a categorical data set.

2.2

Definitions:
-Frequency distribution for categorical data: a summary table that presents categories, counts, and proportions. Refer to Table 2.1 on page 22 of the textbook.
-Class: the label of each category in a categorical data set.
-Frequency: the count for each class.
-Relative frequency/sample proportion: the frequency of the class divided by the total number of observations.

Examples:

Class           Frequency   Relative frequency
Bahamas         2           2/25 = 0.08
Bermuda         4           4/25 = 0.16
Caribbean       6           6/25 = 0.24
Mediterranean   3           3/25 = 0.12
Southampton     10          10/25 = 0.40
Total           25          1.00

1-What is the proportion of cruise ships that did not go to Southampton? 0.40 went to Southampton, so 1 - 0.40 = 0.60 did not go to Southampton.
2-Draw a bar graph for the above table.
[Bar graph: the classes Bahamas, Bermuda, Caribbean, Mediterranean, and Southampton on the x-axis; frequency (0 to 12) on the y-axis.]
The key here is that the classes are on the x-axis and the frequency is on the y-axis.
3-Draw a pie chart.
[Pie chart: one slice each for Bahamas, Bermuda, Caribbean, Mediterranean, and Southampton.]
You take the relative frequency of a class and multiply it by 360 degrees to get the angle (size) of its slice. For example, Southampton gets 0.40 x 360 = 144 degrees. Each piece of the pie is a class.

2.3

Definitions:
-Outliers: values that are very far from the rest of the data.
-Variability: refers to the spread or compactness of the data. Data that is crowded together has little variability.

A stem-and-leaf plot is a graphical procedure used to describe the shape, centre, and variability of the distribution of numerical data.

How to draw a stem-and-leaf plot:
Each value is split into a stem and a leaf. For example, 520 has stem 52 and leaf 0.

Stem | Leaf
46 | 6
47 |
48 |
49 | 8 7 3 8 7
50 | 2 8 2 6 4 1 5 1
51 | 5 3 1 3 2
52 | 0 2 5 3 7 3
53 | 3
54 | 0 8 8 4
55 | 7 6 7
56 | 7 4
57 | 0 0 2 0 4
58 | 9 5
59 | 8 7 6
60 | 1 9 4 4
61 | 2

- For 520, the 0 is placed in the 52 stem row.
- Data is organized so that one digit (the leaf) is on the right and the rest (the stem) is on the left, with up to 2 digits on the left. If, for example, you have 502 and 503, then the 2 and the 3 are in the same stem row (50).
- Notice that the 2-digit stems are organized in increasing order.
- The centre of the data = typical value = value in the middle = where the data is clustered. Here it is in stem row 52 or 53.
- The outlying value here is 466, since it is far from the data cluster.
- A data set can be described in terms of its variability (spread) and its outliers.

Real World Analogy: -In
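The calculations in 2.2 and 2.3 can be sketched in Python. This is only an illustration under stated assumptions, not part of the course notes: the function names relative_frequencies and stem_and_leaf are hypothetical, the cruise frequencies are copied from the table in 2.2, and the sample list is a small subset of values read off the stem-and-leaf plot in 2.3 rather than the full data set. The slice angle is computed as 360 times the relative frequency, and every value is assumed to be a non-negative integer whose last digit is the leaf.

```python
from collections import defaultdict


def relative_frequencies(freq):
    """Map each class to (relative frequency, pie-slice angle in degrees)."""
    total = sum(freq.values())
    return {cls: (count / total, 360 * count / total) for cls, count in freq.items()}


def stem_and_leaf(values):
    """Split non-negative integers into (stem, leaves) rows.

    The last digit is the leaf; the remaining digits form the stem.
    Leaves stay in the order the data arrives, as in the plot in the notes.
    Assumes the input list is not empty.
    """
    rows = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    # List every stem from smallest to largest, including empty ones,
    # so that gaps (and therefore outliers) stay visible.
    return [(stem, rows.get(stem, [])) for stem in range(min(rows), max(rows) + 1)]


# Frequencies copied from the cruise table in 2.2.
cruises = {"Bahamas": 2, "Bermuda": 4, "Caribbean": 6, "Mediterranean": 3, "Southampton": 10}
for cls, (proportion, angle) in relative_frequencies(cruises).items():
    print(f"{cls:<14}{proportion:>6.2f}{angle:>8.1f}")

# A few values readable from the plot in 2.3 (a subset, not the full data set).
sample = [520, 466, 498, 497, 502, 508, 533]
for stem, leaves in stem_and_leaf(sample):
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Keeping the empty stem rows (47 and 48 in this sample) is a deliberate choice: the gap between 466 and the rest of the values is what makes the outlier stand out in the printed plot.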