STAT 2060 Lecture Notes - Lecture 24: Shot Put, Summary Statistics, Informa
STAT*2060: Statistics for Business Decisions
Data Analysis Project #2
Important information and instructions
•The deadline for this project is Friday, November 24 at 11:59pm. Late submissions are not
accepted, and will receive a grade of 0.
•There are three questions; you must complete all of them. The breakdown of marks for each
question is as follows:
–Question 1: 5 marks
–Question 2: 10 marks
–Question 3: 10 marks
•This project is worth 6% of your final grade. You will be assessed on:
–getting the proper Excel output and plots (note: you must use Excel for this project!)
–validity of your statistical conclusions and interpretations
–writing style, including spelling and grammar
•Final report formatting and submission instructions are found at the end of this document. Please
read this information carefully! Projects that do not follow the correct format, or that are not
submitted properly, will not receive full marks.
1. The distances (in metres) of throws by 20 randomly selected senior male athletes who competed in
the shot-put in June 1991 are recorded in the file Shotput Data.xlsx, available on Courselink.
Download this data and use it to conduct the following analysis:
(a) Create a histogram of the data, using the instructions found in Data Analysis Project #1.
Be sure to include an appropriate title and axis labels on your plot. Comment
on the shape of the data. Your comments can include things such as symmetry, presence of
outliers, etc. Include both the histogram and your commentary in your final report.
(b) Use Excel to calculate the summary statistics of mean and standard deviation for the data
set. Include these values in your final report. You are not required to show formulas, only
the final values.
(c) Calculate a 95% confidence interval for the true mean distance of shot-put throws by senior
male athletes. What assumptions are required for the conclusions from this procedure to be
valid? Was it reasonable for you to make these assumptions? Include the confidence interval,
an appropriate interpretation, and any comments on assumptions in your final report. You
need to show enough work that someone reading your report can reasonably follow your train
of thought.
(d) The standard deviation you calculated previously is for data sampled in 1991. Let’s assume
we can take this standard deviation to be the population standard deviation of distances of
shot-put throws by male athletes competing today (ignoring any effect of improved training,
equipment, etc.). Suppose you wanted to estimate the mean distance of shot-put throws by
male athletes, with a margin of error of 0.25 metres and 95% confidence. How many athletes
would you have to sample? You need to show enough work that someone reading your report
can reasonably follow your train of thought.
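The project itself must be completed in Excel, but the arithmetic behind parts (b) to (d) can be cross-checked with a short script. The following is a minimal Python sketch, not part of the official instructions. It assumes a t-based interval with n - 1 degrees of freedom for part (c) and the usual formula n = (z·σ/E)², rounded up, for part (d). The list `throws` holds made-up placeholder values (not the contents of Shotput Data.xlsx), and the names `t_interval`, `sample_size` and `T_CRIT_DF19` are illustrative only.

```python
import math
from statistics import NormalDist, mean, stdev

# Placeholder distances in metres. These are NOT the values in
# Shotput Data.xlsx; substitute the 20 real observations.
throws = [15.0, 16.0, 17.0, 18.0] * 5

# Two-sided 95% critical value of the t distribution with 19 degrees
# of freedom (n = 20), taken from a standard t table.
T_CRIT_DF19 = 2.093


def t_interval(data, t_crit):
    """Return (lower, upper) for the interval mean +/- t_crit * s / sqrt(n)."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 denominator)
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


def sample_size(sigma, margin, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin, treating sigma as known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


print("mean:", mean(throws))
print("standard deviation:", stdev(throws))
print("95% CI:", t_interval(throws, T_CRIT_DF19))
print("n for a 0.25 m margin:", sample_size(stdev(throws), 0.25))
```

The sample size is rounded up rather than to the nearest integer, because rounding down would give a margin of error slightly larger than the one requested.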
2. This question is based on data found in the article:
Ondogan, Ziynet; Pamuk, Oktay; Ondogan, Ece Nuket; and Ozguney, Arif. (2005). Improving
the appearance of all textile products from clothing to home textile using laser technology. Optics
& Laser Technology 37: 631 - 637.
You will need to obtain a copy of this article through the University of Guelph Library. The data
from this article can be found in the file BlueJeans Data.csv, available on Courselink. Download
this file, and perform the following analysis. (Note: you may wish to first save the file as a .xlsx
file.)
(a) Review the article so that you are familiar with the experiment that was conducted.
(b) First, you will need to re-organize the data set so that it appears like Table 1 from the paper.
To do this:
•Cut all the rows/columns that correspond to sampleid equal to 2, and paste them beside
the rows/columns that correspond to sampleid equals 1. This essentially means you are
cutting rows #42 - #81 inclusive, and pasting them starting in cell F2.
•Repeat this for all the rows/columns corresponding to sampleid equals 3.
•Cut and paste the column labels found in Row 1, A - E, to columns F - J and again from
K - O.
•You can then delete the repeated columns method, jeanid and sampleid found in
Columns F - H and K - M.
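The cut-and-paste steps above amount to turning a "long" table (one block of rows per sampleid) into a "wide" one (one block of columns per sampleid). The reshaping must be done in Excel for the project; the sketch below only illustrates the idea in Python. The function name `side_by_side`, the measurement column names `m1` and `m2`, and the tiny data set are all hypothetical. Only method, jeanid and sampleid are column names taken from the instructions. Like the manual procedure, it assumes every sampleid block has the same number of rows in the same order.

```python
from collections import defaultdict


def side_by_side(header, rows, key="sampleid",
                 id_cols=("method", "jeanid", "sampleid")):
    """Place the row blocks for each value of `key` next to each other.

    The first block keeps all of its columns; later blocks keep only
    the columns not listed in `id_cols`, mirroring the deletion of the
    repeated identifier columns.
    """
    k = header.index(key)
    groups = defaultdict(list)
    for row in rows:
        groups[row[k]].append(row)
    blocks = [groups[g] for g in sorted(groups)]

    # Indices of the columns that are not identifiers.
    keep = [i for i, name in enumerate(header) if name not in id_cols]

    wide_header = list(header) + [header[i] for i in keep] * (len(blocks) - 1)
    wide_rows = []
    for parts in zip(*blocks):  # one row from each block, taken in order
        first, rest = parts[0], parts[1:]
        wide_rows.append(list(first) + [r[i] for r in rest for i in keep])
    return wide_header, wide_rows


# Hypothetical miniature data set: 3 samples x 2 jeans, 2 measurements each.
header = ["method", "jeanid", "sampleid", "m1", "m2"]
rows = [["x", j, s, s * 10 + j, s * 100 + j]
        for s in (1, 2, 3) for j in (1, 2)]

wide_header, wide_rows = side_by_side(header, rows)
print(wide_header)
for row in wide_rows:
    print(row)
```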