STAB22 Exam Fall 2007.pdf

University of Toronto Scarborough
Olga Chilina

University of Toronto Scarborough
STAB22 Final Examination
December 2007

This examination is multiple choice. Ensure that you have a Scantron answer sheet and a #2 pencil, and complete the Scantron sheet according to the instructions (otherwise your exam may not be marked).

For this examination, you are allowed two letter-sized sheets of notes (both sides), handwritten and prepared by you, a non-programmable, non-communicating calculator, and writing implements.

If your answer is not included in the alternatives given, mark the answer that is most nearly correct from those alternatives.

If you need paper for rough work, use the back of the sheets of this question paper. The question paper will be collected at the end of the examination, but any writing on it will not be read or marked.

This examination has 23 numbered pages; before you start, check to see that you have all the pages.

1. One part of the stock market is called the "over-the-counter market". One way of measuring activity in a stock market is by the percentage of outstanding shares traded. On a particular day, the results for 40 shares were as shown in the stemplot below:

    Stem-and-leaf of C1  N = 40
    Leaf Unit = 1.0

    (22)  0  2222222222333333333333
     18   0  444444455555
      6   0  66777
      1   0
      1   1
      1   1
      1   1
      1   1
      1   1
      1   2
      1   2  2

What is the median of this distribution?
(a) * 3
(b) 0.3
(c) 30
(d) 2.2
(e) 6

2. In the data of Question 1, how do the mean and median compare?
(a) * The mean is bigger than the median
(b) The median is bigger than the mean
(c) The mean and median are about the same
(d) It is impossible to compare the mean and median from this information.

3. The pie chart below displays the distribution of grades in a statistics course. Note that in this course, the performance of the students is graded into one of the five grades, A, B, C, D and F.
One (and only one) of the five bar charts below was constructed from the same data set used to create the above pie chart.
Which of the following bar charts represents the same data as the pie chart above?
(a) This bar chart:
(b) * This bar chart:
(c) This bar chart:
(d) This bar chart:
(e) This bar chart:

4. The MINITAB output below gives the stemplot of the GPAs of a group of 78 students in a class.

    Stem-and-Leaf Display: GPA

    Stem-and-leaf of GPA  N = 78
    Leaf Unit = 0.10

      1    0  5
      2    1  7
      3    2  4
      7    3  4689
     11    4  0678
     15    5  0259
     22    6  0001249
    (22)   7  1122344555566668888999
     34    8  001111223378899
     19    9  011133445555679
      4   10  1577

This class actually had 82 students, but the GPAs of four of them were not available at the time the above stemplot was constructed. Later, it was found that the GPAs of these four students are 5.9, 7.1, 10.5, and 9.1. Which of the following numbers is the closest to the interquartile range (i.e. IQR) of the GPAs of all 82 students?
(a) 2.0
(b) 2.3
(c) 2.6
(d) * 2.9
(e) 3.2

5. There are five children aged 3, 3, 4, 5 and 5 years in a room. If another 4-year-old child enters the room, what will happen to the mean and standard deviation of the ages of the children in the room?
(a) The mean will stay the same but the standard deviation will increase.
(b) * The mean will stay the same but the standard deviation will decrease.
(c) The mean and standard deviation will both stay the same.
(d) The mean and standard deviation will both decrease.
(e) The mean and standard deviation will both increase.

6. The histogram given below shows the distribution of survival times in days of 89 guinea pigs after they were injected with an experimental substance in a medical experiment. Which of the following statements regarding the median survival time is true?
(a) The median survival time is less than 50 days.
(b) The median survival time is greater than 50 days but less than (or equal to) 95 days
(c) * The median survival time is greater than 95 days but less than (or equal to) 155 days
(d) The median survival time is greater than 155 days but less than (or equal to) 300 days
(e) The median survival time is greater than 300 days

7. A supermarket chain studied the times required to service customers. The values, in minutes, are shown in the boxplot below. What is the inter-quartile range of service times?
(a) * 1.1 minutes
(b) 0.4 minutes
(c) 0.7 minutes
(d) 3 minutes
(e) greater than 5 minutes

8. Gasoline use for compact cars sold in the United States has a normal distribution with mean 32 miles per gallon, and SD 5 miles per gallon. What proportion of compact cars obtain 40 miles per gallon or higher?
(a) * 0.05
(b) 0.95
(c) 0.15
(d) 0.85
(e) 0.30

9. Consider again the situation of Question 8. When gasoline is scarce, there is a competitive advantage in developing a car that has better (higher) miles per gallon than 90% of the current compact car market. What must the gasoline consumption (in miles per gallon) be for this new car?
(a) * 38.4
(b) 25.6
(c) 32.0
(d) 35.2
(e) 28.8

10. The time it takes (for any student) to complete a STAB22 final exam is a random variable having a normal distribution with mean 160 minutes and standard deviation 15 minutes. Anne, Bob, Clara and Dave are four friends writing this exam. What is the probability that at least one of them will complete the exam in less than 145 minutes? (Assume that their completion times are independent.)
(a) Less than 0.01
(b) Between 0.01 and 0.35
(c) Between 0.35 and 0.45
(d) * Between 0.45 and 0.55
(e) Greater than 0.55

11. The "running economy" of a runner is the oxygen consumption when that runner runs at a standardized speed. It is believed that a runner's finishing time in a 10 km race will be related to the running economy. A scatterplot of running economy and 10 km finishing time is shown below.
It is proposed to fit a straight line to these data. Which comment below is most appropriate?
(a) * The relationship is not very strong, but it appears roughly linear.
(b) A straight line describes this relationship very well, and the correlation will be very high.
(c) The relationship is clearly curved, and so fitting a straight line is a bad idea.
(d) This residual plot clearly does not show a random pattern, and therefore a straight line should not be used for these data.
(e) We cannot use regression techniques because of the outliers.

12. In the situation described in Question 11, you are asked to predict the finish time for a runner with running economy 40. What is your reaction to this request?
(a) * A running economy value of 40 is very untypical for these data, and so the finishing time may not be predicted accurately.
(b) Because a straight line describes the data, it is appropriate to use the regression equation to do this prediction.
(c) The average finishing time is 33.8 minutes, and so this is our best guess at the finishing time for this runner.
(d) A straight line is not appropriate here, so we cannot produce a prediction at all.

13. A study was made of the infestation of a certain type of lobster by two different types of barnacle, "tridens" and "lowei". It is believed that these two types of barnacle compete for space on the surface of the lobster. A sample of 10 lobsters is taken, and the number of barnacles of each type on each lobster is counted. A scatterplot of the results is shown below. What can you say about the correlation between the number of tridens and the number of lowei?
(a) * It is definitely negative.
(b) It is definitely positive.
(c) It is close to zero.
(d) It is close to +1.
(e) It is close to -1.

14. Wood scientists are interested in replacing solid-wood building material by less expensive products made from wood flakes.
The following Minitab outputs were obtained from a study of the relationship between the length (in inches) and the strength (in pounds per square inch) of beams made from wood flakes.

    Descriptive Statistics: Length, Strength

    Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median      Q3
    Length    10   0  9.500    0.957  3.028    5.000  6.750   9.500  12.250
    Strength  10   0  291.3     22.5   71.2    234.0  242.8   251.5   343.3

    Regression Analysis: Strength versus Length

    The regression equation is
    Strength = 488 - 20.7 Length

    Predictor     Coef  SE Coef      T      P
    Constant    488.38    38.85  12.57  0.000
    Length     -20.745    3.914  -5.30  0.001

Which of the following numbers is the closest to the value of the correlation between the length and strength?
(a) 0.8
(b) -0.6
(c) -0.7
(d) -0.8
(e) * -0.9

15. In the study of the relationship between the length and the strength in Question 14 above, if we change the units of length to centimeters and the units of strength to kilograms per square centimeter, what will be the slope of the least squares regression line for predicting strength (in kilograms per square centimeter) from length (in cm)? Choose the closest answer from the alternatives below. (Assume 1 inch = 2.54 cm, 1 pound per square inch = 0.07 kg per square centimeter.)
(a) * -0.5
(b) -1.5
(c) -8.0
(d) -20.7
(e) -53.2

16. A study was carried out to assess the effects of sleep deprivation on subjects' ability to perform simple tasks. Each subject was deprived of a pre-determined amount of sleep between 8 and 24 hours, and then given a standard set of addition problems to solve; the number of errors was recorded. A regression was carried out to predict the number of errors from the number of hours of sleep deprivation, as shown below.

    The regression equation is
    errors = 3.00 + 0.475 hours

    Predictor    Coef  SE Coef     T      P
    Constant    3.000    2.127  1.41  0.196
    hours      0.4750   0.1253  3.79  0.005

What is the predicted number of errors for a subject who is deprived of 20 hours of sleep?
(a) * 12.5 errors
(b) 3.0 errors
(c) 20.0 errors
(d) 8.0 errors
(e) cannot predict because we are extrapolating

17. The following MINITAB output was obtained from a study of the relationship between the salary (in thousands of dollars) and length of service (in years) based on a random sample of 25 employees from a large firm.

    Regression Analysis: Salary versus Length

    The regression equation is
    Salary = 20.3 + 0.915 Length

    Predictor    Coef  SE Coef      T      P
    Constant   20.332    1.851  10.98  0.000
    Length     0.9153   0.1767   5.18  0.000

    S = 3.44985   R-Sq = 53.8%   R-Sq(adj) = 51.8%

Based on the above, which one of the following statements is true?
(a) The above regression model explains more than 75% of the variability in salaries of these employees.
(b) The distribution of residuals is left skewed.
(c) The residual plots above clearly indicate the need for a higher-order model.
(d) * The absolute value of the residual for an observation in this sample with Length = 16 and Salary = 41 is greater than 5.0.
(e) The correlation between Length and Salary is 0.538.

18. Using the information in Question 17 above, if the average length of service of the 25 employees is 9.72 years, what is the average salary of these 25 employees?
(a) it will be less than $21 000
(b) * it will be greater than $21 000 but less than $31 000
(c) it will be greater than $31 000 but less than $41 000
(d) it will be greater than $41 000
(e) it cannot be determined from the information given.

19. The regression analysis in Question 17 above is based on 25 observations. The residuals for these 25 observations were calculated (not shown in the output). The sum of 20 of these values (i.e. 20 residuals) was -6.08. What will be the sum of the remaining 5 residuals? (You may assume that the answer is not exactly equal to any of -10, -5, 0, 5 or 10.)
(a) between -10 and -5
(b) between -5 and 0
(c) between 0 and 5
(d) * between 5 and 10
(e) none of the above intervals contain this value

20. A plot is made of the residuals from a regression against the explanatory variables. In order for you to conclude that the regression is satisfactory, what are you looking for on the residual plot?
(a) * no pattern at all
(b) a straight-line pattern
(c) some pattern, but not necessarily a straight line
(d) no outliers

21. A residual plot was made as described in Question 20. In the residual plot, which of the following would indicate that a straight-line relationship is not satisfactory in the regression?
(a) * a curved pattern
(b) no pattern at all
(c) a "fan-out" pattern with larger values of the explanatory variable going with large positive or large negative residuals
(d) some pattern not given above

22. A researcher believes that a certain meditation technique lowers people's anxiety level. The researcher collects a random sample of subjects and divides them, at random, into two groups. The subjects in group 1 are taught how to meditate, and are given frequent practice in meditating. The subjects in group 2 spend an equivalent amount of time in quiet relaxation. At the end of the study, all the subjects are given a standard test of their anxiety, and the subjects in group 1 had a statistically significantly lower anxiety level. What do you conclude from this study?
(a) * This experiment provides convincing evidence that meditation causes lower anxiety.
(b) This observational study suggests that meditation is associated with lower anxiety, but does not offer convincing evidence.
(c) This study offers anecdotal evidence only, and so is no proof of anything.
(d) Group 2 in this study was unnecessary, because quiet relaxation is obviously associated with lower anxiety.

23. A survey is taken on a healthcare issue, where opinion is likely to be different between young and old people. A simple random sample is taken of young people and a separate simple random sample is taken of older people, and the responses to the survey question are combined.
It is desired to make a confidence interval for the proportion of all people who agree with the statement made in the survey. Is it possible to use the methods of this course to construct the confidence interval?
(a) * No, because this is a stratified sample, and our methods only apply to simple random samples.
(b) Yes, because random samples were taken.
(c) Yes, because the samples don't have to be random for our methods to be used.
(d) No, because a systematic sample is used here, and our methods only apply to simple random samples.
(e) No, because simple random samples were used here, and our methods apply to some other kind of sampling.

24. A multiple-choice exam consists of 30 multiple-choice questions. Each question has five possible responses, of which only one is the correct answer. Each correct response carries 5 marks and each incorrect response loses 1 mark (scores a negative mark). A question that is left unanswered automatically scores 0 marks. (Subtracting marks for incorrect responses is known as a "correction for guessing" and is designed to discourage test takers from choosing answers at random.) A totally unprepared student answers all 30 questions by just selecting one of the five answers at random. Find the mean of his total score on this exam. Choose the closest answer from the options below.
(a) * 6
(b) 0
(c) 3
(d) 12
(e) -3
(KB: I did some serious editing here. (i) having integer numbers of points for correct and wrong answers makes the calculation a lot less prone to error; (ii) the question is difficult enough just figuring out the mean, and with two numbers, "closest" isn't meaningful.)

25. What is the probability that the observed value of a binomial random variable with 5 trials and success probability 0.8 will be 4 or more?
(a) * 0.74
(b) 0.26
(c) 0.007
(d) 0.90
(e) 0.41

26. What is the approximate probability that the observed value of a binomial random
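
The starred answers to the purely computational questions (8, 9, 10, 14, 15, 16, 17(d), 18, 24 and 25) can be checked numerically. The following is a minimal sketch in Python using only the standard library; it is not part of the examination, and every variable name in it is an illustrative choice. It assumes the exact normal distribution rather than the rounded table values a student would use in the exam, and for Question 14 it relies on the standard identity r = slope * s_x / s_y for a least-squares line.

```python
import math
from statistics import NormalDist

# Questions 8 and 9: gasoline use ~ Normal(mean 32 mpg, SD 5 mpg).
mpg = NormalDist(mu=32, sigma=5)
p_40_or_more = 1 - mpg.cdf(40)       # Q8: proportion at 40 mpg or higher
mpg_90th = mpg.inv_cdf(0.90)         # Q9: value that beats 90% of cars

# Question 10: completion time ~ Normal(160, 15), four independent students.
exam_time = NormalDist(mu=160, sigma=15)
p_one_under_145 = exam_time.cdf(145)
p_at_least_one = 1 - (1 - p_one_under_145) ** 4   # complement of "none of four"

# Question 14: correlation recovered from slope and the two standard deviations.
r_length_strength = -20.745 * 3.028 / 71.2

# Question 15: slope after converting psi -> kg/cm^2 and inches -> cm.
slope_metric = -20.745 * 0.07 / 2.54

# Question 16: predicted errors after 20 hours of sleep deprivation.
errors_at_20 = 3.000 + 0.4750 * 20

# Question 17(d): residual = observed - fitted at Length = 16, Salary = 41.
residual_17 = 41 - (20.332 + 0.9153 * 16)

# Question 18: the least-squares line passes through (mean x, mean y).
mean_salary = 20.332 + 0.9153 * 9.72   # in thousands of dollars

# Question 24: each guess scores +5 with prob. 1/5 and -1 with prob. 4/5.
mean_total_score = 30 * (5 * 0.2 + (-1) * 0.8)

# Question 25: P(X >= 4) for X ~ Binomial(n = 5, p = 0.8).
def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

p_4_or_more = binom_pmf(4, 5, 0.8) + binom_pmf(5, 5, 0.8)

if __name__ == "__main__":
    for name, value in [
        ("Q8  P(X >= 40)", p_40_or_more),
        ("Q9  90th percentile", mpg_90th),
        ("Q10 P(at least one < 145)", p_at_least_one),
        ("Q14 correlation", r_length_strength),
        ("Q15 metric slope", slope_metric),
        ("Q16 predicted errors", errors_at_20),
        ("Q17 residual", residual_17),
        ("Q18 mean salary", mean_salary),
        ("Q24 mean total score", mean_total_score),
        ("Q25 P(X >= 4)", p_4_or_more),
    ]:
        print(f"{name}: {value:.4f}")
```

Each printed value can be compared with the alternatives listed under the corresponding question. Questions that depend on the missing figures (the pie chart, histogram, boxplot and scatterplots) cannot be checked this way.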