Illinois State University Mathematics Department
MAT 312: Probability and Statistics for Middle School Teachers
Dr. Roger Day (day@ilstu.edu)
Test #2 Possible Solutions 
Scoring
 Part I: 25 Multiple Choice Questions (1 pt each)
 Part II: 2 Open-Response Questions (25 pts total)
 Total: 50 points
 Impact on Course Grade: 20% of your Grade
Criteria Used to Evaluate Part II Responses
26: 15 points
 a) 2 pts: accurate scatter plot
 b) 3 pts: correct median-median points, identified on scatter plot and listed as ordered pairs
 c) 6 pts
 (i) and (ii): 2 pts each: accurate interpretation, clearly expressed
 (iii): 2 pts: correct numerical response to nearest tenth of a pound
 d) 2 pts: correct least-squares equation
 e) 2 pts: correct numerical response to nearest hundredth
27: 10 points
 a) 2 pts: correct equation
 b) 1 pt: correct statement of the correlation coefficient, precisely as shown on calculator screen
 c) 1 pt: correct SSE, rounded to nearest hundredth of a unit
 d) 2 pts: accurate residual plot, including indication of scales
 e) 4 pts: comprehensive and accurate report of your exploration, with reference to at least two additional models and appropriate use of criteria for judging goodness of fit
Part I: Multiple Choice
For each question, choose the one best response and circle that letter at the appropriate spot on the answer sheet.
1. The scatter plot to the right shows _?_ relationship. Assume that vertical and horizontal axes are identically scaled.
 a. a strong negative
 b. a strong positive
 c. a weak negative
 d. a weak positive
2. The visual representation shown here helps describe the relationship between direct current electrical output from a wind power generator and wind speed. The plot provides information about the _?_ of that relationship.
 a. center, spread, and shape
 b. direction, shape, and location
 c. location, value, and shape
 d. shape, strength, and direction
 e. source, direction, and value
3. Estimate the slope of a spaghetti line that might appropriately fit the data plotted in Question 2. Note the axes scale values.
 a. 4.00
 b. 0.20
 c. 0.75
 d. 1.25
 e. 5.00
4. True or false: A least-squares linear regression line maximizes the sum of the squared residual values.
 a. True
 b. False
The following data is used for problems 5 through 7.
Table 1: Comparison of Per Capita Ice Cream Consumption and Price of Ice Cream (Thirty 4-Week Periods)

Price per Pint, in Dollars,    Ice Cream Consumption
for Ice Cream (x)              in Pints Per Capita (y)
 .270                           .386
 .282                           .374
 .277                           .393
 .280                           .425
 .272                           .406
 .262                           .344
 .275                           .327
 .267                           .288
 .265                           .269
 .277                           .256
 .282                           .286
 .270                           .298
 .272                           .329
 .287                           .318
 .277                           .381
 .287                           .381
 .280                           .470
 .277                           .443
 .277                           .386
 .277                           .342
 .292                           .319
 .287                           .307
 .277                           .284
 .285                           .326
 .282                           .309
 .265                           .359
 .265                           .376
 .265                           .416
 .268                           .437
 .260                           .548

The data above is from a research study. Ice cream consumption was measured over 30 four-week periods. One purpose of the study was to determine whether ice cream consumption depended on the price of ice cream. The researcher uses least-squares linear regression to determine this prediction equation:
y = -2.047x + 0.9230.
5. Which one of the following statements is least correct?
 a. The prediction equation assures there is a strong negative relationship between price and consumption of ice cream.
 b. The prediction equation can be used to predict general patterns in ice cream consumption based on ice cream price.
 c. The prediction equation indicates that if ice cream was available at no cost ($0), the per capita consumption would be just less than 1 pint.
 d. The prediction equation indicates that there is a negative relationship between price and consumption of ice cream.
6. Which one of the following statements about the prediction line y = -2.047x + 0.9230 is most correct?
 a. The linear equation can be used to state the actual consumption of ice cream when we know the price of ice cream.
 b. The linear equation generates predictions for ice cream consumption where the sum of the squared residuals is minimized.
 c. The linear equation predicts that ice cream priced at $0.400 (40 cents) per pint will yield a per capita consumption of about one-fifth of a pint.
 d. The linear equation suggests that when per capita consumption is at 0.300 pints, the price of ice cream will be about $0.403 (40.3 cents) per pint.
7. Which statement best interprets the meaning of the slope of the prediction equation?
 a. For a $1 increase in the price of a pint of ice cream, we can estimate a per capita ice cream consumption increase of approximately $2.05.
 b. For a $1 increase in the price of a pint of ice cream, we can estimate a per capita ice cream consumption decrease of approximately 2.05 pints.
 c. For ice cream priced at $1 per pint, we can estimate that the per capita ice cream consumption will decrease by 0.9230 pints.
 d. For 1 pint of ice cream consumed per capita, we can estimate its price will be approximately $2.05.
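The numerical claims in Questions 5 through 7 can be checked by evaluating the prediction equation directly. The following is a minimal sketch, not part of the original test; it takes the slope as negative (-2.047), which is what the options describing a negative relationship imply, and the function names are hypothetical.

```python
# Prediction equation from the ice cream study, assuming a negative slope:
# y = -2.047x + 0.9230, where x is price per pint (dollars)
# and y is per capita consumption (pints).
SLOPE = -2.047
INTERCEPT = 0.9230


def predict_consumption(price):
    """Predicted per capita consumption (pints) at a given price (dollars)."""
    return SLOPE * price + INTERCEPT


def price_for_consumption(consumption):
    """Price (dollars) at which the line predicts the given consumption."""
    return (consumption - INTERCEPT) / SLOPE


at_zero = predict_consumption(0.0)            # the y-intercept
at_40_cents = predict_consumption(0.400)      # compare with Question 6(c)
price_at_0300 = price_for_consumption(0.300)  # compare with Question 6(d)

print(round(at_zero, 3), round(at_40_cents, 3), round(price_at_0300, 3))
# prints: 0.923 0.104 0.304
```

Under this assumption, a price of $0.400 gives a predicted consumption of about one-tenth of a pint, and a consumption of 0.300 pints corresponds to a price of about $0.304.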
8. True or false: For any two-variable data set, the calculator's cubic regression model will always generate a smaller SSE than will the calculator's linear regression model.
 a. True
 b. False
9. A median-median line is to be generated from a scatter plot of data. Given the scatter plot, what is the first step in creating the median-median line?
 a. Calculate the slope of the median-median line.
 b. Determine the ordered pairs corresponding to the summary points.
 c. Draw an ellipse around the points of the scatter plot.
 d. Partition the data into three groups with, if possible, an equal number of points in each group.
 e. None of the statements (a) through (d) correctly identify the first step.
 f. More than one of the steps in statements (a) through (d) could be completed first.
The following situation is used for problems 10 through 14.
A horticulturist gathered the data shown to the right. We are interested in the relationship that may exist between tree age (x) and tree diameter (y).
It can be shown that the equation for the median-median line that models this data set is y = 0.18148x + 0.967901. Also, the equation of the least-squares linear regression line for this data is y = 0.16065x + 1.465806. The sum of the squared residuals for the least-squares linear regression line is 25.87845.
Tree Age and Diameter
This table lists the ages and diameters of 27 chestnut oak trees planted on a poor site.

Age in Years (x)    Diameter in Inches (y)
  4                  1.0
  5                  1.2
  8                  1.3
  8                  2.3
  8                  3.3
 10                  2.4
 10                  3.8
 12                  5.1
 13                  3.8
 14                  2.7
 16                  4.7
 18                  4.9
 20                  5.8
 22                  6.1
 23                  5.0
 25                  6.8
 28                  6.2
 29                  4.8
 30                  6.2
 30                  7.3
 33                  8.2
 34                  6.8
 35                  7.3
 38                  5.2
 38                  7.3
 41                  7.7
 42                  7.8

This data is adapted from Elements of Forest Mensuration (1936) by Chapman & Demeritt. The diameters were measured 48" off the ground.
10. True or false: The slope of the least-squares regression line is smaller than the slope of the median-median line.
 a. True.
 b. False.
 c. It cannot be determined from the information provided.
11. The median-median line model predicts that a 32-year-old tree from the research location will have a diameter of _?_ inches.
 a. 5.141
 b. 5.807
 c. 6.606
 d. 6.775
 e. None of these are correct.
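The median-median procedure referred to in Questions 9 through 11 can be sketched in code. The following is an illustrative sketch, not part of the original test: it partitions the tree data into three groups, finds the summary points, and builds the line from them. The function name and the group-size rule for data sets not divisible by 3 are assumptions.

```python
from statistics import median

# Tree data from the table: age in years (x), diameter in inches (y).
ages = [4, 5, 8, 8, 8, 10, 10, 12, 13, 14, 16, 18, 20, 22, 23, 25, 28, 29,
        30, 30, 33, 34, 35, 38, 38, 41, 42]
diameters = [1.0, 1.2, 1.3, 2.3, 3.3, 2.4, 3.8, 5.1, 3.8, 2.7, 4.7, 4.9,
             5.8, 6.1, 5.0, 6.8, 6.2, 4.8, 6.2, 7.3, 8.2, 6.8, 7.3, 5.2,
             7.3, 7.7, 7.8]


def median_median_line(xs, ys):
    """Return (slope, intercept, summary_points) for a median-median line."""
    points = sorted(zip(xs, ys))          # order the points by x
    base, extra = divmod(len(points), 3)
    # Assumed convention: keep the two outer groups the same size.
    if extra == 0:
        sizes = (base, base, base)
    elif extra == 1:
        sizes = (base, base + 1, base)
    else:
        sizes = (base + 1, base, base + 1)

    summary, start = [], 0
    for size in sizes:
        group = points[start:start + size]
        # Summary point: (median of the x-values, median of the y-values).
        summary.append((median(p[0] for p in group),
                        median(p[1] for p in group)))
        start += size

    (x1, y1), (x2, y2), (x3, y3) = summary
    slope = (y3 - y1) / (x3 - x1)         # slope through the outer points
    # Averaging the three intercepts shifts the line one-third of the way
    # toward the middle summary point.
    intercept = ((y1 + y2 + y3) - slope * (x1 + x2 + x3)) / 3
    return slope, intercept, summary


slope, intercept, summary_points = median_median_line(ages, diameters)
prediction_at_32 = slope * 32 + intercept

print(summary_points)   # [(8, 2.4), (22, 5.0), (35, 7.3)]
print(round(slope, 5), round(intercept, 6), round(prediction_at_32, 3))
# prints: 0.18148 0.967901 6.775
```

The computed slope and intercept agree with the median-median equation stated in the problem, y = 0.18148x + 0.967901.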
12. Which statement below is the most meaningful interpretation of the slope of the least-squares regression line?
 a. A tree's diameter is estimated to increase by 1.465806 inches for every 1-year increase in the tree's age.
 b. A tree's diameter is estimated to increase by 0.16065 inches for every 1-year increase in the age of the tree.
 c. For every 1-inch increase in a tree's diameter, its age is estimated to have increased by 0.16065 years.
 d. For every 1-inch increase in a tree's diameter, its age is estimated to have increased by 1.465806 years.
13. Which statement below is the most correct interpretation of the y-intercept of the median-median line?
 a. A tree's diameter will be 0.967901 inches when the tree is 0.18148 years old.
 b. A tree's diameter will increase by 0.967901 inches during the tree's first year of life.
 c. The median-median line cuts through the x-axis at 0.967901.
 d. The median-median line y-intercept indicates that a tree has a diameter of 0.967901 inches when the tree is 0 years old.
14. Which statement below is the most meaningful interpretation of the sum of the squared residuals (SSE) for the least-squares regression line?
 a. Because the SSE is greater than 25, this straight-line model is of no use for predicting tree diameter (y).
 b. Because the SSE is so small, the least-squares regression line will be the best model for these data.
 c. No other linear model fit to these data will produce a smaller sum of the squared residuals.
 d. Only the median-median line model will produce a smaller SSE.
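The least-squares values quoted for the tree data, and the SSE comparison behind Question 14, can be reproduced with a short computation. This is a minimal sketch rather than the calculator procedure used in class; the helper names are hypothetical, and the median-median coefficients are taken from the problem statement.

```python
# Tree data from the table: age in years (x), diameter in inches (y).
ages = [4, 5, 8, 8, 8, 10, 10, 12, 13, 14, 16, 18, 20, 22, 23, 25, 28, 29,
        30, 30, 33, 34, 35, 38, 38, 41, 42]
diameters = [1.0, 1.2, 1.3, 2.3, 3.3, 2.4, 3.8, 5.1, 3.8, 2.7, 4.7, 4.9,
             5.8, 6.1, 5.0, 6.8, 6.2, 4.8, 6.2, 7.3, 8.2, 6.8, 7.3, 5.2,
             7.3, 7.7, 7.8]


def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sse(xs, ys, slope, intercept):
    """Sum of squared residuals for the line y = slope * x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


ls_slope, ls_intercept = least_squares(ages, diameters)
ls_sse = sse(ages, diameters, ls_slope, ls_intercept)

# Median-median line as stated in the problem: y = 0.18148x + 0.967901.
mm_sse = sse(ages, diameters, 0.18148, 0.967901)

print(round(ls_slope, 5), round(ls_intercept, 6), round(ls_sse, 5))
# prints: 0.16065 1.465806 25.87845
print(mm_sse > ls_sse)  # prints: True
```

The median-median line has the larger SSE here, as it must: no other straight line can have a smaller sum of squared residuals than the least-squares line.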
15. A stem-and-leaf plot _?_.
 a. can have only one-line stems
 b. cannot be used with values expressed as decimal fractions
 c. does not preserve the values of a data set
 d. is not a visual summary of the data
16. Among the following statistics, which one is most likely being used to support the following statement:
"The first-place score of 178 is clearly an outlier among all scores for this event"?
 a. correlation coefficient
 b. 5-number summary
 c. mean
 d. mode
 e. SSE
17. Which of the following orders correctly represents the measures of central tendency for the distribution shown here?

18. In the box plot, where is the 75th percentile located?
19. Which quartile in the data set exhibits the most spread?
20. What value in the data set is the smallest value inside the lower inner fence?
21. Although not shown in the plot, which of the following values would be considered an outlier in this data set?
Suppose that the ages in years of professional golfers on the PGA Tour form a mound-shaped (normal) distribution with a mean of 34 years and a standard deviation of 6 years.
22. Ages ranging from 34 years to 40 years represent approximately what portion of all the ages in this distribution?
23. Determine an age that will be exceeded by approximately 97.5% of all ages in the distribution.
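The approximations asked for in Questions 22 and 23 can be compared with exact normal-distribution values. The sketch below is not part of the original test; it uses Python's standard-library `statistics.NormalDist` with the stated mean and standard deviation, and the variable names are illustrative.

```python
from statistics import NormalDist

# Golfer ages: mound-shaped (normal), mean 34 years, standard deviation 6.
golfer_ages = NormalDist(mu=34, sigma=6)

# Question 22: portion of ages between the mean and one standard deviation
# above it (34 to 40 years).
share_34_to_40 = golfer_ages.cdf(40) - golfer_ages.cdf(34)

# Question 23: the age that about 97.5% of ages exceed, i.e. the age with
# only 2.5% of the distribution below it.
age_exceeded_by_975 = golfer_ages.inv_cdf(0.025)

# Rule-of-thumb version: two standard deviations below the mean.
rule_of_thumb_age = 34 - 2 * 6

print(round(share_34_to_40, 4))       # prints: 0.3413
print(round(age_exceeded_by_975, 2))  # prints: 22.24
print(rule_of_thumb_age)              # prints: 22
```

The exact values (about 34% and about 22 years) are consistent with the 68-95-99.7 rule of thumb for mound-shaped distributions.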
24. Choose the representation below that captures the same linear relationship.
25. Determine the upper inner fence for this data set: 1,2,4,4,4,7,8,8,9,9,9,9,9,11,15,15,20
 a. 5
 b. 6
 c. 10
 d. 19
 e. 20
 f. None of these.
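The inner fences for the Question 25 data set can be computed as follows. This is a sketch under a stated assumption: quartiles are found as medians of the lower and upper halves with the overall median excluded, which is one common convention (other quartile conventions can give different fences). The function name is hypothetical.

```python
from statistics import median

data = [1, 2, 4, 4, 4, 7, 8, 8, 9, 9, 9, 9, 9, 11, 15, 15, 20]


def quartiles_excluding_median(values):
    """Return (Q1, median, Q3) using halves that exclude the middle value."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    # For an odd count, skip the middle value when forming the upper half.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


q1, med, q3 = quartiles_excluding_median(data)
iqr = q3 - q1
lower_inner_fence = q1 - 1.5 * iqr
upper_inner_fence = q3 + 1.5 * iqr

print(q1, med, q3, iqr)                      # prints: 4.0 9 10.0 6.0
print(lower_inner_fence, upper_inner_fence)  # prints: -5.0 19.0
```

With Q1 = 4 and Q3 = 10, the interquartile range is 6, so the upper inner fence is 10 + 1.5(6) = 19 and the value 20 lies beyond it.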
 Part II: Open Response
Complete each question and write your response in the space provided.
26. Here are weight-loss results for nine people who are part of a low-carb diet program.






a. On the grid provided above, create a scatter plot for these data. Represent weeks on the diet on the horizontal axis (x) and weight loss in pounds on the vertical axis (y).
b. Suppose that we want to create a median-median line for these data. Begin that process by finding the three summary points used to help determine the equation of the line. DO NOT go beyond this step! You must identify these ordered pairs on the scatter plot and also list them here.
(x1,y1) =
(x2,y2) =
(x3,y3) =
c. When we complete the process of determining the median-median line of best fit, the equation is y = 2x + (10/3), where x represents number of weeks in the program and y represents total weight loss in pounds.
i) Interpret the value 2 in the median-median line equation as it relates to these data.
ii) Interpret the value (10/3) as it relates to these data and the median-median line.
iii) Use the median-median line to predict the weight loss for a participant who has been in the program for 11 weeks.
d. Enter these data into your calculator and create the least-squares linear regression equation, where weeks on the diet is the independent variable (x) and weight loss in pounds is the dependent variable (y).
e. For these data, calculate the difference between the SSEs for the median-median line and the least-squares line.
27. The questions here are to be used with the planetary data shown below. Begin by carefully entering the data into your calculator.
This data shows the distance and orbital period of each planet in our solar system. Note that planet Earth is the third entry in the table.
Distance from the Sun            Orbital Period
in Astronomical Units (x)        in Years (y)
 0.386                            0.241
 0.720                            0.615
 1.00                             1.00
 1.52                             1.88
 5.19                             11.9
 9.53                             29.4
 19.2                             83.8
 30.0                             164
 39.5                             248

a. Use your calculator to generate the least-squares linear regression line for these data. Write the equation in the form y = mx + b, where x is the distance from the sun and y is the orbital period.
b. State the correlation coefficient for the least-squares linear regression line.
c. Calculate the SSE for the least-squares linear regression line.
d. Calculate and plot the residuals for the least-squares linear regression line. Sketch the residual plot here. On the graph, include indication of the scale you are using.

e. Explore at least two other models to better represent these data as compared to the least-squares linear regression line. Report the results of your exploration with respect to the criteria we have identified for judging goodness of fit.
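One way to carry out the exploration in part (e) without a calculator is sketched below. It is an illustration only: it compares the linear model with a single alternative, a power model y = a·x^p, and it fits the power model by running least squares on (ln x, ln y), which is an assumption about the fitting method. The helper names are hypothetical, and other candidate models (quadratic, exponential) would be handled the same way.

```python
import math

# Planetary data from the table: distance (AU) and orbital period (years).
distance = [0.386, 0.720, 1.00, 1.52, 5.19, 9.53, 19.2, 30.0, 39.5]
period = [0.241, 0.615, 1.00, 1.88, 11.9, 29.4, 83.8, 164, 248]


def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sse(actual, predicted):
    """Sum of squared residuals between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


# Model 1: linear, y = mx + b.
m, b = least_squares(distance, period)
linear_pred = [m * x + b for x in distance]
linear_residuals = [y - p for y, p in zip(period, linear_pred)]
linear_sse = sse(period, linear_pred)

# Model 2: power, y = a * x**p, fit as a line through (ln x, ln y).
p, ln_a = least_squares([math.log(x) for x in distance],
                        [math.log(y) for y in period])
a = math.exp(ln_a)
power_sse = sse(period, [a * x ** p for x in distance])

print(f"linear: y = {m:.4f}x + {b:.4f}, SSE = {linear_sse:.2f}")
print(f"power:  y = {a:.4f} * x^{p:.4f}, SSE = {power_sse:.4f}")
print([round(r, 2) for r in linear_residuals])  # look for a curved pattern
```

Both SSE values are computed in the original units (years), so they can be compared directly. A report for part (e) would combine that comparison with the residual plot for each model, since a clear pattern in the residuals suggests the model misses the shape of the relationship.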