Illinois State University Mathematics Department

MAT 312: Probability and Statistics for Middle School Teachers

Dr. Roger Day

Semester Exam
Possible Solutions

  • Part I: 30 Multiple Choice Questions (30 points)
  • Part II: 14 Open-Response Questions (30 points)
  • Bonus: 5 points possible
  • Total: 60 points + 5 Bonus Points
  • Impact on Course Grade: 30% of your Semester Grade

Criteria Used to Evaluate Part II Responses

Your responses to these questions will be evaluated for correct and accurate numerical solutions, appropriate and adequate explanations where required or indicated, and overall clarity of your response.

Part I: Multiple Choice

For each question, choose the most correct response and circle that letter at the appropriate spot on the answer sheet.


The manufacturer of a new type of light bulb wants to show that the new bulbs outlast those of a major competitor. The manufacturer tested 30 bulbs and recorded the life span of each. Here are the data. Use these data for questions 1 through 3.

The data are represented in a _?_.

a. box-and-whisker plot
b. line plot
c. scatter plot
d. stem-and-leaf plot
e. vertical plot


What portion of the bulbs tested lasted at least 500 hours?

a. 1/10
b. 1/3
c. 1/2
d. 2/3


Determine the 25th percentile of this data set.

a. 420 hours
b. 480 hours
c. 490 hours
d. 630 hours


The plot here shows the distribution of heights of residents in a Rockford nursing home. The height representing the upper extreme lies in which measurement class?

a. 50-55 inches
b. 55-60 inches
c. 60-65 inches
d. 65-70 inches
e. None of these measurement classes contain the upper extreme.


The following visual representation shows test scores of 48 students in a science course. How many students scored 50 or less on the test?

a. less than 12
b. no more than 12
c. at least 12
d. It cannot be determined from the plot.


In a distribution that is positively skewed, which statement is most likely to be true?

a. The mean and median will be equal.
b. The mean will be greater than the median.
c. The mean will be less than the median.
d. The median will equal 0.


When least-squares linear regression is applied to the data plotted here, what correlation coefficient is most likely? Assume that the axes scales are equal.

a. -1.0
b. -0.68
c. 0.06
d. 0.73
e. 1.0


The time that it takes to drive from the Interstate Center in Bloomington to the Peoria Civic Center on a Saturday is normally distributed with a mean of 54 minutes and a standard deviation of 7 minutes.

Driving times of no more than 54 minutes represent approximately what portion of all the driving times for this situation?

a. 2.5%
b. 13.5%
c. 34%
d. 47.5%
e. 50%
f. 68%


Which visual representation preserves the values in a data set?

a. box-and-whiskers plot
b. 5-number summary
c. line plot
d. stem-and-leaf plot
e. More than one of the visual representations listed here preserve the values in a data set.
f. None of the visual representations listed here preserve the values in a data set.

For each question 10 through 12, select one of the following data types to best describe each variable.

a. interval data
b. ordinal data
c. ratio data
d. nominal data


the colors of the background pages in a family scrapbook

d. nominal data


the size of each family scrapbook, categorized as small, medium, large, or superscrap

b. ordinal data


the size of each photo in a family scrapbook, expressed in square inches

c. ratio data


Suppose that the distribution of the life spans of a certain dog breed has a symmetric, mound-shaped (normal) distribution with a mean of 9.5 years and a standard deviation of 3.5 years. Within what bounds do we expect approximately 2.5% of the life spans of such dogs to fall?

a. 0 to 2.5 years
b. 2.5 to 6 years
c. 6 to 9.5 years
d. 9.5 to 13 years
e. 13 to 16.5 years
f. greater than or equal to 16.5 years

Here are the theoretical probabilities for an experiment whose sample space is {0,10,20,30,40,50}. Use this for questions 14 through 16.



Determine P(x is less than or equal to 30).

a. 0
b. 1/8
c. 1/4
d. 1/2
e. 1


Determine the expected value of this experiment.

a. 0.16
b. 1.0
c. 25
d. 41.55
e. 41.75
f. None of these values are correct.


Suppose the experiment was carried out 1000 times and a histogram of the results was created. The histogram would most likely appear _?_.

a. bimodal
b. negatively skewed
c. positively skewed
d. symmetric
e. uniform


For questions 17 through 19, use the following data set: 1,1,2,5,5,6,8

Determine the highspread of the data.

a. 1
b. 3
c. 4
d. 5
e. 8


Determine the lower outer fence of the data.
a. -21
b. -14
c. 14
d. 15
e. 21


Determine the sum of the squared deviations from the mean for these data.
a. 0
b. 6.63
c. 16
d. 44
e. 50


Choose the one best response to the following statement:
For a two-variable data set, the calculator's median-median line model will generate a larger SSE than will the calculator's least-squares linear regression model.
a. Always True
b. Never True
c. Sometimes True


Which of the following is not a necessary step in creating a median-median line from a two-variable data set?
a. Calculate the slope of the median-median line.
b. Determine the ordered pairs corresponding to the summary points.
c. Draw an ellipse around the points of a scatter plot of the data.
d. Partition the data into three groups with, if possible, an equal number of points in each group.
e. None of the statements (a) through (d) correctly identify necessary steps.
f. All of the statements (a) through (d) identify necessary steps.


Which of the following correctly complete the statement?
A box-and-whiskers plot _?_.
I. does not preserve the values of a data set
II. is a visual summary of a data set
III. shows the median value in a data set
a. choice I only
b. choice II only
c. choice III only
d. choices I and II only
e. choices I, II, and III


Among the following statistics, which one is most likely being used to support the following statement:
"There is strong indication that a student's writing test score is closely associated with that student's mathematics test score."?
a. correlation coefficient
b. 5-number summary
c. mean
d. median
e. mode


For questions 24 and 25, consider the following probability distribution for some experiment.

Sample space: {1,3,5,7,9}

P(1) = 0.15, P(3) = 0.30, P(5) = 0.10, P(7) = 0.40, P(9) = 0.05

Choose the one best response to the following statement:

This experiment illustrates a symmetric distribution.

a. False.
b. True.
c. It is impossible to determine whether this is a symmetric distribution.


Let F be the event "5 does not occur." What is the probability of the complement of F?

a. 0.10
b. 0.25
c. 0.50
d. 0.70
e. 0.90
f. 0.95


Which statement below is most correct regarding the following distributions?

Distribution I
Distribution II
Distribution III
a. Distribution I is the only valid probability distribution.
b. Distribution II is the only valid probability distribution.
c. Distribution III is the only valid probability distribution.
d. None of these are valid probability distributions.
e. More than one of these are valid probability distributions.


Which of the following is the least justifiable criterion for positioning a spaghetti line on a scatter plot to represent the relationship between two variables?

a. Position the line so that all the points are above the line.
b. Position the line so that it passes through as many data points as possible.
c. Position the line to account for the real-world context.
d. Position the line to keep the points as close to the line as possible.


A data set of 125 ordered pairs relates age of a car, in years (a), to its resale value, in dollars (R). For example, (3, 9250) represents a 3-year-old car with a resale value of $9250. Suppose that a median-median line is calculated for these data and is represented by the equation R = -1256a + 11952. In this equation, the number -1256 represents _?_.

a. a decrease of $1256 in the value of a car over a one-year period
b. an increase of $1256 in the value of a car over a one-year period
c. a slope of 1256
d. the value of a car after 3 years
e. the vertical-axis intercept of the median-median line


Choose the one best response to the following statement:

A fair die is rolled and the number on the face-up side is recorded, so the sample space is {1,2,3,4,5,6}. Each outcome in the sample space is equally likely.

a. False.
b. True.
c. It is impossible to determine from the information provided.


Choose the one best response to the following statement:

If two events are complementary, then they are mutually exclusive. 

a. Always True
b. Never True
c. Sometimes True

Part II: Open Response

Complete each question and write your response in the space provided.


On the wall at a local pizzeria is a rectangular dart board, similar to the one shown below. The board is composed of four rectangles centered upon each other. Here are the dimensions of the rectangles (Note: Figure not drawn to scale.):

  • Rectangle #1: 20 inches by 12 inches
  • Rectangle #2: 14 inches by 10 inches
  • Rectangle #3: 10 inches by 6 inches
  • Rectangle #4: 6 inches by 3 inches

For $5 a customer can try to win a pizza by throwing a dart at the board.

Any dart landing in Rectangle #4 earns a family-size pizza ($18 value). If a dart sticks in Rectangle #3 (and not within #4) the customer gets a large pizza ($12 value). For a dart sticking in Rectangle #2 (and not within #3 nor within #4) the customer gets a medium pizza ($8 value). A dart on any other portion of the board (in Rectangle #1 and not within Rectangle #2 nor #3 nor #4) wins no prize.

31. Suppose a dart hits the board at some random point. What is the probability of winning a medium pizza?

Some Useful Information: Areas of Rectangles

  • Rectangle #1: 240 square inches
  • Rectangle #2: 140 square inches
  • Rectangle #3: 60 square inches
  • Rectangle #4: 18 square inches

To win a medium pizza, the dart must land in Rectangle #2 but not in Rectangle #3. This region has an area of 80 square inches (140 - 60). We compare this to the area of the entire board, which is the area of Rectangle #1, 240 square inches.

This gives a probability of winning a medium pizza of 80/240, or 1/3.

32. Let w represent a random variable that represents the prize values possible on a toss of a dart, assuming that darts always hit the board at some random location. Create a table to show the probability distribution for the random variable w.

To generate the desired probabilities, we proceed as illustrated in question 31 above. 
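  • w = $0 (no prize): 100/240 = 5/12
  • w = $8 (medium pizza): 80/240 = 1/3
  • w = $12 (large pizza): 42/240 = 7/40
  • w = $18 (family-size pizza): 18/240 = 3/40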

33. Calculate the expected value of w. Explain what this represents for a pizzeria customer.

E(w) = $0*(5/12) + $8*(1/3) + $12*(7/40) + $18*(3/40) = $6.12 (rounded)

This indicates that a $5 investment yields a $6.12 outcome, or a "profit" of $1.12. Note the assumption, however, that every thrown dart hits the board in a random location. You might be a better thrower, or, alternatively, you may not!
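
As a check on questions 31 through 33, the following Python sketch recomputes the prize-region areas, the distribution of w, and E(w) using exact fractions. It assumes, as the solution does, that a dart is equally likely to land anywhere on the board; the variable names are illustrative only.

```python
from fractions import Fraction

# Areas (square inches) of the four nested rectangles, from the given dimensions.
areas = {1: 20 * 12, 2: 14 * 10, 3: 10 * 6, 4: 6 * 3}

# Area of each prize region: a rectangle minus the rectangle nested inside it.
# Keys are the prize values in dollars.
region_area = {
    0: areas[1] - areas[2],   # no prize
    8: areas[2] - areas[3],   # medium pizza
    12: areas[3] - areas[4],  # large pizza
    18: areas[4],             # family-size pizza
}

# Assumption: every point on the board is equally likely to be hit.
distribution = {prize: Fraction(a, areas[1]) for prize, a in region_area.items()}

# Expected value: sum of (prize value) * (probability of that prize).
expected_value = sum(prize * p for prize, p in distribution.items())

print(distribution[8])        # probability of a medium pizza
print(float(expected_value))  # expected prize value in dollars
```

The exact expected value is 367/60 dollars, which rounds to the $6.12 reported above.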

7 points total


34. The prom supervisor at a local high school asked for volunteers for next year's prom committee. There were seven 10th-grade and five 9th-grade volunteers. The advisor only needed three from each of the two classes. How many 6-person prom committees could be formed under these conditions? Explain your response.

C(7,3)*C(5,3) = 350 different committees 

Arrangement within the committee is not a factor, so choose 3 of the 7 10th graders and 3 of the 5 9th graders. 
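
A short Python check of this count, using the standard library's combination function. The variable name is illustrative.

```python
from math import comb

# Choose 3 of the 7 tenth graders and, independently, 3 of the 5 ninth graders.
# Order within the committee does not matter, so combinations are used.
committees = comb(7, 3) * comb(5, 3)

print(committees)
```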

35. The design of a quilt for a newborn baby is shown here, composed of three parts: an inset, a body, and a border. Suppose each of the three parts of the quilt is to be a solid color, with each part a different color. If there are only seven colors to choose from for the quilt, how many different quilts could be made? Explain your response.

P(7,3) = 210 different quilts can be made.

Here, arrangement matters, so we select 1 of 7 colors for the border, 1 of the remaining 6 for the body, and 1 of the remaining 5 for the inset. 
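
The same count can be confirmed in Python two ways: with the built-in permutation function and with the position-by-position product described above. This is only a sketch, and the variable names are illustrative.

```python
from math import perm

# Ordered selection of 3 distinct colors from 7 (border, body, inset).
quilts = perm(7, 3)

# Position-by-position count: 7 choices, then 6, then 5.
by_positions = 7 * 6 * 5

print(quilts, by_positions)
```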

36. Junior Samples was left in charge of the hat-check room at a recent square dance. Against one wall in the hat-check room was a rectangular array of numbered cubicles into which hats could be placed. A big sign hung over the cubicles. Here's what it said:

Hat-Check Policy:
No more than three hats in each cubicle!

During the night, Junior kept a tally sheet showing the time on the clock and the number of hats that were in check at that time. A portion of the tally sheet is shown below.

After the dance, Wally Woundhouse, Junior's supervisor, sauntered over to the hat-check room and surveyed Junior's tally sheet. He looked at the 9:05 entry and said, "Junior, if your figures are accurate, at precisely 9:05 I know that you violated the hat-check policy! You may have violated it some other time not shown here, but I can assure you that you did at 9:05!"

Based on Wally's statement, how many cubicles were there in the hat-check room? Explain how you know.

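Because the only violation Wally could be certain of was at 9:05, it must be the case, by the pigeonhole principle, that 150 hats was the maximum allowed: with n cubicles holding at most 3 hats each, a violation is guaranteed only when more than 3n hats are in check. Therefore 3n = 150, and there are 50 cubicles in the hat-check area.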

7 points total


Use the following sets for questions 37 through 40:

Set I: {a,c,g,h,i,l,o,r,y}
Set II: {e,i,n,o,q,s,t,u}
Set III: {a,d,e,m,o,w}
Set IV: {a,e,i,o,u,y}

37. A letter is to be chosen from Set I or from Set II. How many choices are there? Explain.

9 + 8 - 2 = 15 choices

By the addition principle, we can choose one of 9 from the first set or one of 8 from the second set. However, there are two elements common to the two sets, so we must subtract these.
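
The inclusion-exclusion count can be checked directly with Python sets. This is a minimal sketch; the variable names are illustrative.

```python
# Sets I and II as given in the problem.
set_i = set("acghilory")
set_ii = set("einoqstu")

# Letters that appear in both sets would otherwise be counted twice.
common = set_i & set_ii

# Addition principle with the overlap subtracted.
choices = len(set_i) + len(set_ii) - len(common)

print(sorted(common), choices)
```

The result agrees with the size of the union, len(set_i | set_ii).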

38. Two-letter "words" (meaningful or otherwise) are to be created using a letter from Set I as the first letter of the word and a letter from Set III as the second letter of the word. How many different two-letter words can be created in this manner? Explain your response.

9*6 = 54 words

Using the multiplication principle, there are 9 ways to fill the first position and 6 ways to fill the second position.
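
As a sketch, the words can also be enumerated in Python. Sets I and III share the letters a and o, but because position matters, no two (first letter, second letter) pairs produce the same word. The variable names are illustrative.

```python
from itertools import product

set_i = "acghilory"   # candidates for the first letter
set_iii = "ademow"    # candidates for the second letter

# Build every two-letter word; a set is used to confirm that all are distinct.
words = {first + second for first, second in product(set_i, set_iii)}

print(len(words))
```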

39. Three-letter sets are to be created using the letters from Set I with no repetition allowed. For example, {a,c,g} can be created, but not {a,a,a} nor {a,a,g}. Note, also, that the set {a,c,g} is equivalent to the set {g,c,a}. How many unique three-letter sets can be created? Explain your response.

C(9,3) = 84 sets 

With no repetition allowed and order being non-significant, we grab 3 of the 9 letters from Set I. 
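
A brief enumeration in Python gives the same count. Each selection is stored as a frozenset so that order is ignored; the names used here are illustrative.

```python
from itertools import combinations

set_i = "acghilory"

# combinations() never repeats a letter and yields each unordered choice once.
three_letter_sets = {frozenset(c) for c in combinations(set_i, 3)}

print(len(three_letter_sets))
```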

40. How many unique new sets can be created from the letters in Set IV, under the following conditions? Explain your response.

(i) A new set can contain as few as one letter to as many as six letters.
(ii) A new set can contain one or more letters from Set IV without repetition.
(iii) The arrangement of the letters in a new set is not significant. For instance, the new set {e,i} is no different than the new set {i,e}.

2^6 - 1 = 63 new sets

Except for the new set with no letters in it, we want all subsets that can be created.
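
One way to confirm the count is to list every non-empty subset of Set IV by size. The following Python sketch does this; the variable names are illustrative.

```python
from itertools import combinations

set_iv = "aeiouy"

# All subsets of size 1 through 6; the empty set (size 0) is excluded.
new_sets = [
    subset
    for size in range(1, len(set_iv) + 1)
    for subset in combinations(set_iv, size)
]

print(len(new_sets), 2 ** len(set_iv) - 1)
```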

8 points total


41. Use your calculator to generate the least-squares linear regression line for the data shown here. Write the equation in the form y = mx + b, where x is the position of the note and y is the frequency.

y = 36.461538461539x + 381.53846153846 

42. State the correlation coefficient for the least-squares linear regression line that models these data.

r = 0.9953672629 

43. Calculate the SSE for the least-squares linear regression line that models these data.

SSE = 2257.538462
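
The data table itself does not appear in this copy, so the following Python sketch rests on an assumption: that the table listed the 13 chromatic notes from A = 440 Hz up to the A one octave higher (positions 1 through 13), with equal-tempered frequencies rounded to the nearest hertz. Under that assumption, the usual least-squares formulas appear to reproduce the slope, intercept, r, and SSE reported in questions 41 through 43. The variable names are illustrative.

```python
from math import sqrt

# ASSUMPTION (not stated in the source): positions 1..13 and equal-tempered
# frequencies from A = 440 Hz, each rounded to the nearest whole hertz.
positions = list(range(1, 14))
frequencies = [round(440 * 2 ** (k / 12)) for k in range(13)]

n = len(positions)
mean_x = sum(positions) / n
mean_y = sum(frequencies) / n

# Sums of squares and cross-products about the means.
s_xx = sum((x - mean_x) ** 2 for x in positions)
s_yy = sum((y - mean_y) ** 2 for y in frequencies)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(positions, frequencies))

# Least-squares line y = slope * x + intercept.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Correlation coefficient and sum of squared errors (residuals).
r = s_xy / sqrt(s_xx * s_yy)
sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(positions, frequencies))

print(slope, intercept, r, sse)
```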

Note Frequencies

Here are the frequencies of musical notes of the scale spanning one octave starting at A = 440 Hz.

Table columns: Position of the Note, Frequency (Hz)

44. Explore at least two non-linear models to better represent these data as compared to the least-squares linear regression line. Report the results of your exploration with respect to the criteria we have identified for judging goodness of fit.

Here is goodness-of-fit information for most of the potential models available using your calculator's regression capabilities. Your report should include identification of at least two models beyond the linear and discuss the various components of each model that comprise goodness of fit.

Name/Type of Model: notes on correlation coefficient (r) and pattern

  • least-squares linear regression: pattern (quadratic?)
  • median-median line: r not calculated; pattern (quadratic?)
  • quadratic regression: pattern (cubic?)
  • cubic regression
  • quartic regression
  • exponential regression
  • logarithmic regression
  • power regression
  • logistic regression: not calculated
  • sinusoidal regression: not calculated
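
To illustrate how one non-linear candidate might be explored, here is a Python sketch of an exponential model y = a * b^x. It rests on two assumptions that are not confirmed by the source: that the data are the 13 equal-tempered chromatic frequencies from A = 440 Hz rounded to the nearest hertz (positions 1 through 13), and that the exponential model is fitted by least squares on (x, ln y), a common linearization. The variable names are illustrative.

```python
from math import exp, log, sqrt

# ASSUMPTION (not stated in the source): positions 1..13 and equal-tempered
# frequencies from A = 440 Hz, each rounded to the nearest whole hertz.
positions = list(range(1, 14))
frequencies = [round(440 * 2 ** (k / 12)) for k in range(13)]

# ASSUMPTION: fit y = a * b**x by a least-squares line through (x, ln y).
log_freqs = [log(y) for y in frequencies]

n = len(positions)
mean_x = sum(positions) / n
mean_l = sum(log_freqs) / n

s_xx = sum((x - mean_x) ** 2 for x in positions)
s_ll = sum((l - mean_l) ** 2 for l in log_freqs)
s_xl = sum((x - mean_x) * (l - mean_l) for x, l in zip(positions, log_freqs))

ln_b = s_xl / s_xx
ln_a = mean_l - ln_b * mean_x
a, b = exp(ln_a), exp(ln_b)

# Correlation of the linearized data, and SSE measured in the original units (Hz).
r_log = s_xl / sqrt(s_xx * s_ll)
sse_exp = sum((y - a * b ** x) ** 2 for x, y in zip(positions, frequencies))

print(a, b, r_log, sse_exp)
```

Under these assumptions the fitted growth factor b should come out close to the twelfth root of 2, and the SSE should be far smaller than the linear model's 2257.538462, which is the kind of comparison the question asks for.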


8 points total


Juanita is a political satirist. She claims to know enough jokes today so that she could tell a different set of three jokes in her warm-up act, every night of the year, for at least 40 years. What is the minimum number of jokes she must know?

NOTE: The set of jokes {A,B,C} is considered one set of jokes, no matter what order Juanita tells the three jokes.

There are 365 days in a non-leap year. Over 40 years, this amounts to 14600 days. During this time, there can be, at most, 11 leap years. So, all told, we must account for 14611 days.

Juanita, therefore, must have a pool of jokes from which she can grab 3 at a time and never repeat the same set of 3 during 14611 days of joke telling.

This reduces to determining a value for j such that C(j,3) is greater than or equal to 14611. By guess and test, we get that

C(45,3) = 14190

and that

C(46,3) = 15180.

This indicates that 46 jokes is the minimum number in Juanita's collection.
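
The guess-and-test search can be sketched in Python as a simple loop. The day count follows the solution above (40 years of 365 days plus an allowance of 11 leap days); the variable names are illustrative.

```python
from math import comb

# Number of nights to cover, as counted in the solution above.
nights = 40 * 365 + 11

# Increase the size of the joke pool until there are enough distinct 3-joke sets.
jokes = 3
while comb(jokes, 3) < nights:
    jokes += 1

print(jokes, comb(jokes - 1, 3), comb(jokes, 3))
```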