A.

A student group sells donuts at the mall. On recent
Saturdays, they've been recording the number of donuts
sold, along with the selling price. Here's the data:
A.1. On the grid provided, create a scatter
plot for these data. Represent sale price in
cents on the horizontal axis (x) and number of
donuts sold on the vertical axis (y). Clearly
indicate how you have scaled each axis.


A.2. Explain whether either the table of
values or your scatter plot reveals
a relationship between donut sale price and
the number of donuts sold.
Both the
table of values and the scatter plot reveal a
strong negative relationship between donut sale
price and the number of donuts sold. As the
price increases, the number sold decreases. The
scatter plot shows an apparent linear
relationship.

A.3. Least-squares linear regression is
applied to the donut sales data set, resulting
in the equation
y = -36.583x + 2121.10,
where x represents donut sale price in
cents and y represents the number of donuts
sold. A correlation coefficient of
r = -0.9995 is computed with
this least-squares linear regression
equation.
A.3.a. What is the slope of the
regression equation? Describe its
meaning in the context of this data
set. Be specific.
The
slope is -36.583. This represents the
rate of change of the number of donuts
sold with respect to the price of donuts.
For every 1
cent increase in price (i.e., x
increases by 1), the number of donuts
sold decreases by 36 or 37 donuts
(i.e., y decreases by
36.583).

A.3.b. State two reasons you might
question or doubt the meaningfulness of
the y-intercept of the
regression equation. Be specific.
The
y-intercept is 2121.1. As an ordered
pair in the context of the problem,
this represents that at a price of 0
cents (apparently donuts are given
away), the number of donuts sold (given
away) is about 2121.
The
meaningfulness of the y-intercept
could be doubted or questioned for
several reasons. One primary reason
is that this ordered pair represents
an extrapolation beyond the existing
data. That is, the apparent linear
pattern exhibited by the data is
assumed to continue outside the bounds
of the data set, all the way to a price
of 0 cents for the donuts. This linear
pattern may not extend in this
manner.
Also,
giving donuts away could be questioned,
for at least two reasons. Is it
reasonable to suggest that the group
would give donuts away, thereby
apparently eliminating any income?
Also, in connection with
extrapolation, if indeed the group did
give away donuts, there may be far more
or far fewer donuts than 2121 given
away.

A.3.c. Use the linear regression
equation to predict the number of
donuts sold when the sale price is 60
cents.
When
x=60 is substituted into the linear
regression equation, a value of -73.88
is returned. This says that at a price
of 60 cents, a negative number of
donuts is sold.
While
the arithmetic of the calculation is
accurate, it is impossible
to sell a negative number of donuts.
Therefore, the prediction would be that
no donuts would be sold at the price of
60 cents. Again, this assumes that the
linear relationship inherent in the
regression equation continues outside
the bounds of the data
set.
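The substitution in A.3.c can be checked with a short Python sketch. The function names and the decision to read a negative prediction as "no donuts sold" are illustrative assumptions, following the reasoning above rather than anything in the regression itself.

```python
# Regression coefficients as given in A.3
# (x = price in cents, y = number of donuts sold).
SLOPE = -36.583
INTERCEPT = 2121.10


def predicted_donuts(price_cents: float) -> float:
    """Raw value of the regression line at the given price."""
    return SLOPE * price_cents + INTERCEPT


def realistic_prediction(price_cents: float) -> float:
    """Assumption: a negative prediction is read as 'no donuts sold'."""
    return max(0.0, predicted_donuts(price_cents))


raw = predicted_donuts(60)
print(round(raw, 2))             # -73.88
print(realistic_prediction(60))  # 0.0
```

Calling predicted_donuts(0) returns the y-intercept discussed in A.3.b, which is also an extrapolation beyond the data.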



2, 2, 6: 10 points
total

B.

On the wall at a local pizzeria is a square dart
board, each side 10" long. For $1 a customer can try to
win a pizza by throwing a dart at the board.
The board contains three smaller squares whose centers
are at the center of the board. Any dart landing in the
innermost square, a square with side length 1", earns a
large pizza ($10 value). If a dart sticks in the first
layer outside the innermost square, part of a square of
side length 3", the customer gets a medium pizza
($5 value). For a dart sticking in the next layer,
part of a square with side length 5", the customer gets a
small pizza ($2 value). A dart on any other
portion of the board wins no prize.
B.4. Suppose a dart hits the board at some random
point. What is the probability of winning a medium-size
pizza?
The area associated
with winning a medium-size pizza is the 3-inch square
less the 1-inch square within it. This represents an area
of 8 square inches. The entire board has an area of 100
square inches. Therefore, the desired probability is
8/100 or 2/25.
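The area argument can be sketched in Python with exact fractions. The helper name ring_area is an illustrative assumption. The sketch takes the probability to be proportional to area, as the solution does.

```python
from fractions import Fraction

BOARD_SIDE = 10  # inches, from the problem statement


def ring_area(outer_side: int, inner_side: int) -> int:
    """Area between two centered squares, in square inches."""
    return outer_side**2 - inner_side**2


medium_area = ring_area(3, 1)
p_medium = Fraction(medium_area, BOARD_SIDE**2)
print(medium_area, p_medium)  # 8 2/25
```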
B.5. If customers played this game many, many times,
and we assume that darts always hit the board at some
random location, what is the expected gain or loss per
play, from a customer's standpoint?
Here are the net
gains (losses) associated with the four possible outcomes
of a dart randomly hitting the board, together with the
probabilities of each.
net gain (g), in dollars:   9       4       1       -1
probability p(g):           1/100   8/100   16/100  75/100

The expected gain
(loss) is the sum of the products of each net gain with
its probability.
Expected
gain
= 9(1/100) +
4(8/100) + 1(16/100) + (-1)(75/100)
= -18/100 of a dollar = -18
cents
This says that in
the long run, over the course of many plays, a player can
expect to lose 18 cents per play.
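As a check on the arithmetic, here is a minimal Python sketch of the expected-value calculation. The list layout and variable names are my own assumptions. The prize values, square sizes, and $1 cost come from the problem.

```python
from fractions import Fraction

COST_TO_PLAY = 1   # dollars
BOARD_AREA = 10**2  # square inches

# (prize value in dollars, outer square side, inner square side), in inches
regions = [(10, 1, 0), (5, 3, 1), (2, 5, 3), (0, 10, 5)]

# Pair each net gain with the probability of landing in its region.
outcomes = [
    (prize - COST_TO_PLAY, Fraction(outer**2 - inner**2, BOARD_AREA))
    for prize, outer, inner in regions
]

# The four probabilities must account for the whole board.
assert sum(p for _, p in outcomes) == 1

expected_gain = sum(gain * p for gain, p in outcomes)
print(float(expected_gain))  # -0.18
```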
B.6. Assume again that darts always hit the board at
some random location. What is the net gain the pizzeria
can expect if 1000 rounds of this game are played some
weekend? Take into account only the cost to play and the
value of the prizes.
If each player
loses 18 cents per play, the pizzeria must gain 18 cents
per play. Thus, over 1000 plays, the pizzeria can expect
to have a net gain of $180.
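The $180 figure is a long-run expectation, so any one weekend of 1000 rounds will land somewhere around it. The following Python sketch simulates many such weekends under the problem's assumption that darts hit the board at uniformly random points. The seed, the number of simulated weekends, and the function names are arbitrary choices of mine.

```python
import random


def prize_value(x: float, y: float) -> int:
    """Prize in dollars for a dart at (x, y), measured from the board's center."""
    # Half-side of the smallest centered square that contains the point.
    d = max(abs(x), abs(y))
    if d <= 0.5:
        return 10  # innermost 1-inch square: large pizza
    if d <= 1.5:
        return 5   # rest of the 3-inch square: medium pizza
    if d <= 2.5:
        return 2   # rest of the 5-inch square: small pizza
    return 0


def pizzeria_gain(rounds: int, rng: random.Random) -> int:
    """Net gain in dollars: $1 collected per play minus the prize value paid out."""
    total = 0
    for _ in range(rounds):
        x = rng.uniform(-5, 5)
        y = rng.uniform(-5, 5)
        total += 1 - prize_value(x, y)
    return total


rng = random.Random(2024)  # arbitrary seed so the run is repeatable
weekends = [pizzeria_gain(1000, rng) for _ in range(200)]
average = sum(weekends) / len(weekends)
print(min(weekends), average, max(weekends))
```

The average over the simulated weekends should sit near 180, while the minimum and maximum show how much a single weekend can vary.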

3,3,4: 10 points
total

C.

C.7. In the far-off world of Balbion III in the
Mostarth Galaxy, each year every family in the city of
Krameth is given a pet. There are three species of pets
randomly distributed to the Kramethian families. A family
receives a Quark with probability 0.2, a Rorst with
probability 0.3, and a Swimp with
probability 0.5.
Determine the probability that in a three-year
sequence, a family gets:
We will use Q to
mean a family gets a Quark, R to mean a family gets a
Rorst, and S to mean a family gets a Swimp. Then P(Q) =
0.2, P(R) = 0.3, and P(S) = 0.5.
You may find a tree
diagram helpful for this problem, or some sort of
organized list that shows all possible 3-pet arrangements
a family could have. Note that arrangement is important
here, for QRS differs from SRQ in the order the pets were
received by a family over a three-year
period.
C.7.a. two Rorsts and a Quark
There are three
ways to get two Rorsts and a Quark: RRQ, RQR, and QRR.
Each of these has probability 0.018, determined by the
product (0.3)(0.3)(0.2). The desired probability is then
the sum of the three individual probabilities, or
0.054.
C.7.b. three pets all of the same species
The three
situations are QQQ, RRR, and SSS. The probabilities
associated with these are (0.2)^3, (0.3)^3, and (0.5)^3.
The sum of these values, 0.008 + 0.027 + 0.125, is the
probability we seek:
0.16.
C.7.c. at least two Swimps
Here are the ways
for a family to get at least two Swimps, together with
the probability of each:
arrangement:  SSQ    SQS    QSS    SSR    SRS    RSS    SSS
probability:  0.05   0.05   0.05   0.075  0.075  0.075  0.125

The sum of these
probabilities represents the probability that a family
gets at least two Swimps. That value is
0.50.
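The organized list suggested for C.7 can also be generated by machine. This Python sketch enumerates all 27 ordered three-year sequences and adds up the probabilities of those matching each event. The helper name probability and the lambda conditions are my own. It assumes, as the solution does, that each year's pet is independent of the others.

```python
from itertools import product
from math import prod

# Probabilities from the problem statement.
P = {"Q": 0.2, "R": 0.3, "S": 0.5}


def probability(event) -> float:
    """Sum the probabilities of all 3-pet sequences for which event(seq) is true."""
    return sum(
        prod(P[pet] for pet in seq)
        for seq in product("QRS", repeat=3)
        if event(seq)
    )


p_a = probability(lambda s: s.count("R") == 2 and s.count("Q") == 1)
p_b = probability(lambda s: len(set(s)) == 1)
p_c = probability(lambda s: s.count("S") >= 2)
print(round(p_a, 3), round(p_b, 3), round(p_c, 3))  # 0.054 0.16 0.5
```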

3,3,4: 10 points
total

D.

Suppose we know that the distribution of the waiting
times (in minutes) for drivers boarding a ferry boat on
Lake Erie is mound-shaped and symmetrical; that is, the
waiting times are normally distributed. The mean waiting
time is 16 minutes and the standard deviation is 4
minutes.
D.8. What portion of all drivers will wait 16 minutes
or less?
Because 16 is the
mean of this normal distribution, half the waiting times
will be greater than 16 minutes and half will be less
than 16 minutes. Therefore half the drivers will wait 16
minutes or less.
D.9. What is the probability that a driver will wait 20
minutes or more?
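The waiting times
up to 16 minutes represent half the data set. Because 20
minutes is one standard deviation above the mean, and
about 68% of a normal distribution lies within one
standard deviation of the mean, the
times from 16 minutes to 20 minutes represent an
additional 34% of the data values. This means that 84% of
drivers wait 20 minutes or less. The remaining drivers,
about 16%, wait 20 minutes or longer.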
D.10. Based on the information given, we know that
approximately 2.5% of all drivers wait at least x
minutes. What value of x makes this true?
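This value, 2.5% of
all drivers, represents the 2.5% longest waiting times in
the data set. About 95% of a normal distribution lies
within 2 standard deviations of the mean, leaving about
2.5% in each tail, so these data values lie at and
beyond 2 standard deviations above the mean. This value
is 16 + 2(4) = 24 minutes, the desired value of x.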
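The answers to D.8 through D.10 rest on the familiar 68% and 95% benchmarks for a normal distribution. For comparison, this Python sketch computes the same quantities from the normal model itself, using NormalDist from the standard library's statistics module. The variable names are my own. The small differences from 16% and 24 minutes arise because the benchmarks are rounded.

```python
from statistics import NormalDist

# Waiting times: mean 16 minutes, standard deviation 4 minutes.
waiting = NormalDist(mu=16, sigma=4)

p_at_most_16 = waiting.cdf(16)       # D.8
p_at_least_20 = 1 - waiting.cdf(20)  # D.9
x_top_2_5 = waiting.inv_cdf(0.975)   # D.10: 97.5% wait less than this

print(p_at_most_16)             # 0.5
print(round(p_at_least_20, 4))  # about 0.1587
print(round(x_top_2_5, 2))      # about 23.84
```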

3, 3, 4: 10 points
total

E.

This problem requires you to design and carry out a
simulation. The situation is first described to you and
then several questions are asked related to the
simulation.
My nephew Seth noticed that Kellogg's cereals
offered a set of 3 cartoon characters in its
current cereal selections. One cartoon character is in
each specially marked box of cereal and the cartoon
characters are equally distributed among the cereal
boxes currently coming off the production line. Seth
wondered how many boxes of cereal he'd have to
purchase to get the entire set of cartoon characters.
Design and carry out a simulation to address Seth's
question. Assume that one trial of your simulation
will determine the number of boxes of cereal he must
purchase to get a complete set of 3 cartoon
characters.
The solution
presented here represents an example solution using one
method and the outcomes of specific trials I carried out.
Other solutions will result from different (equivalent)
models and the specific outcomes of your
trials.
E.11. Describe the model you will use to simulate this
situation. In your description:
 E.11.a. Indicate how you will generate random
outcomes.

Random outcomes
will be generated by the random number generator of a
TI-83 graphing calculator.

 E.11.b. Specify the decisions you will make based
on the random outcomes.

I will produce
random integers from the set {1,2,3}. Each value
represents one of the three possible cartoon
characters in the boxes. When a 2 shows up as the
random number, that represents Seth getting the second
of three unique cartoon characters.

 E.11.c. Justify that your model accurately
represents the situation.

The three
characters are equally distributed in the boxes, so
each is equally likely to be contained in a box
selected at random off the shelf. The random numbers
from the set {1,2,3} generated by the TI-83 are
assumed to be equally likely to occur. The model for
generating random numbers matches the probabilistic
situation within the context of the
problem.
E.12. Show the details of one trial of your
simulation. Include:
 E.12.a. a list of the random outcomes you
generated for one trial,

One trial: 3, 2,
2, 3, 1

 E.12.b. the decisions you made based on the random
outcomes, and

The 3 represents
the third of the three characters, the 2 the second,
the 1 the first. I kept generating random numbers only
until I had at least one of each.

 E.12.c. the number of cereal boxes required to get
a complete set of cartoon characters, based on this
single trial.

 The above
trial meant that it required 5 boxes to generate a
complete set of three unique cartoon
characters.
E.13. Carry out at least 10 trials of this simulation.
Use the results to answer Seth's original question:
How many boxes of cereal will he have to purchase to
get the entire set of cartoon characters?
The 10 trials,
carried out as described here, required the following
numbers of boxes to complete the set: 5, 8, 6, 6, 3, 5,
4, 7, 9, 5. Thus, 58 boxes were required in all, an
average of 5.8, or approximately 6, boxes per trial.
I would tell Seth
that he may get lucky and only require 3 boxes, but it's
more likely it will require 5 or 6 boxes, and perhaps
even more.
To be more sure of
our results we could run the simulation for 100 trials or
1000 trials and look at the distribution of the required
number of boxes of cereal. Ten trials are not enough to
make confident predictions.
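Running many more trials is tedious by hand but easy by program. The Python sketch below follows the same model described in E.11, with random integers from {1,2,3} drawn until all three have appeared. It uses Python's random module in place of the TI-83. The seed, the trial count of 10,000, and the function name are my own choices.

```python
import random


def boxes_needed(rng: random.Random, characters: int = 3) -> int:
    """One trial: buy boxes until every character has appeared at least once."""
    seen = set()
    boxes = 0
    while len(seen) < characters:
        seen.add(rng.randint(1, characters))  # each character equally likely
        boxes += 1
    return boxes


rng = random.Random(83)  # arbitrary seed so the run is repeatable
trials = [boxes_needed(rng) for _ in range(10_000)]
mean_boxes = sum(trials) / len(trials)
print(min(trials), mean_boxes, max(trials))
```

For three equally likely characters the theoretical mean is 3(1 + 1/2 + 1/3) = 5.5 boxes, so a long run of trials should average close to that. This is consistent with the estimate of 5 or 6 boxes from the 10 trials above.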

3, 3, 4: 10 points
total

BONUS!
Assume that 2% of the population is on drugs. A test
is 98% accurate in indicating whether or not a person is
on drugs. This means that people on drugs will test
positive* 98% of the time and people not on drugs will
test negative* 98% of the time.
Determine the probability that a person is on drugs
given that the person's test result is positive. Provide
clear and specific evidence to support your response.
If you're
interested in getting to the solution of this problem,
start by giving yourself a population of 100,000 people.
Now divide the group into those using drugs and those not
using drugs. Further divide the groups into those who
test positive and those who test negative for drug use.
You should be able to get a box of data that looks
something like this, with values in each of the four
cells:
100,000 people in all     positive test for drug use   negative test for drug use
those on drugs            ?                            ?
those not on drugs        ?                            ?

Next, try to
determine what cells of the table relate to the question.
What portion of those who test positive are on
drugs?
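If you would rather let a program fill in the box, here is one possible Python sketch of the approach just described. The function name and the dictionary layout are my own assumptions. The population, the 2% rate, and the 98% accuracy come from the problem. Run it to see the four cell counts and the resulting portion.

```python
def drug_test_table(population=100_000, prevalence=0.02, accuracy=0.98):
    """Expected number of people in each cell of the 2-by-2 box."""
    users = population * prevalence
    non_users = population - users
    return {
        ("on drugs", "positive"): users * accuracy,
        ("on drugs", "negative"): users * (1 - accuracy),
        ("not on drugs", "positive"): non_users * (1 - accuracy),
        ("not on drugs", "negative"): non_users * accuracy,
    }


table = drug_test_table()
for cell, count in table.items():
    print(cell, round(count))

# Only the "positive" column relates to the question.
positives = table[("on drugs", "positive")] + table[("not on drugs", "positive")]
p_on_drugs_given_positive = table[("on drugs", "positive")] / positives
print(p_on_drugs_given_positive)
```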
*A positive test means the test results
indicate the person is on drugs; a negative test means
the test results indicate the person is not on
drugs.
