Illinois State University Mathematics Department

 MAT 312: Probability and Statistics for Middle School Teachers Spring 1999 9:35 - 10:50 am TR STV 350A Dr. Roger Day (day@ilstu.edu)

 Possible Solutions to Problem Set #3

A. Car maintenance is not my favorite activity. Even the mundane chore of changing the oil is a pain. I usually wait longer than recommended to change the oil in my car. The owner of a local service station wanted to know whether other drivers had such bad habits, so he conducted a survey of his records to determine the length of time (in months) between customer oil changes. He randomly sampled station records and found the following 15 times (in months) between oil changes:

 6 6 24 8 6 6 16 6 12 18 8 4 12 12 6
1. Compute the sample mean and the sample standard deviation for this data.

The sample mean is 10 months and the sample standard deviation is 5.6569 months.

2. Determine the 5-number summary for this data.

The 5-number summary (minimum/Q1/median/Q3/maximum) is 4/6/8/12/24 months.

3. Suppose the data value 24 months was incorrectly copied from station records. It should have been 4 months. Which measure of central tendency, the mean or the median, will be more affected by this change? Explain.

With 24 months in the data set, the mean is 10 and the median is 8. With 24 corrected to 4, the mean changes to 8.67 months and the median changes to 6 months. Thus, the mean drops by 1.33 months and the median drops by 2 months. These calculations show that the median is more affected by the error correction. This seems contrary to what we have stated about the mean being more influenced by extreme values. Because of the make-up of the data set, however, our results are atypical. The principal factors are (1) the change caused by the correction, a decrease of 20 months, is spread out over the 15 values (hence the net change in the mean of 1 and one-third months: 20/15); (2) the change from 24 months to 4 months was a move from one extreme to the other, so the physical middle of the data set did change, and by the nature of the data set, the placement of the physical middle changed from 8 months to 6 months.
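The calculations in questions 1-3 can be checked with a short Python sketch. The helper name five_number_summary is hypothetical, and the quartile rule it uses (medians of the lower and upper halves, leaving out the middle value when the count is odd) is an assumption chosen because it reproduces the 4/6/8/12/24 summary; other quartile conventions can give slightly different values.

```python
import statistics

# Months between oil changes, as listed in the problem.
times = [6, 6, 24, 8, 6, 6, 16, 6, 12, 18, 8, 4, 12, 12, 6]

def five_number_summary(data):
    """Min, Q1, median, Q3, max.

    Assumed convention: Q1 and Q3 are the medians of the lower and
    upper halves, with the middle value excluded when n is odd.
    """
    s = sorted(data)
    half = len(s) // 2
    lower = s[:half]
    upper = s[half + 1:] if len(s) % 2 else s[half:]
    return (s[0], statistics.median(lower), statistics.median(s),
            statistics.median(upper), s[-1])

mean = statistics.mean(times)          # arithmetic mean
sd = statistics.stdev(times)           # sample standard deviation (divisor n - 1)
summary = five_number_summary(times)

# Question 3: replace the miscopied 24 with 4 and compare.
corrected = [4 if t == 24 else t for t in times]
mean_drop = mean - statistics.mean(corrected)
median_drop = statistics.median(times) - statistics.median(corrected)

print(mean, round(sd, 4), summary)
print(round(mean_drop, 2), median_drop)
```

Here mean_drop is 20/15, the 20-month correction spread over 15 values, while median_drop reflects the shift in the middle position of the sorted list.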

B. The table here shows cigarette consumption per adult per year and the number of deaths per 100,000 people per year from coronary heart disease (CHD) for 21 developed countries.

4. Draw a scatter plot of the data, using cigarette consumption on the horizontal axis.

5. Use the median-median line technique to determine a regression line for the data. Follow these steps.

a) Show the scatter plot partitioned into three sets of seven points each.

b) On your scatter plot, identify the median-median point in each of the three sections.

c) Calculate and state the slope of the line passing through the first and third median-median points.

The slope is found using points M1 and M3 shown in the scatter plot above: (212-114)/(3220-1410) = 98/1810 = 49/905 (approximately 0.05414).

d) Identify the ordered pair that is the point in the middle section through which the regression line should pass. Describe how you determined this point.

An equation for the line containing points M1 and M3 is y-114 = (49/905)(x-1410). When x = 1810, the value of x for the ordered pair M2, y = 135+(119/181). The actual value of y for M2 is 125, so there is a residual value of 10+(119/181) between these two y values. We take one-third of this residual and adjust the y value on the line by that amount. This amounts to an adjustment of 3+(100/181), which we subtract from y = 135+(119/181) to move closer to y = 125. This makes the y value 132+(19/181). Therefore, the desired ordered pair in the middle section through which the median-median line must pass is (1810, 132+(19/181)).

e) Write the equation of this median-median line.

Using the slope calculated in (c) and the point determined in (d), the line is y = (49/905)x + 34 + 19/181. This is approximated by y = 0.05414x + 34.10497.
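The arithmetic in parts (c) through (e) can be reproduced exactly with Python's fractions module. This is a minimal sketch that starts from the three median-median points quoted in the solution, M1 = (1410, 114), M2 = (1810, 125), and M3 = (3220, 212); it does not recompute those points from the raw table, and the variable names are illustrative only.

```python
from fractions import Fraction

# Median-median points as stated in the solution.
M1 = (Fraction(1410), Fraction(114))
M2 = (Fraction(1810), Fraction(125))
M3 = (Fraction(3220), Fraction(212))

# (c) Slope of the line through M1 and M3.
slope = (M3[1] - M1[1]) / (M3[0] - M1[0])

# (d) Height of that line at M2's x-value, and its gap from M2's actual y.
y_on_line = M1[1] + slope * (M2[0] - M1[0])
residual = y_on_line - M2[1]

# Shift the line one-third of the way toward M2.
adjusted_y = y_on_line - residual / 3

# (e) Intercept of the shifted line, which keeps the same slope.
intercept = adjusted_y - slope * M2[0]

print("slope      =", slope, "=", float(slope))
print("adjusted y =", adjusted_y, "=", float(adjusted_y))
print("intercept  =", intercept, "=", float(intercept))
```

Working in exact fractions avoids rounding drift, so the mixed-number values in the solution (for example 34 + 19/181 = 6173/181) can be compared digit for digit.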

6. Use your median-median regression line equation (question 5e) to determine:

a) the predicted number of deaths from CHD in a country where the adult consumption of cigarettes is 3000 cigarettes per year.

Using the equation in (5e), for 3000 cigarettes consumed per year, the predicted death rate from CHD, for adults aged 35-64, is between 196 and 197 deaths per 100,000.

b) the predicted number of cigarettes consumed per year by adults in a country where the CHD mortality rate is 150 deaths per 100,000 people.

Again using the equation in (5e), for a country with 150 CHD deaths per 100,000 adults aged 35-64, the per capita cigarette consumption is between 2140 and 2141.
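Both predictions come from the same line, used forward in (a) and solved for x in (b). The sketch below assumes the median-median equation y = (49/905)x + 34 + 19/181 from question 5e; the function names predict_deaths and predict_cigarettes are hypothetical.

```python
from fractions import Fraction

SLOPE = Fraction(49, 905)
INTERCEPT = 34 + Fraction(19, 181)

def predict_deaths(cigarettes):
    """CHD deaths per 100,000 predicted from per capita cigarette consumption."""
    return SLOPE * cigarettes + INTERCEPT

def predict_cigarettes(deaths):
    """Invert the line: consumption that corresponds to a given CHD death rate."""
    return (deaths - INTERCEPT) / SLOPE

deaths_at_3000 = predict_deaths(3000)
cigarettes_at_150 = predict_cigarettes(150)

print(float(deaths_at_3000))
print(float(cigarettes_at_150))
```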

7. Describe the meaning of the slope of the median-median line. What does it represent relative to the situation described?

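The slope of (49/905) suggests that for each increase of 905 cigarettes in annual per capita consumption, the predicted CHD death rate per 100,000 adults increases by 49. Equivalently, each additional 100 cigarettes per adult per year corresponds to about 5.4 more predicted CHD deaths per 100,000.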

8. Describe the meaning of the vertical-axis intercept of the median-median line. What does it represent relative to the situation described? Is it reasonable and appropriate? Explain.

The vertical-axis intercept, just over 34, corresponds to 34 CHD deaths per 100,000 adults when no cigarettes are smoked. Although this would be an extrapolation outside the range of the data, it is a real possibility, in that there are likely to be other causes of CHD so that despite smoking no cigarettes, people will die of CHD.

One may raise the question of whether it's possible that a country would have a per capita rate of 0 cigarettes!

9. Use your calculator to determine the least-squares regression line for this data.

The TI-83 calculates the least-squares linear regression to be approximately y=0.0602x +15.6415.

10. On your scatter plot (or a copy of it), plot the regression lines determined in (5) and (9). Compare the effectiveness of the lines for prediction purposes. Which one is more effective? Justify your response.

 The scatter plot here shows the median-median line in red and the least-squares linear regression line in blue. As you can see, the two lines are quite similar. By examining output values for various input values, we can see that similarity numerically as well. For 1000 cigarettes consumed per capita, the MM line predicts 88 CHD deaths per 100,000 while the LS line predicts 76 deaths. At 2000 cigarettes, the MM line predicts 142 deaths and the LS line predicts 136. At 3000 cigarettes, the MM line predicts 197 deaths and the LS line predicts 196. Finally, at 4000 cigarettes, the MM line predicts 251 deaths and the LS line predicts 256 deaths. All in all, these are very close.
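The numerical comparison can be tabulated with a few lines of Python. This sketch uses the rounded coefficients quoted in the solutions (0.05414 and 34.10497 for the median-median line, 0.0602 and 15.6415 for the least-squares line); the function names mm_line and ls_line are illustrative, and the four input values are the ones discussed in the text.

```python
def mm_line(x):
    """Median-median line, using the rounded coefficients from question 5e."""
    return 0.05414 * x + 34.10497

def ls_line(x):
    """Least-squares line, using the rounded coefficients from question 9."""
    return 0.0602 * x + 15.6415

# Predicted CHD deaths per 100,000, rounded to whole deaths.
rows = [(x, round(mm_line(x)), round(ls_line(x)))
        for x in (1000, 2000, 3000, 4000)]

for x, mm, ls in rows:
    print(f"{x:>5} cigarettes: MM {mm:>3}  LS {ls:>3}  difference {mm - ls:+d}")
```

The difference column changes sign between 3000 and 4000 cigarettes, which is consistent with the two lines crossing in that interval.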

 C. Counting Problems

Questions 11-13: All automobile license plates in Minnesota have six characters. The characters can be letters of the alphabet (A, B, . . . , Z) or single-digit numbers (0, 1, . . . , 9).

11. If a license plate has four letters followed by two digits, with repetition allowed, how many different license plates are possible?

There are 26 choices for each letter and 10 choices for each digit. By the multiplication principle, we have 26*26*26*26*10*10 = (26^4)*(10^2) = 45,697,600 different license plates.

12. If a license plate again has four letters followed by two digits, this time with repetition not allowed, how many different license plates are possible?

Now we have 26 choices for the first letter, 25 for the next, followed by 24 and 23 choices. There are 10 choices for the first digit and 9 for the second. This yields 26*25*24*23*10*9 = 32,292,000 different license plates.

13. If a license plate has three letters followed by three digits, with no repetitions allowed and no use of either the letter O or the digit 0, how many different license plates are possible?

We start now with 25 letters and 9 digits available, and repetition is not allowed. This gives us 25*24*23*9*8*7 = 6,955,200 possible plates.
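The three license-plate counts follow directly from the multiplication principle and can be evaluated in Python. This is a small sketch using math.perm (available in Python 3.8 and later) for the no-repetition cases; the variable names are illustrative.

```python
import math

LETTERS, DIGITS = 26, 10

# Q11: four letters then two digits, repetition allowed.
q11 = LETTERS**4 * DIGITS**2

# Q12: same layout, no repetition, so ordered selections without replacement.
q12 = math.perm(LETTERS, 4) * math.perm(DIGITS, 2)

# Q13: three letters then three digits, no repetition,
# with the letter O and the digit 0 removed from the pools.
q13 = math.perm(LETTERS - 1, 3) * math.perm(DIGITS - 1, 3)

print(f"{q11:,}  {q12:,}  {q13:,}")
```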

Questions 14-15: From a class of 20 students, a committee of 3 is to be formed to organize a fund raiser.

14. If all 20 students are eligible for committee membership, how many different committees could be formed?

Because we are not concerned about any specific committee offices being filled nor about the order in which committee members are chosen, this is a selection problem and we use a combination to express the number of possibilities: C(20,3) = 1140.

15. If there are 12 women and 8 men in the class, and the committee requires at least one member of each gender, how many different committees could be formed?

There are now two ways to form a committee: two women and one man, or one woman and two men. The first can be done in C(12,2)*C(8,1) ways, and the second in C(12,1)*C(8,2) ways. Because those two ways have nothing in common (a committee with two women and one man cannot be the same committee as one with two men and one woman), we add the two possibilities to get a total of C(12,2)*C(8,1) + C(12,1)*C(8,2) = 528 + 336 = 864 ways to create the desired committee. You should convince yourself that the following answer is WRONG: C(12,1)*C(8,1)*C(18,1), where the INCORRECT REASONING is to choose one woman from 12, choose one man from 8, and choose one more from the remaining 18.
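One way to convince yourself about question 15 is to list every committee and count. The sketch below labels the students with hypothetical tags ("W", i) and ("M", i), enumerates all 3-person committees with itertools.combinations, and compares the brute-force count with the combination formula and with the incorrect product.

```python
import math
from itertools import combinations

# Hypothetical labels: 12 women and 8 men.
students = [("W", i) for i in range(12)] + [("M", i) for i in range(8)]

# Q14: every unordered committee of 3.
all_committees = list(combinations(students, 3))

# Q15: keep only committees that include both genders.
mixed = [c for c in all_committees if {g for g, _ in c} == {"W", "M"}]

formula = math.comb(12, 2) * math.comb(8, 1) + math.comb(12, 1) * math.comb(8, 2)
wrong = 12 * 8 * 18   # one woman, one man, then anyone from the remaining 18

print(len(all_committees), len(mixed), formula, wrong)
```

In this enumeration the incorrect product comes out to exactly twice the true count, a hint that the faulty reasoning counts each committee more than once.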

Questions 16-21: Use Set I: {d,g,o}, Set II: {c,e,h,i,k,n}, Set III: {a,r,s,t}

16. One letter is to be chosen from Set I or Set II or Set III. How many possible choices exist?

There are no duplications among the elements in the three sets, so we add the number of elements in each set: 3+6+4=13 choices.

17. A 3-letter set is to be created consisting of one letter from each of Sets I, II, and III. How many such 3-letter sets are possible?

We choose one letter from each set, which can be done in 3*6*4=72 ways.

18. How many 6-letter arrangements are possible using the letters in Set II?

This is a permutation of the letters in Set II. There are P(6,6)=6!=720 such arrangements.

19. How many 2-letter sets can be made from the letters in Set III?

This is a selection, not an arrangement, so we use combinations: C(4,2)=6 such sets can be made.

20. How many 5-letter arrangements can be made such that the first and last letters are from Set I and the other three letters are from Set III?

We choose and arrange 2 of the 3 letters from Set I (done in P(3,2)=6 ways) and we choose and arrange 3 of the 4 letters from Set III (done in P(4,3)=24 ways). This results in P(3,2)*P(4,3)=6*24=144 possible 5-letter arrangements.

21. How many arrangements of the letters in Set II can be made so that no vowels are adjacent to each other?

In Set II there are two vowels and four consonants. We use the consonants as "fences" to separate the vowels. We do this by first arranging the four consonants. This can be done in P(4,4)=4! ways. With the consonants arranged, there now are five "spaces" created among those consonants, including at the ends of the line-up. We therefore have 5 spaces to choose from in placing the first vowel and 4 spaces remaining into which we place the second vowel. Thus, the vowels can be placed in 5*4 ways. Together, there are 4!*5*4=480 arrangements.
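Questions 20 and 21 are small enough to check by listing every arrangement. The sketch below is a brute-force count with itertools.permutations, assuming (as the solutions do) that no letter is reused within an arrangement; the variable names are illustrative.

```python
from itertools import permutations

set_i, set_ii, set_iii = "dgo", "cehikn", "arst"
VOWELS = set("aeiou")

# Q20: first and last letters from Set I, three middle letters from Set III.
q20 = sum(1
          for first, last in permutations(set_i, 2)
          for middle in permutations(set_iii, 3))

# Q21: arrangements of all six Set II letters with no two vowels side by side.
def has_adjacent_vowels(word):
    return any(a in VOWELS and b in VOWELS for a, b in zip(word, word[1:]))

q21 = sum(1 for word in permutations(set_ii) if not has_adjacent_vowels(word))

print(q20, q21)
```

The Q20 loop visits P(3,2) = 6 ordered pairs of end letters and P(4,3) = 24 ordered middle triples, and the Q21 count agrees with the "fences" argument: 720 total arrangements minus the 2*5! = 240 in which e and i sit together.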

Questions 22-24: A postal worker entered Complex A of Reeseman Apartment Marketway and approached the row of mail boxes in the hallway. In a slot on each box was a label showing the name of the apartment dweller and a two-digit number for the apartment, as shown below.

 J. Hawks, Apt 10
 M. & M. Sweets, Apt 12
 U. R. Bigg, Apt 14
 T. Off, Apt 16
 S. Teemed, Apt 18
 Y. R. U. Heer, Apt 20

22. How many letters must the postal worker have to assure that at least one of the apartment dwellers would be delivered at least two letters?

The worst case is that the first 6 letters all go to different mail boxes. The seventh one, however, would most certainly assure that someone among the six apartment dwellers gets at least two letters. Therefore, 7 letters are required.

23. What is the least number of letters the postal worker must have to assure that at least one of the apartment dwellers will be delivered more letters than his or her apartment number?

The worst case is that each apartment dweller gets exactly the number of letters equal to his or her apartment number. This would require 90 letters (10+12+14+16+18+20). The next letter, however, would achieve the desired result. Therefore, 91 letters are required.

24. Suppose that in Building K of the apartment complex there are n apartments, numbered with consecutive even numbers 2 through 2n. How many letters must the letter carrier have to assure that at least one of the apartment dwellers in Building K would be delivered at least m letters?

The worst case is that every one of the n apartments gets (m-1) letters. The next letter would assure us of the desired condition. We would need n(m-1)+1 letters.
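The worst-case reasoning in questions 22-24 can be written as two small formulas and spot-checked by brute force on a tiny case. In this sketch the function names letters_needed, letters_to_exceed, and min_possible_max are hypothetical, and the exhaustive check is only practical for very small numbers of boxes and letters.

```python
from itertools import product

def letters_needed(n_boxes, m):
    """Fewest letters that force some box to receive at least m (question 24)."""
    return n_boxes * (m - 1) + 1

def letters_to_exceed(apartment_numbers):
    """Fewest letters that force some box above its own number (question 23)."""
    return sum(apartment_numbers) + 1

def min_possible_max(n_boxes, n_letters):
    """Smallest 'fullest box' count over every way to deliver the letters."""
    best = n_letters
    for assignment in product(range(n_boxes), repeat=n_letters):
        fullest = max(assignment.count(box) for box in range(n_boxes))
        best = min(best, fullest)
    return best

print(letters_needed(6, 2))                          # question 22
print(letters_to_exceed([10, 12, 14, 16, 18, 20]))   # question 23

# Spot check of question 24 with 3 boxes and m = 3.
k = letters_needed(3, 3)
print(min_possible_max(3, k - 1), min_possible_max(3, k))
```

The spot check shows that with one letter fewer than the formula gives, some delivery keeps every box below m, while at the formula's value no delivery can.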

Question 25: A fast-food chain advertises that it serves hamburgers in more than 1000 ways. The chain offers its burgers with various combinations of mustard, catsup, mayonnaise, Cajun spices, pickles, lettuce, cheese, and type of bun.

25. Is the chain's advertising claim legitimate? Explain.

In this discussion, I assume the following possibilities, which may differ slightly from yours.

For each of the offerings described above, there are a certain number of choices. Most often, there are two choices, either to have an item on the sandwich or to not have it. I assume this is the case for mustard, catsup, mayonnaise, Cajun spices, pickles, and lettuce. So for each of these 6 items there are two choices. Together, then, there are 2*2*2*2*2*2=2^6=64 ways to create a sandwich with or without these 6 items.

There may be other options for cheese and type of bun. For instance, there may be three kinds of cheese offered, and, of course, you can still have the cheese withheld. For the bun, I assume that having no bun is not an option, but that there are two or more choices for type of bun.

Now, to have more than 1000 ways you can get your burger, we must have 64*c*b>1000, where c represents the number of choices for cheese and b represents the number of types of buns. Dividing 1000 by 64 shows that c*b must be 16 or larger.

Therefore, the following possible pairs (c,b) will work, as will an infinite number of others:

 Some possible (c,b) pairs that seem reasonable for this situation: (2,8), (3,6), (3,7), (4,4), (4,5), (5,4), (6,3), (7,3), (8,2)

For instance, the first entry indicates there are two choices for cheese (either have the one kind they offer or have no cheese) and 8 types of buns.
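The condition 64*c*b > 1000 can be searched directly. The sketch below keeps the assumption of six yes/no toppings and, purely for illustration, limits c and b to the range 2 through 8; that upper bound is my own hypothetical cap, not something stated by the chain.

```python
TOPPING_WAYS = 2**6   # six toppings, each either on or off

# All (cheese choices, bun choices) pairs in the assumed range that
# push the total number of burgers past 1000.
pairs = [(c, b)
         for c in range(2, 9)
         for b in range(2, 9)
         if TOPPING_WAYS * c * b > 1000]

smallest_product = min(c * b for c, b in pairs)

print(smallest_product)
print(pairs)
```

Since 64*15 = 960 falls short and 64*16 = 1024 does not, the smallest workable product c*b is 16, which agrees with dividing 1000 by 64.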

So my explanation, given the assumptions described here, shows that the claim of more than 1000 ways to get a burger is both possible and reasonable.

 Assigned: Tuesday 16 March 1999
 Due: Tuesday 30 March 1999