Honors Introduction to Statistics

Practice Questions for Exam 2

 

1.  The Admissions Office has developed a new 10-minute video to send to prospective students to extol the virtues of attending City U.  Before mass-producing the tape, they would like to test whether it is more effective than the current video.  Suppose that we have 12 high school student volunteers who have agreed to take part in an experiment.  The factor to be studied is the video, with two levels, OLD and NEW.

 

(a) Carefully describe an example of a statistical experiment that could be applied to this situation.  Give explicit instructions on what the 12 students should do and be sure to indicate how randomization is used as part of your experiment.

(b) What specific question would you ask to measure a response variable in this experiment?

(c) Would you classify your response variable as categorical or quantitative?

(d) Would you classify the experiment you have described as a randomized comparative experiment, a matched pairs design, or something else?  Explain briefly.  Why is your type of design the best choice?
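
One way the randomization step in Question 1 might be carried out is sketched below. It assumes a completely randomized design with 6 students per video, which is only one of several acceptable answers; the ID numbers 1 to 12, the function name, and the seed are hypothetical and appear only for illustration.

```python
import random

def assign_videos(student_ids, seed=None):
    """Randomly split the volunteers into two equal groups, one per video."""
    rng = random.Random(seed)        # a seed is used only to make the example repeatable
    shuffled = list(student_ids)
    rng.shuffle(shuffled)            # the random ordering is the randomization step
    half = len(shuffled) // 2
    return {"OLD": sorted(shuffled[:half]), "NEW": sorted(shuffled[half:])}

students = list(range(1, 13))        # hypothetical ID numbers for the 12 volunteers
groups = assign_videos(students, seed=2024)
print(groups)
```

Each student would then watch only the assigned video and answer the same response question afterward.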

 


2.  The age distribution of students at City U is modeled by the distribution shown to the right.

(a)  Approximate the median student age on the graph based on the distribution.  Explain how you made your approximation.

(b)  Do you expect the mean student age to be higher or lower than the median?  Explain briefly.  Approximate the mean student age, based on the distribution.

(c)  If we took random samples of size 5 from the student population, computed the average age within the sample, and looked at the distribution of these averages, would you expect the mean for the new distribution to be larger than, smaller than, or the same as, the mean you estimated in Part (b)?  Explain briefly.

(d)  If we took random samples of size 5 from the student population, computed the average age within the sample, and looked at the distribution of these averages, would you expect the standard deviation for the new distribution to be larger than, smaller than, or the same as, the standard deviation of the original distribution shown above?  Explain briefly.

 

3.  Despite the difficulties, it is sometimes possible to build a strong case for causation in the absence of experiments.  The evidence that smoking causes lung cancer is about as strong as non-experimental evidence can be.  What criteria are necessary to suggest causation when we cannot do an experiment? 

 

4.  You are interested in determining the level of student support for student government activities.  Create a question that is clearly biased, and one that is (to the extent possible) not biased.  Briefly explain how you expect responses to the two questions to differ.

 

5. A study of education followed a large group of fifth-grade children to see how many years of school they eventually completed.  Let X be the highest year of school that a randomly chosen fifth grader completes.  (Students who go on to college are included in the outcome X = 12.)  The study found the following probability distribution for X.

Years         4      5      6      7      8      9      10     11     12
Probability   0.010  0.007  0.007  0.013  0.032  0.068  0.070  0.041  0.752

 

(a)  Carefully explain how you know this is a legitimate probability distribution.
(b)  What percent of fifth graders eventually finished 12th grade?

(c)  Explain what P(X = 4) = 0.010 means in terms of children completing school.

(d)  Find P(X 6).

(e)  Find the probability that a randomly chosen 5th grader finishes 12th grade, given that the student finished 9th grade.
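
The checks asked for in Question 5 can be sketched in a few lines. The snippet below assumes that "finished 9th grade" means X is at least 9; the dictionary name `pmf` and the helper `prob` are hypothetical names chosen for this illustration.

```python
import math

# Probability distribution for X, copied from the table in Question 5.
pmf = {4: 0.010, 5: 0.007, 6: 0.007, 7: 0.013, 8: 0.032,
       9: 0.068, 10: 0.070, 11: 0.041, 12: 0.752}

def prob(event):
    """Add up the probabilities of all outcomes x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))

# A legitimate distribution has every probability in [0, 1] and a total of 1.
all_valid = all(0 <= p <= 1 for p in pmf.values())
total = prob(lambda x: True)

# Conditional probability: P(X = 12 | X >= 9) = P(X = 12) / P(X >= 9).
# Assumption: "finished 9th grade" is read as X >= 9.
p_12_given_9 = prob(lambda x: x == 12) / prob(lambda x: x >= 9)

print(all_valid, math.isclose(total, 1.0), round(p_12_given_9, 4))
```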

 

6.  Generate two random numbers between 0 and 1 and take Y to be their sum.  The sum Y can take any value between 0 and 2.  The density curve looks like a triangle with base from 0 to 2.

(a) Sketch a graph of the density curve.
(b) What is the height of the triangle?  How do you know?

(c) What is the probability that Y is less than 1?  (Shade the area that represents the probability on your density curve, then find that area.)

(d) What is the probability that Y is less than 0.5?  (Again, shade the corresponding area on your density curve.)
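
The areas in Question 6 can be checked both by geometry and by simulation. The sketch below assumes the two random numbers are independent and uniform on 0 to 1; the function names, the seed, and the number of trials are hypothetical choices. Since the total area must be 1 and the base is 2, the triangle's height is 1, and the area to the left of y follows from the triangle area formula.

```python
import random

def triangle_cdf(y):
    """Area under the triangular density to the left of y (peak of height 1 at y = 1)."""
    if y <= 0:
        return 0.0
    if y <= 1:
        return y * y / 2              # small triangle with base y and height y
    if y < 2:
        return 1 - (2 - y) ** 2 / 2   # total area 1 minus the right-hand triangle
    return 1.0

def simulate(threshold, trials=100_000, seed=1):
    """Estimate P(Y < threshold) by adding two uniform random numbers many times."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if rng.random() + rng.random() < threshold)
    return hits / trials

for y in (1.0, 0.5):
    print(y, triangle_cdf(y), simulate(y))
```

The simulated proportions should land close to the exact areas, which is a useful sanity check on the shaded regions.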

 

7.  Tetrahedral dice are shaped like pyramids, with 4 triangular faces, each of which is an equilateral triangle (all sides have the same length).  Assume each die has sides labeled 1, 2, 3 and 4.  When you roll a tetrahedral die, you “roll” the number on the down face.

(a) Give a probability model for rolling two such dice. 

(b) What is the probability the sum of the down-faces is 5?
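
A probability model for Question 7 can be written out by listing every outcome. The sketch below assumes both dice are fair and rolled independently, so each of the 16 ordered pairs is equally likely; the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4]

# Sample space: all ordered pairs (first die, second die), each with probability 1/16.
outcomes = list(product(faces, repeat=2))
model = {outcome: Fraction(1, len(outcomes)) for outcome in outcomes}

# P(sum = 5): add the probabilities of the pairs whose down-faces total 5.
p_sum_5 = sum(p for (a, b), p in model.items() if a + b == 5)

print(len(outcomes), p_sum_5)
```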

 

8.  A bottling company uses a filling machine to fill glass bottles with beer.  The bottles are supposed to contain 300 ml.  In fact, contents vary according to a normal distribution with mean  ml and standard deviation ml.

a.  What is the probability that an individual bottle contains less than 295 ml?
b.  What is the probability that the mean contents of the bottles in a six-pack is less than 295 ml?
c.  What important result guarantees the difference between the previous two probabilities?

9.  The carapace lengths (in mm) of 15 mature gopher tortoises randomly selected from the preserve in Abacoa are shown below.

 

320      295      284      303      315      308      303      305     

272      315      291      294      276      318      278

               

a.  Examine these data for shape, center, spread, and outliers. 
b.  We are making three assumptions in our use of inference right now.  List those three assumptions and discuss the degree to which each is or is not met in this situation.

c.  Assuming that the standard deviation of carapace lengths of all mature gopher tortoises in the preserve is σ = 16 mm, give a 95% confidence interval for the mean carapace length of all mature gopher tortoises in the preserve.  Write a complete sentence interpreting the meaning of your interval.  (Your sentence should say something about tortoises!)

d.  Estimate the sample size you would need to compute a 95% confidence interval with a margin of error less than 3 mm.
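
The arithmetic behind parts c and d of Question 9 can be sketched as follows. It takes the stated population standard deviation of 16 mm as known and uses the z interval, mean plus or minus z* times the standard deviation over the square root of n; the variable names are hypothetical. It does not replace the written interpretation the question asks for.

```python
import math
from statistics import NormalDist, mean

# Carapace lengths (mm) of the 15 sampled tortoises, copied from Question 9.
lengths = [320, 295, 284, 303, 315, 308, 303, 305,
           272, 315, 291, 294, 276, 318, 278]

sigma = 16            # population standard deviation given in part c
confidence = 0.95

# Critical value z* for a two-sided 95% interval (about 1.96).
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

x_bar = mean(lengths)
margin = z_star * sigma / math.sqrt(len(lengths))
ci = (x_bar - margin, x_bar + margin)

# Part d: smallest n with z* * sigma / sqrt(n) below the 3 mm target.
target = 3
n_needed = math.ceil((z_star * sigma / target) ** 2)

print(round(x_bar, 2), tuple(round(v, 2) for v in ci), n_needed)
```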

 

10.  A social psychologist reports:  “In our sample, ethnocentrism was significantly higher (P < 0.05) among church attenders than among non-attenders.”  Explain what this means in language understandable to someone who knows no statistics.  Do not use the word “significance” in your answer.

 

11.  A random number generator is supposed to produce random numbers that are uniformly distributed on the interval from 0 to 1.  If this is true, the numbers generated come from a population with mean μ = 0.5 and standard deviation σ = 1/√12 ≈ 0.2887.  Unfortunately, producing a good random number generator is quite difficult, and it is well known that many such generators are not particularly random.  You decide to test Excel’s random number generator by generating 100 random numbers between 0 and 1.  You want to perform a hypothesis test to decide if Excel generates truly random numbers by looking at the mean from your sample.

a.  State your hypotheses.
b.  Suppose the mean of the 100 numbers generated by Excel is .  Calculate the value of the test statistic.  Find the p-value for the test.
c.  Is the result significant at the 5% level?  At the 1% level?
d.  What can you conclude (or not conclude) based on your test?  (Your answer should say something about random numbers!)

12.  True or False

 

________  The probability of an event can be described as the proportion of times the event occurs in many repeated trials of a random phenomenon.

________  Two events are independent when they cannot occur together. 

________  If we compute two confidence intervals, an 80% confidence interval and a 90% confidence interval, based on the same sample, the 80% confidence interval will be narrower.

________  The most important assumption in using techniques of inference is that our samples are SRSs.

________  Significance tests can tell us if the observed effect was likely due to chance.

 

13.  Your mail-order company advertises that it ships 90% of its orders within three working days.  You select an SRS of 100 of the 5000 orders received in the past week for an audit.  The audit reveals that 86 of these orders were shipped on time. 

 

a.  Explain why we expect the number of on-time shipments in an SRS of size 100 to obey a binomial distribution.  What are the relevant parameters?

b.  If the company really ships 90% of its orders on time, what is the probability that 86 or fewer in an SRS of 100 orders are shipped on time?  (Use the normal approximation to the binomial distribution.)
c.  A critic says, “You claim 90%, but in your sample the on-time percentage is only 86%.  So the 90% claim is wrong.”  Explain in simple language why your probability calculation in (b) shows that the result of the sample does not refute the 90% claim.
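
The normal approximation requested in part b of Question 13 can be sketched as below. It treats the count of on-time orders as binomial with n = 100 and p = 0.90 and applies no continuity correction; the exact binomial sum is included only as a comparison and is not required by the question. Variable names are hypothetical.

```python
import math
from statistics import NormalDist

n, p, observed = 100, 0.90, 86

# Mean and standard deviation of the binomial count of on-time orders.
mu = n * p
sd = math.sqrt(n * p * (1 - p))

# Normal approximation to P(X <= 86), without a continuity correction.
p_normal = NormalDist(mu, sd).cdf(observed)

# Exact binomial probability P(X <= 86), shown for comparison only.
p_exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k)
              for k in range(observed + 1))

print(round(mu, 1), round(sd, 2), round(p_normal, 4), round(p_exact, 4))
```

Neither probability is small, which is the point of part c: a sample result of 86 or fewer is not unusual when the true rate is 90%.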

 

14.  Compute the following probabilities, based on a standard deck of 52 cards (no jokers).

a.  Draw one card.  What is the probability you draw a spade?
b.  Draw one card.  What is the probability you draw a jack given that you draw a face card?
c.  Draw two cards.  What is the probability that your second card is a spade, given that the first card you drew was a spade?
d.  Draw two cards.  What is the probability that your second card is a spade, given that the first card you drew was a heart?
e.  Draw two cards.  What is the probability you draw two cards in the same suit?
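
The card probabilities in Question 14 can be verified by brute-force counting. The sketch below assumes face cards are jacks, queens, and kings, and that two cards are drawn without replacement; the helper `cond_prob` and the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))            # 52 (rank, suit) pairs

def cond_prob(outcomes, event, given=lambda o: True):
    """P(event | given) by counting equally likely outcomes."""
    kept = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in kept if event(o)), len(kept))

# One-card draws.
p_a = cond_prob(deck, lambda c: c[1] == "spades")
p_b = cond_prob(deck, lambda c: c[0] == "J",
                given=lambda c: c[0] in {"J", "Q", "K"})

# Two-card draws without replacement: all ordered pairs of distinct cards.
pairs = list(permutations(deck, 2))
p_c = cond_prob(pairs, lambda d: d[1][1] == "spades",
                given=lambda d: d[0][1] == "spades")
p_d = cond_prob(pairs, lambda d: d[1][1] == "spades",
                given=lambda d: d[0][1] == "hearts")
p_e = cond_prob(pairs, lambda d: d[0][1] == d[1][1])

print(p_a, p_b, p_c, p_d, p_e)
```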