Wikipedia:Reference desk/Archives/Mathematics/2013 April 7

Mathematics desk
Welcome to the Wikipedia Mathematics Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


April 7

Confidence interval

The textbook Elementary Statistics by Larson and Farber, 2nd ed., states in the chapter on confidence intervals:

"After constructing a confidence interval, it is important that you interpret the results correctly. Consider [the above example]. Because µ already exists, it is either in the interval or not. It is not correct to say 'There is a 90% probability that the actual mean is in the interval (22.35,23,45).' The correct way to interpret your confidence interval is 'There is a 90% probability that the confidence interval you described contains µ.' [...]" [emphasis in original]

There are a number of differences between these two sentences, but I don't see which of these illustrates an important point. Is it the change from "actual mean" to "µ"? If so, then a better place to say so would be when introducing µ. Or is it "is in" vs. "contains"? That seems to be merely a grammatical change. — Sebastian 04:42, 7 April 2013 (UTC)[reply]

I think the key is "Because µ already exists." As we know µ and know the interval, µ either lies in the interval or it does not: if it does, there is a 100% chance; if it does not, there is a 0% chance.--Salix (talk): 05:58, 7 April 2013 (UTC)[reply]
In this case, we don't know µ yet. But if we did, as the IP editor below writes in their thought experiment, this would indeed be a possible distinction. — Sebastian 06:44, 7 April 2013 (UTC)[reply]
I'm not a specialist in statistics, but it seems to me that what the authors are trying to get across is poorly stated. I agree that both sentences appear to say the same thing. (I've taught classes from Larson's algebra and calculus books - though not statistics - and I don't have a great deal of confidence in the accuracy of their mathematical content, particularly on subtler points.) Here is my understanding of what is meant by a confidence interval.
You might say that, for any particular value of the actual mean µ, there is a 90% chance that if you take a sample and construct an interval in the indicated way, then µ will be in the given interval. The 90% probability is viewed as being prior to taking the sample and determining the confidence interval to be (22.35,23.45). This is different from saying, once the sample has been taken and the interval constructed, that there is a 90% probability that µ is in that interval.
For example, imagine that some purely theoretical considerations, or perhaps earlier reliable studies, make it highly unrealistic that µ actually lies in the calculated confidence interval. Even excluding the possibility of a biased sample or other methodological problems, you might conclude in those circumstances that a rare event probably occurred with your sample. (This makes sense mathematically at least. As to whether this is proper scientific methodology, that's another question.) So you can no longer say at that point that there's a 90% chance that µ is in the confidence interval, since outside factors have convinced you this is unlikely. But before taking the sample, you could have said that there would be a 90% chance that µ would eventually lie in the confidence interval, once the sample was taken and the interval calculated.
I believe the distinction I'm drawing here is typical of the objection made by Bayesian statisticians to some statistical techniques, but I'm getting beyond my level of expertise on this. Other than that, it's hard for me to see what the authors might have in mind. 64.140.121.87 (talk) 06:10, 7 April 2013 (UTC)[reply]
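The "prior to taking the sample" reading described above can be checked with a short simulation. The following Python sketch (the population mean and standard deviation are made-up values, not taken from the textbook) repeatedly draws a sample, builds a 90% interval, and counts how often the interval captures µ:

    # Sketch: coverage of 90% confidence intervals for a mean, sigma known.
    # MU and SIGMA are illustrative assumptions, not from the textbook example.
    import random
    import statistics

    MU, SIGMA = 22.9, 3.0      # assumed "true" population mean and standard deviation
    N, TRIALS = 30, 10_000     # sample size and number of repeated experiments
    Z90 = 1.645                # z-value for a two-sided 90% interval

    covered = 0
    for _ in range(TRIALS):
        sample = [random.gauss(MU, SIGMA) for _ in range(N)]
        xbar = statistics.mean(sample)
        half_width = Z90 * SIGMA / N ** 0.5
        if xbar - half_width <= MU <= xbar + half_width:
            covered += 1

    print(f"Fraction of intervals containing mu: {covered / TRIALS:.3f}")  # about 0.90

The 90% describes this repeated procedure; any one interval, once computed, either contains µ or it does not.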
Thank you, I think you captured what they meant. It is quite a subtle point though, especially since the book isn't particularly rigorous otherwise. For example, in section 4.3 on discrete probability distributions, they have no qualms about telling the reader to "use the fact that the variance of a Poisson distribution is σ² = µ" to solve the following question:

20. Snowfall. The mean snowfall in January for Evansville, Indiana, is 4.0 inches.
(a) Find the variance and standard deviation. Interpret the results.
(b) Find the probability that the snowfall in January for Evansville, Indiana, will exceed seven inches.

(In this example the SD would have the dimension √inch, which doesn't make sense. The absurdity becomes apparent as soon as one ignores the unit: with 4 inches one gets √4 = 2, but with the equivalent 100 mm one gets √100 = 10.) — Sebastian 06:44, 7 April 2013 (UTC)[reply]
That example has worse problems! Why should the snowfall be a discrete variable? And even if it is, you have to know how small the quanta are in order to apply a discrete analysis. (This is the same issue as your units issue, fundamentally.) It strikes me that it would be fascinating, if the measurements could be made sensitive enough, to measure something like the fundamental electric charge by observing the standard deviation of the total charge of many small boxes snatched out of an ionized, rarefied gas. --Tardis (talk) 03:32, 12 April 2013 (UTC)[reply]
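The unit problem raised above can be made concrete in a couple of lines of Python (a sketch; it simply restates the snowfall figure in two unit systems and applies σ = √µ to each):

    # Sketch: applying "variance = mean" (sigma = sqrt(mu)) to a continuous
    # quantity gives a unit-dependent "standard deviation".
    mean_inches = 4.0
    mean_mm = mean_inches * 25.4        # same snowfall in millimetres (~101.6; the post rounds to 100)

    sd_inches = mean_inches ** 0.5      # sqrt(4)     = 2.0  "inches"
    sd_mm = mean_mm ** 0.5              # sqrt(101.6) ~ 10.1 "mm"

    print(sd_inches * 25.4, sd_mm)      # 50.8 vs ~10.1: the two answers disagree,
    # so sqrt(mu) cannot be a well-defined standard deviation here.

The disagreement is exactly the √inch problem: the rule only makes dimensional sense for a dimensionless count, such as the number of discrete snowfall "quanta" Tardis alludes to.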

It's because in the frequentist world, you're not supposed to make probability statements about parameters, as they're not random variates (see the first bullet point in that frequentist link). But you can make probability statements about CIs, because they are random variates. I think some statisticians get quite insistent on this point, but I think they're being pointlessly pedantic :-) Also, it confuses non-specialists for no benefit: practically speaking, there is a 90% chance µ is in the CI. Mmitchell10 (talk) 15:10, 7 April 2013 (UTC)[reply]

64's analysis is quite correct (as is everyone else here). An analogy might help. Imagine you are about to roll a die. Suppose the numbers have worn off slightly, and you are looking at it through a pair of glasses that have been splashed with mud. Before you roll it, you have a 1 in 6 chance of throwing a 2. Afterwards, you look at it, and decide you can't make out whether you are looking at spots on the die, or mud on your glasses. You try to discern the dots, but eventually you give up. Can you say "the chance of a 2 is 1 in 6"? Well, sort of. Because you are looking at the actual die, you would rather say it either is or isn't a 2. There is no longer any real randomness. You just can't see properly. If you could prove beyond doubt that your glasses are caked in mud, and you can see nothing at all, then it is a 1 in 6 chance. But if you can see through the glasses at all, you might be gleaning something from the die. Then you can't talk like that anymore. So it is better to talk more precisely, and talk of confidence intervals as being, a priori, 90% (or 95%) likely to contain the parameter of interest, but not a posteriori, that is, after the data are gathered. IBE (talk) 22:57, 7 April 2013 (UTC)[reply]
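IBE's die analogy can also be put into a few lines of Python (a sketch with an arbitrary noise model standing in for the "muddy glasses"): once the blurry observation carries any information about the face, the conditional probability of a 2 is no longer 1 in 6.

    # Sketch: prior vs. posterior probability of having rolled a 2.
    # The "blurry" observation model below is an arbitrary assumption.
    import random

    TRIALS = 100_000
    seen_as_two = actual_twos = 0
    for _ in range(TRIALS):
        face = random.randint(1, 6)
        blurry = face + random.choice([-1, 0, 0, 1])   # crude mud-on-the-glasses noise
        if blurry == 2:                                # the roller thinks it might be a 2
            seen_as_two += 1
            actual_twos += (face == 2)

    print(f"P(face is 2 | blurry reading of 2) is about {actual_twos / seen_as_two:.2f}, not 1/6")

With this particular noise model the conditional probability comes out near 0.5; the exact number is beside the point, which is that after looking at the die you are conditioning on whatever you did manage to see.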
Thanks a lot, everyone; these were really excellent answers! BTW, the link to Frequentist inference led me to Fiducial inference, which under the bullet "A confidence interval, in frequentist inference, ..." expresses the same idea in a way I find even clearer than the first bullet in Frequentist inference: "The probability concerned is not the probability that the true value is in the particular interval that has been calculated, since at that stage both the true value and the calculated interval are fixed and are not random." I find it very interesting that this question turned out to be a cornerstone of different schools of how to draw conclusions from samples of data. 03:13, 8 April 2013 (UTC)
(In response to Mmitchell10's unindented comment:) This is not an issue of frequentist obstinacy (the Bayesian approach has no issue with credence in the value of µ) nor of the mire of Bayesian priors (how much did we really know about µ, anyway?): if we construct enough (say, 90%) confidence intervals from one population, we will eventually find that two of them are disjoint, at which point it is simply nonsense to say that, in any sense, there is a 90% probability of µ lying in each. --Tardis (talk) 14:23, 12 April 2013 (UTC)[reply]
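Tardis's disjointness argument is also easy to see numerically. The following sketch (with an arbitrary assumed population) builds many 90% intervals from independent samples of one population and looks for a pair with no overlap; µ plainly cannot lie in both members of such a pair, so the intervals cannot each "contain µ with probability 0.9".

    # Sketch: among many 90% intervals from one population, disjoint pairs turn up.
    # The population parameters are arbitrary assumptions.
    import random

    MU, SIGMA, N, Z90 = 50.0, 5.0, 25, 1.645
    intervals = []
    for _ in range(500):
        xbar = sum(random.gauss(MU, SIGMA) for _ in range(N)) / N
        h = Z90 * SIGMA / N ** 0.5
        intervals.append((xbar - h, xbar + h))

    disjoint = [(a, b) for a in intervals for b in intervals if a[1] < b[0]]
    print(f"{len(disjoint)} disjoint pairs among {len(intervals)} intervals")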