Wikipedia:Reference desk/Archives/Mathematics/2010 April 8

Mathematics desk
Welcome to the Wikipedia Mathematics Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


April 8

Biased dice

I have a probability question.
Firstly, imagine a biased die is rolled 10 times and the results are as follows: 2, 4, 2, 1, 4, 4, 3, 4, 3, 3

One could estimate the probability of each particular value being rolled simply by dividing the number of occurrences of that value by the total number of times the die was rolled. I believe this is known as a binomial distribution. However, my understanding is that a binomial distribution requires that the probability of each outcome is independent of the trial's position.

I would like to know if there is a way of estimating the probability of each value being rolled in a particular trial if the probability is dependent on the position of the trial. Also, I would like to know if there is a way of estimating the probability of each value being rolled if the probability is dependent on the value rolled in the previous trial.--Alphador (talk) 09:21, 8 April 2010 (UTC)
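As a minimal sketch of the counting estimate described in the question (plain Python; the rolls are the sample sequence above, and faces that never appear simply get an estimate of 0):

    from collections import Counter

    rolls = [2, 4, 2, 1, 4, 4, 3, 4, 3, 3]  # the sample sequence from the question
    counts = Counter(rolls)

    # Empirical estimate: occurrences of each face divided by the total number of rolls.
    estimates = {face: counts.get(face, 0) / len(rolls) for face in range(1, 7)}
    print(estimates)  # {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4, 5: 0.0, 6: 0.0}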

The likelihood function of the six unknown probabilities is a Dirichlet distribution. Bo Jacoby (talk) 10:06, 8 April 2010 (UTC).
Huh??? A likelihood function is not a distribution at all. And the Dirichlet distribution has nothing to do with this likelihood function. The crude estimator given by the original poster is indeed the maximum likelihood estimate.
The original poster is confused as to the meaning of the term binomial distribution. The binomial distribution is the probability distribution of the number of successes in a fixed number of independent Bernoulli trials. Michael Hardy (talk) 19:54, 8 April 2010 (UTC)
The likelihood function is transformed into a Bayesian probability distribution by normalizing with trivially constant prior probabilities. Bo Jacoby (talk) 17:37, 9 April 2010 (UTC).
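For reference, these are the standard formulas behind the two claims above: the binomial distribution gives the probability of k successes in n independent Bernoulli trials with success probability p, and maximizing the multinomial likelihood of the observed face counts yields exactly the crude counting estimate.

    \Pr(K = k) = \binom{n}{k} p^{k} (1 - p)^{n-k}, \qquad k = 0, 1, \ldots, n

    L(p_1, \ldots, p_6) \propto \prod_{i=1}^{6} p_i^{\,n_i}, \qquad \hat{p}_i = \frac{n_i}{n}, \qquad n = \sum_{i=1}^{6} n_i

where n_i is the number of times face i was observed.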
For the second question, the same technique works even without the assumption of independence, but you may need more trials to get an accurate answer.
If you want more information about the system, you can build a 6x6 table, describing the number of times the roll was i when the previous roll was j. Then you can estimate the probability of every result given the previous result. If you then want to find the proportion of each result in a long run, this is basically the stationary distribution of a Markov chain.
The first question is a bit vague, and can range from virtually impossible to trivial. Can you clarify which kind of dependence you allow? -- Meni Rosenfeld (talk) 10:21, 8 April 2010 (UTC)
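A minimal sketch of the transition-count idea above, assuming the rolls form a time-homogeneous Markov chain (Python with NumPy; the data is the hypothetical ten-roll sample from the question, and rows for faces that never occur are simply left at zero):

    import numpy as np

    rolls = [2, 4, 2, 1, 4, 4, 3, 4, 3, 3]  # hypothetical data, reused from the question

    # 6x6 table of counts: counts[j-1, i-1] = number of times roll i followed roll j.
    counts = np.zeros((6, 6))
    for prev, cur in zip(rolls, rolls[1:]):
        counts[prev - 1, cur - 1] += 1

    # Row-normalize to estimate P(next roll = i | previous roll = j).
    row_sums = counts.sum(axis=1, keepdims=True)
    P = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

    # Long-run proportions: approximate the stationary distribution by iterating
    # the chain from a uniform start and renormalizing (unseen faces get no mass).
    pi = np.full(6, 1.0 / 6)
    for _ in range(1000):
        pi = pi @ P
        total = pi.sum()
        if total > 0:
            pi /= total
    print(P)
    print(pi)

With only ten rolls the estimates are of course extremely noisy; the point is only the mechanics of estimating the transition matrix and its long-run proportions.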
For example, suppose the value of the die could be n, n+1, or n-1, where n is the position of the trial, and each of those three possibilities has an equal probability. However, you do not know this. Instead, you are only given a sample sequence, e.g. 2, 2, 3, 3, 6, 5, 7, 9. From that, determine the probability distribution, or an approximation thereof, for the nth term.--Alphador (talk) 11:17, 8 April 2010 (UTC)
Is it possible to conduct multiple experiments, so that in the first experiment you get 2,2,3,3,6,5,7,9 and in the second experiment you get 1,3,2,5,4,7,6,8? If so, you can do many experiments and record the results for each position. Otherwise it becomes more difficult, and you'll have to impose strong assumptions about the simplicity of the rule. For example, if you assume that each roll is evenly distributed between an+b and an+c, you can estimate a, b and c. It can be more relaxed - you can assume any rule is possible, but more complicated rules are less likely - but it probably won't give good results if you don't have any additional information. -- Meni Rosenfeld (talk) 12:08, 8 April 2010 (UTC)
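A minimal sketch of the repeated-experiments idea (Python; the two sequences are the hypothetical examples above, and in practice one would need many more repetitions for the per-position frequencies to mean anything):

    from collections import Counter

    # Hypothetical repeated experiments, each a sequence of rolls (from the examples above).
    experiments = [
        [2, 2, 3, 3, 6, 5, 7, 9],
        [1, 3, 2, 5, 4, 7, 6, 8],
    ]

    n_positions = len(experiments[0])
    for pos in range(n_positions):
        values = [run[pos] for run in experiments]
        counts = Counter(values)
        # Empirical distribution of the value observed at this position.
        dist = {v: c / len(values) for v, c in sorted(counts.items())}
        print(f"position {pos + 1}: {dist}")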
Maybe you are asking how to estimate the parameters of a Hidden Markov model (HMM). If you can describe the actual application you are thinking of, that might help get better answers. 66.127.52.47 (talk) 17:54, 8 April 2010 (UTC)

Vacuous truth

I attended a lecture on geometric probability by Gian-Carlo Rota at the Joint Mathematical Meetings in 1998. He read verbatim from prepared notes, later published here. On page 15, we read this:

I will engage for a minute in the kind of mathematical reasoning that physicists find unbearably pedantic just to show physicists that such reasoning does pay off. Let us ask ourselves the question: what is the value of the symmetric function of order zero of a set of n variables x1, x2, ... , xn, say e0(x1, x2, ... , xn)? I will give you the answer and will leave it to you to justify this answer after the lecture is over. The answer is, e0 = 1 if n > 0 (i.e., if the set of variables x1, x2, ... , xn is nonempty), and e0 = 0 if the set of variables is empty.

And then he shows how this leads us into the theory of the Euler characteristic!

I know one other sexy example, from applied statistics: the noncentral chi-square distribution with zero degrees of freedom is non-degenerate (IIRC it concentrates some probability at 0 and is otherwise continuous). I should dig out the details; I haven't looked at this in a while.

Are there other good examples of really substantial far-reaching consequences of such vacuities? Michael Hardy (talk) 20:09, 8 April 2010 (UTC)

The Dirac delta function was widely used in science and engineering despite not being a legitimate mathematical function at all. Trying to mathematically justify its use led to the development of distribution theory. Does that help? 66.127.52.47 (talk) 00:50, 13 April 2010 (UTC)
I don't see that that's an example of this sort of thing. Michael Hardy (talk) 03:35, 13 April 2010 (UTC)