User talk:Kiefer.Wolfowitz/Archive 14

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.


Copyright Violations, Plagiarism, etc.

Disinfopedia

I've looked at a few of those articles, and they seem ok - and then I noticed they were created on 20 August 2004, so hopefully any major problems will have been fixed since. I'll look at the rest as I get the chance, though. Black Kite (t) (c) 00:37, 31 July 2011 (UTC)

Hi BK!

Thanks for the quick reply.

My quick look at the 3 articles indicated that, in each case, the bulk of the present article had been input by the energetic IP that day. However, others' additions would make the quick-deletion nomination more of a headache. (I noted my BLP/RS/NPOV concerns, earlier.)

The spirit is willing to investigate more articles, but the flesh is indeed weak.

Cheers, Kiefer.Wolfowitz 00:43, 31 July 2011 (UTC)

I cleaned up the present version of Thomas L. Rhodes. Another editor had cut & pasted copyrighted material from National Review Online. (That IP's two copyright violations had been cleaned up.) I suppose that the history must be cleaned up. Kiefer.Wolfowitz 02:11, 31 July 2011 (UTC)

For future reference, the IP made about 15 more similar uploads of articles from Disinfopedia on August 20, according to his edit summaries.

{{db-g12|url=sourceurl}} {{tl|close paraphrasing}} Kiefer.Wolfowitz 04:22, 31 July 2011 (UTC)

I tagged the problem ones as best I could. Three additional copyright violations have been deleted, bringing the total to 5, and two other articles are flagged as having copyright problems (with otherwise salvageable text).

As you had wished (above), the WP community did fix most of the articles. Kiefer.Wolfowitz 06:31, 31 July 2011 (UTC)

Committee on the Present Danger

I don't think the situation at Committee on the Present Danger is clear.

While the bulk of the current text matches what appears at this site, note the CC 3.0 license.

Perhaps you did, and were arguing that it is plagiarized, rather than a copyvio.

However, much of the material in the WP article was there in 2005, so it may be that the linked site copied the WP site. Whether the original WP material is valid is not yet clear.

I'm going to add the template that hides the content, and send this off to the copyright experts, unless you think I'm missing something.--SPhilbrick T 00:24, 2 August 2011 (UTC)

Please use the link I supplied, which is to the 1989 page, which has (in this case) been updated. The 1989 page was plagiarized/copied. And it still furnishes the bulk of the article. Kiefer.Wolfowitz 00:48, 2 August 2011 (UTC)

It was the original: " Posted: January 06, 1989" and "Updated: 7/89".

Thanks for your quick response. Kiefer.Wolfowitz 00:49, 2 August 2011 (UTC)

I copied this discussion to the CopyVio noticeboard, where the conversation started. Sphilbrick is correct that the CC 3.0 license allows copying, but only if appropriate credit is given. In these cases, it seems that the credit has not been given, and so at minimum extensive re-writing (inserting quotation marks, footnotes, etc.) is needed to bring the articles into compliance. Kiefer.Wolfowitz 02:29, 2 August 2011 (UTC)

The Signpost: 01 August 2011

News and notes: Wikimania; why Board of Trustees elections attract few votes; brief news

In the news: Consensus of Wikipedia authors questioned about Shakespeare authorship; 10 biggest edit wars on Wikipedia; brief news

Research interview: The Huggle Experiment: interview with the research team

WikiProject report: Little Project, Big Heart — WikiProject Croatia

Featured content: Featured pictures is back in town

Arbitration report: Proposed decision submitted for one case

Technology report: Developers descend on Haifa; wikitech-l discussions; brief news

Read this Signpost in full · Single-page · Unsubscribe · EdwardsBot (talk) 01:11, 2 August 2011 (UTC)

Carleman

Hi,

you have recently edited the article on Torsten Carleman, reducing the part devoted to alcohol abuse. I feel it can be further reduced (in favour of expanding the scientific part), especially the part about antisemitic remarks. The citation (of Feller) says: "... partly because C. is of the touching opinion that one should execute all Jews and immigrants (which, however, he only tells his assistant after consuming a nonnegative amount of alcohol)". This is the only source I found on the internet mentioning antisemitic remarks by C., and even this could apply to a single episode (also note that Feller mentions one assistant, whereas the article uses the plural). In particular, there is no evidence of any antisemitic actions on his part.

In my opinion, a footnote with the exact citation (as in the article on Feller) would be more appropriate. (For comparison, the well-known antisemitic activity of a dozen or more Soviet mathematicians with a firm "record" -- e.g. Sergey Stechkin -- is not mentioned anywhere in the articles.)

What do you think?

Thanks,

Sasha (talk) 22:17, 4 August 2011 (UTC)

Hi! Please edit it as you see fit. I have not looked at the recent books on Swedish far-right and Nazi organisations during the 1930s and 1940s, to check for Carleman. Kiefer.Wolfowitz 18:18, 5 August 2011 (UTC)

Please have a look (at the revised version). I have not looked at these books either -- do you know a precise ref.? -- but honestly, I find it hard to believe he was a member of a Nazi organisation (and if he had been, this would probably have been well known by now).

Best,

Sasha (talk) 00:25, 6 August 2011 (UTC)

Hi Sasha!

According to the oral history of Swedish probabilists and statisticians, anti-semitic Carleman blocked William Feller from a professorship because he was Jewish. Someday, somebody will look at the archives of Stockholm/Uppsala Universities, Carleman, Feller, and Cramér and write a scandalous history.

I did not say that he was a Nazi. There were many professors who were members of nationalist organizations, some of which had fascist ties, in the 1930s. It would be useful to look at the books that name names.

Best regards, Kiefer.Wolfowitz 07:16, 6 August 2011 (UTC)

Hi again! Your edits look fine. The quote from Feller says enough, and I am glad that you left in the article. Cheers, Kiefer.Wolfowitz 08:58, 6 August 2011 (UTC)

About anti-semitism back in the U.S.S.R., you may wish to consult Freiman's "It Seems That I Am a Jew", which names names. It contains an appendix signed by 10-20 of the most renowned emigrants affirming that the anti-semitism described was real and further naming names. Kiefer.Wolfowitz 09:01, 6 August 2011 (UTC)

Thanks! I did not know this story (about Feller) -- indeed, his letter hints at this, but I have never heard it elsewhere.

There are indeed lots of refs about Soviet antisemitism (e.g. "You Failed Your Math Test, Comrade Einstein: Adventures and Misadventures of Young Mathematicians, Or Test Your Skills in Almost Recreational Mathematics", which is a story about Bella Subbotovskaya and her environment). Are there any books about antisemitism or fascist sympathies in the Swedish universities in the 1930s and 1940s (is that what you proposed to check)?

Best regards,

Sasha (talk) 15:02, 6 August 2011 (UTC)

Hi again, Sasha!

I looked at one book of Swedish Nazi/Extreme-Nationalist members some years ago, and saw that a dear friend's great grandfather was listed (as I had feared, given the inherited books on phrenology and racial types in the family), and didn't want to look more. Sweden and in fact Uppsala University had a world-leading center for racial science, and eugenic sterilization was practiced until 1970 or so---more on persons with disabilities---but apparently somewhat on nonconformists, or Samis/Laps or Gypsies. I get depressed already by the contemporary Anti-Imperialism of Fools and the "classic" Socialism of Fools, which has never gone out of style in Europe, that I don't have the energy to dig around the past. There is enough to do with trying e.g., to reduce Jew-baiting on Wikipedia.

You might ask at the WikiProject Sweden for help, though. Or at the talk page for the Carleman article (or better for something about extreme right-wing politics in Sweden in the 1930s ...) on Swedish Wikipedia. My guess is that the Swedish Universities of the 1930s were not as cheerfully liberal, cosmopolitan, and welcoming as the brandy-sifting British ruling class portrayed in Remains of the Day. I think that all the political parties committed anti-semitic and nationalistic sins in the 1930s.

About discrimination in the 1970s in the USSR, there were good articles in the Mathematical Intelligencer and Notices of the AMS, also. Boris Polyak made a good and appropriately brief acknowledgment of "shameful" history in his lecture on "Optimization in the U.S.S.R", which I cited on Kantorovich's article (for its other virtues); it cites a few good sources, briefly.

Sincerely and striving to avoid misanthropy, Kiefer.Wolfowitz 17:32, 6 August 2011 (UTC)

Hi again, and thanks for the interesting conversation.

Returning to wiki articles about antisemitic activity of mathematicians: my modest opinion is that this subject is unworthy of excessive wikification. As to Carleman, time will "pardon him for writing well" (and proving well).

Best regards,

Sasha (talk) 19:10, 6 August 2011 (UTC)

You are right. I think it's okay to have the one quotation from Feller, for now, unless somebody writes an article on Carleman. Carleman was a saint compared to Bieberbach, et alia .... Kiefer.Wolfowitz 19:17, 6 August 2011 (UTC)

last post here, but fyi

Closed discussion

hello. Not sure why you're angry. Not sure why you deleted my question. I won't post here ever again, but this is just to let you know that if you thought I was somehow... posting insincerely in order to bother you or something, you were mistaken. Cheers OneLeafKnowsAutumn (talk) 02:14, 10 August 2011 (UTC)

Hi "One Leaf Knows Autumn"!

First, I want to apologize for deleting your posting, which I now happily restore:

china forex purchases devalue reniminbi

China buys foreign debt in order to keep its currency cheap... how exactly does that work? What's the connect? OneLeafKnowsAutumn 12:27, 9 August 2011 (UTC)

Reply, continued

Let me explain my dumb deletion:

I have poor eye-sight and ignorance, and reniminbi looked a bit like the SPAM I get at my e-mail.
I have been pressed for time professionally. At Wikipedia I have been dealing with personal attacks and complaints (some justified) for days, from a tag-team group of antagonists (one of whom has a batting average of good decisions perhaps better than mine, although he was in a weekend slump).

This is a question for a real economist, who should refer you to a good reference. Try the WikiProjects on Economics or Business, or better the help desk (because the projects are for discussing articles).

May I ask how you located me? Not that I am fishing for compliments ... but as I mentioned the ratio of compliments to "Kiefer.W is an abusive asshole" statements has dipped lately ...

P.S. Your user-name may be the most evocative and beautiful on WP.

Kiefer.Wolfowitz 03:00, 10 August 2011 (UTC)

Dear Sir, thank you for your kind response. One leaf knows autumn (一葉知秋) is a chéngyǔ that people say to describe any sign foretelling things to come. I think I saw your name on some economics article somewhere here on Wikipedia. As for people calling you an abusive posterior orifice, I find that ignoring such people tends to make them lose interest and fade away. OneLeafKnowsAutumn (talk) 07:24, 10 August 2011 (UTC)

Dear 一葉知秋,

Your very interesting note, which provides a delightfully Bayesian euphemism for my crudity, also makes me wonder whether my user page may have been displaying a chéngyǔ:

Avoiding conflicts of interest: Tan Bu De Sheng （贪不得胜）—Do not be greedy!

:-)

I should have remembered another Go proverb this week:

Shi Gu Qu He （势孤取和） - Look for peace, avoid fighting in an isolated or weak situation

It seems the classical Wei-Chi proverbs are all chéngyǔ!

Best regards, Kiefer.Wolfowitz 08:09, 10 August 2011 (UTC)

Bayesian vs. Frequentist vs. Likelihood

Closed discussion

As a person with only a working knowledge of frequentist statistics, I find the internet to be a bit scarce when it comes to good (and comprehensible) reference articles that contrast these three sub-disciplines. Since you are a statistician and like to add content in here, maybe that is something you can contribute to along with your hated enemies from Bayesian and Likelihood schools. --Bobthefish2 (talk) 21:52, 11 August 2011 (UTC)

The sectarian mind, needing to pigeon-hole statisticians in 2-4 categories and to view the neighboring pigeon-holes as wicked, is loathsome and would be pitiable were it not so prevalent, in some areas, alas.

David Cox is a wonderful person, often associated with "likelihood" approaches, and he kindly described himself as an "ardent Bayesian" and "fervent Neyman-Pearsonian", to JRSS/D "The Statistician", after a former student of his (Jim Lindsey) was less than prudent in his enthusiasms (around 2000 if my memory is correct).

We should do the same. I would rather read Laplace, Peirce, Fisher, Neyman, Kolmogorov, Cox, even when they use Bayesian methods, than I would frequentist Johnny-come-latelies in the latest version of JASA.

It is much more useful to judge statisticians, like religious persons, by the content of their character, by their honest and helpful words, and the fruit of their labors, rather than by the label of their "school", particularly "schools" that were already embarrassing 80 years ago. Kiefer.Wolfowitz 22:13, 11 August 2011 (UTC)

My specialty is not statistics so whatever I know about this field is mostly from rumours :). My supervisor (who's actually a mathematician along with many other things) used to tell me that big white-bearded statisticians would fight passionately over these things.

But to be honest, I really don't get what's the big deal about all this. I remember reading ages ago about a coin-flipping example and that a Frequentist would predict the p by maximizing the likelihood and a Bayesian would predict the p by maximizing the posterior with a Dirichlet prior (supposedly because it is convenient to use. wtf?). I'd assume Frequentists would also maximize the posterior if the prior distribution is also known, but I am not really sure.

Anyway, if you feel this divide is irrelevant, maybe you should write an article to tell people why it is irrelevant. --Bobthefish2 (talk) 22:56, 11 August 2011 (UTC)

I did describe what every intermediate course is supposed to mention, and it was reverted as nonsense. So I wrote the following:

Statistics, since 1950

A decision-theoretic justification of the use of Bayesian inference was given by Abraham Wald,^{[citation needed]} who proved that every Bayesian procedure is admissible.^{[citation needed]} Conversely, every admissible statistical procedure is either a Bayesian procedure or a limit of Bayesian procedures.^[1]

Wald characterized admissible procedures as Bayesian procedures (and limits of Bayesian procedures), making the Bayesian formalism a central technique in such areas of frequentist inference as point estimation, hypothesis testing, and computing confidence intervals.^[2] For example:

"Under some conditions, all admissible procedures are either Bayes procedures or limits of Bayes procedures (in various senses). These remarkable results, at least in their original form, are due essentially to Wald. They are useful because the property of being Bayes is easier to analyze than admissibility."^[1]

"In decision theory, a quite general method for proving admissibility consists in exhibiting a procedure as a unique Bayes solution."^[3]

"In the first chapters of this work, prior distributions with finite support and the corresponding Bayes procedures were used to establish some of the main theorems relating to the comparison of experiments. Bayes procedures with respect to more general prior distributions have played a very important [part] in the development of statistics, including its asymptotic theory." "There are many problems where a glance at posterior distributions, for suitable priors, yields immediately interesting information. Also, this technique can hardly be avoided in sequential analysis."^[4]

"A useful fact is that any Bayes decision rule obtained by taking a proper prior over the whole parameter space must be admissible"^[5]
"An important area of investigation in the development of admissibility ideas has been that of conventional sampling-theory procedures, and many interesting results have been obtained."^[6]

^[1] Bickel & Doksum (2001, page 32)
^[2]
- Kiefer, J. and Schwartz, R. (1965). "Admissible Bayes character of T²-, R²-, and other fully invariant tests for multivariate normal problems". Annals of Mathematical Statistics. 36: 747–770. doi:10.1214/aoms/1177700051.
- Schwartz, R. (1969). "Invariant proper Bayes tests for exponential families". Annals of Mathematical Statistics. 40: 270–283. doi:10.1214/aoms/1177697822.
- Hwang, J. T. and Casella, George (1982). "Minimax confidence sets for the mean of a multivariate normal distribution". Annals of Statistics. 10: 868–881. doi:10.1214/aos/1176345877.
^[3] Lehmann, Erich (1986). Testing Statistical Hypotheses (Second ed.). (See page 309 of Chapter 6.7 "Admissibility", and pages 17–18 of Chapter 1.8 "Complete Classes".)
^[4] Le Cam, Lucien (1986). Asymptotic Methods in Statistical Decision Theory. Springer-Verlag. ISBN 0387963073. (From "Chapter 12 Posterior Distributions and Bayes Solutions", page 324.)
^[5] Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. Chapman and Hall. ISBN 0041215370. Page 432.
^[6] Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. Chapman and Hall. ISBN 0041215370. Page 433.
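For readers unfamiliar with the terms in the quotations above, here is a hedged summary of the standard decision-theoretic definitions. The notation (parameter θ, data X, decision rule δ, loss L, prior π) is assumed for illustration and is not taken from the cited texts.

```latex
% Assumed notation: \theta \in \Theta (parameter), X (data),
% \delta (decision rule), L (loss), \pi (prior on \Theta).
\begin{align*}
R(\theta,\delta) &= \mathbb{E}_\theta\, L\bigl(\theta,\delta(X)\bigr)
  && \text{(risk function)}\\
r(\pi,\delta) &= \int_\Theta R(\theta,\delta)\,d\pi(\theta)
  && \text{(Bayes risk under } \pi\text{)}
\end{align*}
% \delta is inadmissible if some \delta' satisfies
%   R(\theta,\delta') \le R(\theta,\delta) for all \theta,
%   with strict inequality for at least one \theta.
% \delta_\pi is a Bayes rule for \pi if
%   r(\pi,\delta_\pi) = \inf_{\delta} r(\pi,\delta).
```

The "unique Bayes solution" argument quoted from Lehmann then runs roughly as follows: if δ' dominated a Bayes rule δ_π, integrating the risk inequality against π would give r(π, δ') ≤ r(π, δ_π), so δ' would also be Bayes, contradicting uniqueness (uniqueness being understood up to equality of risk functions).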

Likelihood heuristics are not frequentist

The "Likelihood" school has all the problems of Bayesian inference and few of its virtues.

Finding a zero of the derivative of the likelihood "function"

(and the likelihood "function" is usually not defined as a function unless a Bayesian approach is used)

is preferable to "maximum" likelihood estimation in many cases, and certainly for a general asymptotic theory, and it

is only a heuristic, with only asymptotic virtues (inferior to the asymptotic theory of maximum posterior estimation, as explained by Ferguson).

P.S. A quote from the end of the article on likelihood function:

"In many writings by Charles Sanders Peirce, model-based inference is distinguished from statistical procedures based on objective randomization. Peirce's preference for randomization-based inference is discussed in "Illustrations of the Logic of Science" (1877–1878) and "A Theory of Probable Inference" (1883)".^{[citation needed]}

"probabilities that are strictly objective and at the same time very great, although they can never be absolutely conclusive, ought nevertheless to influence our preference for one hypothesis over another; but slight probabilities, even if objective, are not worth consideration; and merely subjective likelihoods should be disregarded altogether. For they are merely expressions of our preconceived notions" (7.227 in his Collected Papers^{[citation needed]}).

"But experience must be our chart in economical navigation; and experience shows that likelihoods are treacherous guides. Nothing has caused so much waste of time and means, in all sorts of researchers, as inquirers' becoming so wedded to certain likelihoods as to forget all the other factors of the economy of research; so that, unless it be very solidly grounded, likelihood is far better disregarded, or nearly so; and even when it seems solidly grounded, it should be proceeded upon with a cautious tread, with an eye to other considerations, and recollection of the disasters caused." (Essential Peirce^{[citation needed]}, volume 2, pages 108–109)"

The neo-Fisherian "method" of "testing hypotheses" on the data generating them was labeled the most dangerous fallacy of induction by Peirce. (Maximum-likelihood estimation was the most popular fallacy!)^[1] Reasoning and the Logic of Things (RLT) (The 1898 Lectures in Cambridge, MA)

^[1] Pages 194-196 in Peirce, Charles Sanders, Reasoning and the Logic of Things, The Cambridge Conference Lectures of 1898, Kenneth Laine Ketner, ed., intro., and Hilary Putnam, intro., commentary, Harvard, 1992, 312 pages, hardcover (ISBN 978-0674749665, ISBN 0674749669), softcover (ISBN 978-0-674-74967-2, ISBN 0-674-74967-7).

This looks like a very very long read (and I have not even learned there is a "Fisherian" and "Pearsonian" school of thought). Do you think the main differences can be illustrated with a simple defining example? --Bobthefish2 (talk) 23:31, 11 August 2011 (UTC)

Example

Consider a coin toss. We agree to wait for Jimbo Wales at a Wikimania convention, and we ask one of his disciples for a penny, because it will cure us from our lack of "the sum of all human knowledge".

We wish to use this penny in the future, to break the Methodist Book of Discipline by gambling, so we want to know how fair it is.

A Bayesian could tell you that he thinks its probability is 50% without doing any computations.

The frequentist could give you median/mean-unbiased estimators if you allow him to flip the coin (as could a confused and cowardly Bayesian who is afraid to be called "subjective" and so mumbles "likelihood").

Suppose we flip the coin once. It is heads. The frequentist statistician gives you a mean unbiased estimate of 1, as would the likelihood enthusiast. The Bayesian tells you that his estimate is pretty close to 0.5 still, but now is greater than 0.5.

I ask you, if you cared about using this estimate in practice, which of these estimates would you use? A Bayesian statistician could give you a true probability distribution on the parameter space, the interval [0,1], which you could use to do simulations.

We could do more flipping, and we would find that after 30 throws, there wouldn't be much difference between them. Asymptotically, all of the estimators will agree, but the Bayesian estimator (say median posterior) is robust and useful for small sample-sizes and can be used honestly by practitioners wanting to do simulations (and unwilling to pick a single number for their parameters).
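A minimal sketch of the coin example above, under stated assumptions: the Beta(10, 10) prior, the function names, and the 18-heads-in-30 count are all hypothetical choices for illustration, not anything specified in this discussion. The sketch uses the posterior mean rather than the posterior median mentioned above, only because the mean has a closed form.

```python
from fractions import Fraction


def mle(heads, flips):
    """Maximum-likelihood estimate of P(heads); for the binomial it is
    also the mean-unbiased estimate heads / flips."""
    return Fraction(heads, flips)


def posterior_mean(heads, flips, a=10, b=10):
    """Posterior mean of P(heads) under an assumed Beta(a, b) prior.

    The Beta prior is conjugate to the binomial, so the posterior is
    Beta(a + heads, b + flips - heads), whose mean is
    (a + heads) / (a + b + flips)."""
    return Fraction(a + heads, a + b + flips)


# One flip, one head: the sampling-based estimate jumps to 1,
# while the Bayesian estimate moves only slightly above 1/2.
print(mle(1, 1))             # 1
print(posterior_mean(1, 1))  # 11/21

# Thirty flips with 18 heads (illustrative data): the two estimates
# are now much closer together.
print(float(mle(18, 30)), float(posterior_mean(18, 30)))
```

A weaker prior (smaller a and b) would move the Bayesian estimate toward the sample frequency faster; the choice of prior strength is exactly the kind of judgment the thread is arguing about.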

The likelihood approach and the Bayesian approach rely on probability models, which are always wrong (apart from electron emissions, etc.), and which almost always are so bad that nobody bothers updating the posterior when more data arrives. Usually, scientists just improve the measurements by improved experimental technique; the jaw-boning about n goes to infinity is just irrelevant to scientific practice (as Peirce noted long ago).

It is better to use design-based inference, using the randomization specified in the sampling/experimental design, than to put up a parametric model, if possible. If inference relies on a model, warning labels should be attached, imho.

The posterior median and median-unbiased estimators are invariant under reparameterization. An ML estimator (if defined with some initialization) is invariant in a somewhat weaker sense. The mean-unbiased estimator is not invariant under reparameterization. Kiefer.Wolfowitz 23:54, 11 August 2011 (UTC)
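The invariance remarks above can be checked on the coin itself. The following worked example is a sketch; the symbols (X heads in n flips, success probability p, an increasing reparameterization g, posterior median m) are assumptions introduced for illustration.

```latex
% Assumed setup: X \sim \mathrm{Binomial}(n,p), estimator \hat p = X/n.
\begin{align*}
\mathbb{E}_p\!\left[\frac{X}{n}\right] &= p
  && \text{(mean-unbiased for } p\text{)}\\
\mathbb{E}_p\!\left[\Bigl(\frac{X}{n}\Bigr)^{2}\right]
  &= p^{2} + \frac{p(1-p)}{n} \;\neq\; p^{2}
  && \text{(so } \hat p^{\,2} \text{ is biased for } p^{2}\text{)}
\end{align*}
% Posterior median: if m satisfies P(p \le m \mid x) = 1/2 and g is
% strictly increasing, then
\begin{equation*}
P\bigl(g(p) \le g(m) \mid x\bigr) = P(p \le m \mid x) = \tfrac12 ,
\end{equation*}
% so g(m) is the posterior median of g(p).
```

The second line uses Var(X/n) = p(1-p)/n. The bias vanishes as n grows, which is consistent with the remark that the estimators agree asymptotically.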

Is it typical for a Bayesian to say the probability is 0.5 without any observation? While a uniform prior is popular, I get the impression that Bayesians simply like to make guesses that Frequentists don't like to make. What's the core difference between the Likelihood people and the Bayesians then?

By the way, an example of this form can be re-posted on one of the stats pages. I think it will be a pretty popular read since a lot of us non-statisticians are quite curious about these things. :) --Bobthefish2 (talk) 01:18, 12 August 2011 (UTC)

Hi Bob,

Thanks for the compliment and smiley face. It could be copied, but I would prefer not. Even here, we are getting away from discussing improving articles and leaning towards a help page and discussion forum.

I assume that some form of symmetry would be standard for most practitioners. A professional magician and statistician like Persi Diaconis probably has an extremely sophisticated prior; I believe he specified one in an article at least once. It was a mixture of three distributions, if I recall.

Likelihood people assume that

the antecedent likelihood doesn't matter, which is a crazy assumption usually, which also means they don't get a probability distribution as a result, so their results are not directly applicable for predictive inference/decisions.
They also have a curious belief that 30=infinity (where "30" is whatever sample size they have in the "data at hand"), a delusion that justifies using "asymptotic" results. ;) Kolmogorov, who presumably knew something about logic and probability, noted the inapplicability of limiting results in his article ("Tables of Random Numbers").

I hope you won't report me to ANI/AN for noting the relevance of paraconsistent logic to such "statistical theory"!

I hope you know I'm being a bit tough on the likelihood enthusiasts. (See my previous endorsement of David Cox, etc.) Their 30=infinity equation must be based on some (usually implicit) assumption that the sampling distribution for 30 is a good approximation to the limiting distribution. But this is a big assumption, particularly for the enormous models we're seeing more and more, based on more and more arbitrary assumptions, in the "likelihood approach"! Kiefer.Wolfowitz 01:36, 12 August 2011 (UTC)
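One way to see how much rides on the "30=infinity" assumption is to compute, exactly, how often a standard asymptotic interval covers the true parameter at n = 30. The sketch below uses the Wald interval for a binomial proportion as the asymptotic procedure; that choice, the function name, and the values of p tried are illustrative assumptions, not something proposed in this thread.

```python
from math import comb, sqrt


def wald_coverage(p, n=30, z=1.96):
    """Exact probability that the nominal-95% Wald interval
    p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n) contains the true p,
    computed by summing the Binomial(n, p) probabilities of every
    outcome x whose interval covers p (no simulation involved)."""
    covered = 0.0
    for x in range(n + 1):
        p_hat = x / n
        half_width = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - half_width <= p <= p_hat + half_width:
            covered += comb(n, x) * p**x * (1 - p) ** (n - x)
    return covered


# Coverage is near the nominal level for a fair coin but collapses
# when p is close to 0, even though n = 30 in both cases.
for p in (0.5, 0.1, 0.02):
    print(p, round(wald_coverage(p), 3))
```

With x = 0 the Wald interval degenerates to the single point 0, so for small p a large share of samples cannot cover the truth at all; how good the limiting approximation is depends on the parameter, not only on n.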

I wouldn't say that's "help page" material since these distinctions should be a pretty important part of statistics that deserve to be documented.

I did manage to take a look at the likelihood page. At first glance, it seemed ridiculous, but after looking at the example, I'd say it's pretty neat. The Bayesians would probably be foaming at the mouth because there is a missing Dirichlet prior. :)

By the way, is the 30=infinity assumption really a Likelihood thing? I thought I've seen it used quite commonly. --Bobthefish2 (talk) 03:42, 12 August 2011 (UTC)

The likelihood principle begins with a big "if", that the model be correct. In non-paraconsistent logic, beginning with a falsehood immediately causes trouble, so the likelihood principle has nothing to say about practice. It is invoked by people who love their tools, the MLE and LRT, and build their theory to justify their tools.

Many Bayesian "theorists" like the likelihood principle, because Bayesian procedures automatically satisfy it, but again, this is the tool wagging the theory.

Because the MLE has no optimal finite-sample properties, in general, the Fisherian cult needs to invest in asymptotic theory, and so you have asymptotic theory courses in every statistics department---again, the tools wagging the theory and the science. Many of these courses and graduate programs are scientifically deadly: Le Cam's theory and van der Vaart's book should only be read by mathematicians; I've seen economists have their minds ruined by a semester with van der Vaart! ;)

It would be better if statisticians would replace "asymptotic" by "scientifically irrelevant" for a few decades, to reduce the damage to scientific practice of "doubly robust" procedures, etc.

Don't take this too seriously! ;)

Cheers, Kiefer.Wolfowitz 04:01, 12 August 2011 (UTC)

Don't worry. I am only treating this as pleasure reading. Most of the statistics I see in scientific literature (of my field and adjacent subject areas) do not get beyond the first 2 undergraduate statistics courses. I agree that delving too deeply into certain scientific schools of thought can be harmful to one's perspective. We also have these kinds of phenomena over here in molecular biology, especially in areas dealing with evolution and structures (complicated story). --Bobthefish2 (talk) 04:35, 12 August 2011 (UTC)

Evolutionary genetic biology is a funny place for statisticians! Too bad Samuel Karlin isn't around to yell at biologists anymore! Kiefer.Wolfowitz 04:38, 12 August 2011 (UTC)

Nonparametrics

This book is a good overview of how much statistics can be done, and done very well, without any parametric model:

Hettmansperger, T. P.; McKean, J. W. (1998). Robust nonparametric statistical methods. Kendall's Library of Statistics. Vol. 5 (First ed.). London: Edward Arnold. pp. xiv+467. ISBN 0-340-54937-8, 0-471-19479-4. MR 1604954.

Such methods don't give probability models that can be used for predictive inference and decisions, though. Kiefer.Wolfowitz 00:15, 12 August 2011 (UTC)

Statistics since 1950

You should beware of any survey of inference (most unfortunately) which doesn't deal with the following concepts (about which I write in "Statistical inference"). Like any good mathematical theory, it has links to concepts in mathematics and related mathematical sciences (communication theory, computer science, physics, etc.), it improves our understanding of previous results, and raises new questions. Kiefer.Wolfowitz 00:08, 12 August 2011 (UTC)

Information and computational complexity

Other forms of statistical inference have been developed from ideas in information theory^[1] and the theory of Kolmogorov complexity.^[2] For example, the minimum description length (MDL) principle selects statistical models that maximally compress the data; inference proceeds without assuming counterfactual or non-falsifiable 'data-generating mechanisms' or probability models for the data, as might be done in frequentist or Bayesian approaches.

However, if a 'data generating mechanism' does exist in reality, then according to Shannon's source coding theorem it provides the MDL description of the data, on average and asymptotically.^[3] In minimizing description length (or descriptive complexity), MDL estimation is similar to maximum likelihood estimation and maximum a posteriori estimation (using maximum-entropy Bayesian priors). However, MDL avoids assuming that the underlying probability model is known; the MDL principle can also be applied without assumptions that e.g. the data arose from independent sampling.^[3]^[4] The MDL principle has been applied in communication-coding theory in information theory, in linear regression, and in time-series analysis (particularly for choosing the degrees of the polynomials in autoregressive moving average (ARMA) models).^[4]
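As a rough illustration of model selection by description length, the sketch below compares an order-0 (independent) and an order-1 (Markov) model for a binary string using a crude two-part code: the bits needed for the data under the fitted model, plus (k/2) log2 n bits for k fitted parameters. This simple penalty, the function names, and the test strings are assumptions made for illustration; the refined codes discussed by Rissanen and by Hansen and Yu are more elaborate.

```python
from collections import Counter
from math import log2


def data_bits(counts):
    """Bits needed to encode symbols with these counts using their own
    maximum-likelihood frequencies (zero counts contribute nothing)."""
    total = sum(counts)
    return -sum(c * log2(c / total) for c in counts if c > 0)


def description_length(bits, order):
    """Crude two-part code length of a '0'/'1' string under a Markov
    model of order 0 or 1: data cost plus (k/2) * log2(n) for the
    k fitted parameters."""
    n = len(bits)
    if order == 0:
        cost = data_bits(list(Counter(bits).values()))
        k = 1  # one parameter: P(1)
    else:
        pairs = Counter(zip(bits, bits[1:]))
        cost = 1.0  # first symbol sent with a fair-coin code
        for prev in "01":
            cost += data_bits([pairs[(prev, nxt)] for nxt in "01"])
        k = 2  # P(1 | previous = 0) and P(1 | previous = 1)
    return cost + 0.5 * k * log2(n)


for label, s in (("alternating", "01" * 50), ("paired", "0011" * 25)):
    best = min((0, 1), key=lambda order: description_length(s, order))
    print(label, "-> order", best)
```

On the strictly alternating string the order-1 model compresses far better and is selected; on the "0011" pattern the first-order transitions look like fair coin flips, so the extra parameter does not pay for itself and order 0 is selected. Neither choice requires assuming that the data really were generated by independent sampling.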

Information-theoretic statistical inference has been popular in data mining, which has become a common approach for very large observational and heterogeneous datasets made possible by the computer revolution and internet.^[2]

The evaluation of statistical inferential procedures often uses techniques or criteria from computational complexity theory or numerical analysis.^[5]^[6]

^[1] Soofi (2000)
^[2] Hansen & Yu (2001)
^[3] Hansen & Yu (2001), page 747
^[4] Rissanen (1989), page 84
^[5] Traub, Wasilkowski, and Wozniakowski (1988) ^{[page needed]}
^[6] Judin and Nemirovski.

Freedman, David A. (2009). Statistical models: Theory and practice (revised ed.). Cambridge University Press. pp. xiv+442. ISBN 978-0-521-74385-3. MR 2489600.
Hansen, Mark H.; Yu, Bin (2001). "Model Selection and the Principle of Minimum Description Length: Review paper". Journal of the American Statistical Association. 96 (454): 746–774. doi:10.1198/016214501753168398. JSTOR 2670311. MR 1939352.
Rissanen, Jorma (1989). Stochastic Complexity in Statistical Inquiry. Series in computer science. Vol. 15. Singapore: World Scientific. ISBN 9971508591. MR 1082556.
Soofi, Ehsan S. (2000). "Principal Information-Theoretic Approaches (Vignettes for the Year 2000: Theory and Methods, ed. by George Casella)". Journal of the American Statistical Association. 95 (452): 1349–1353. JSTOR 2669786. MR 1825292.
Traub, Joseph F.; Wasilkowski, G. W.; Wozniakowski, H. (1988). Information-Based Complexity. Academic Press. ISBN 0126975450.

Interesting. I didn't actually realize MAP estimation is necessarily a Bayesian approach. I thought the frequentist school of thought does allow a prior to be used if its distribution can be sampled? --Bobthefish2 (talk) 01:33, 12 August 2011 (UTC)

It is usually a bad idea to speak of "frequentist" methods; "sampling-distribution based" is the better term.

De Finetti defined probability distributions so that they could be verified by observing sequences of experiments and falsified by observing finite sequences of experiments, so again some Bayesian ideas are frequentist (more precisely than so-called "frequentist" statistics).

Oscar Kempthorne helped to invent MAP (or was it a variant of conjugate gradient methods---senility strikes me?). He was hard core on randomization, but he acknowledged the place of Bayesian statistics in predictive inference and decisions. Kiefer.Wolfowitz 01:44, 12 August 2011 (UTC)

[Bickel_&_Doksum_2001,_page_32-1] Bickel & Doksum (2001, page 32)

[2] Kiefer, J. and Schwartz, R. (1965). "Admissible Bayes character of T²-, R²-, and other fully invariant tests for multivariate normal problems". Annals of Mathematical Statistics. 36: 747–770. doi:10.1214/aoms/1177700051.
Schwartz, R. (1969). "Invariant proper Bayes tests for exponential families". Annals of Mathematical Statistics. 40: 270–283. doi:10.1214/aoms/1177697822.

Hwang, J. T. and Casella, George (1982). "Minimax confidence sets for the mean of a multivariate normal distribution". Annals of Statistics. 10: 868–881. doi:10.1214/aos/1176345877.

[3] Schwartz, R. (1969). "Invariant proper Bayes tests for exponential families". Annals of Mathematical Statistics. 40: 270–283. doi:10.1214/aoms/1177697822.

[4] Hwang, J. T. and Casella, George (1982). "Minimax confidence sets for the mean of a multivariate normal distribution". Annals of Statistics. 10: 868–881. doi:10.1214/aos/1176345877.

[3] Lehmann, Erich (1986). Testing Statistical Hypotheses (Second ed.). (See page 309 of Chapter 6.7 "Admissibility", and pages 17–18 of Chapter 1.8 "Complete Classes".)

[4] Le Cam, Lucien (1986). Asymptotic Methods in Statistical Decision Theory. Springer-Verlag. ISBN 0387963073. (From "Chapter 12 Posterior Distributions and Bayes Solutions", page 324)

[5] Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. Chapman and Hall. ISBN 0041215370. Page 432.

[6] Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. Chapman and Hall. ISBN 0041215370. Page 433.

[7] Pages 194-196 in
Peirce, Charles Sanders, Reasoning and the Logic of Things, The Cambridge Conference Lectures of 1898, Kenneth Laine Ketner, ed., intro., and Hilary Putnam, intro., commentary, Harvard, 1992, 312 pages, hardcover (ISBN 978-0674749665, ISBN 0674749669), softcover (ISBN 978-0-674-74967-2, ISBN 0-674-74967-7) HUP catalog page.

[10] Peirce, Charles Sanders, Reasoning and the Logic of Things, The Cambridge Conference Lectures of 1898, Kenneth Laine Ketner, ed., intro., and Hilary Putnam, intro., commentary, Harvard, 1992, 312 pages, hardcover (ISBN 978-0674749665, ISBN 0674749669), softcover (ISBN 978-0-674-74967-2, ISBN 0-674-74967-7) HUP catalog page.

