User:Quantile50/sandbox

Example

Test for differences in ozone levels by month

The following example uses data from Chambers et al.[1] on daily readings of ozone for May 1 to September 30, 1973, in New York City. The data are in the R data set airquality, and the analysis is included in the documentation for the R function kruskal.test. Boxplots of ozone values by month are shown in the figure.
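The boxplots can be reproduced along the following lines. This is a minimal sketch: the axis labels (including the ppb unit, taken from the airquality help page) and the variable name bp are choices made here for illustration, not part of the original analysis.

```r
# Draw side-by-side boxplots of ozone by month; missing Ozone values are ignored
bp <- boxplot(Ozone ~ Month, data = airquality,
              xlab = "Month", ylab = "Ozone (ppb)")

# Number of non-missing ozone readings contributing to each box
setNames(bp$n, bp$names)
```

The per-month counts are worth checking because many days have no ozone reading, so the groups are of unequal size.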

The Kruskal-Wallis test is significant (H = 29.267, df = 4, p = 6.901e-06), indicating that the distribution of ozone differs in at least one of the 5 months.

kruskal.test(Ozone ~ Month, data = airquality)

	Kruskal-Wallis rank sum test

data:  Ozone by Month
Kruskal-Wallis chi-squared = 29.267, df = 4, p-value = 6.901e-06
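To show where the reported statistic comes from, the following sketch recomputes it from the ranks using the usual tie-corrected formula and compares it with a chi-squared distribution on k - 1 degrees of freedom. The variable names (aq, h_raw, h, and so on) are illustrative assumptions; kruskal.test remains the reference implementation.

```r
# Keep only days with an ozone reading, as the formula interface does
aq <- airquality[!is.na(airquality$Ozone), ]

r <- rank(aq$Ozone)                        # midranks are assigned to tied values
n <- length(r)                             # total number of observations
rank_sums   <- tapply(r, aq$Month, sum)    # sum of ranks within each month
group_sizes <- tapply(r, aq$Month, length) # observations within each month

# Uncorrected statistic: 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
h_raw <- 12 / (n * (n + 1)) * sum(rank_sums^2 / group_sizes) - 3 * (n + 1)

# Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N), t = size of each tie group
ties <- table(aq$Ozone)
h <- h_raw / (1 - sum(ties^3 - ties) / (n^3 - n))

# Upper-tail chi-squared p-value with (number of groups - 1) degrees of freedom
p <- pchisq(h, df = length(rank_sums) - 1, lower.tail = FALSE)
c(statistic = h, p.value = p)
```

The values of h and p should agree with the statistic and p-value printed by kruskal.test above.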

To determine which months differ, post-hoc tests may be performed using a Wilcoxon rank-sum test for each pair of months, with a Bonferroni (or other) correction for multiple hypothesis testing.

pairwise.wilcox.test(airquality$Ozone, airquality$Month, p.adjust.method = "bonferroni")


	Pairwise comparisons using Wilcoxon rank sum test 

data:  airquality$Ozone and airquality$Month 

  5      6      7      8     
6 1.0000 -      -      -     
7 0.0003 0.1414 -      -     
8 0.0012 0.2591 1.0000 -     
9 1.0000 1.0000 0.0074 0.0325

P value adjustment method: bonferroni
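The Bonferroni adjustment itself is simple: each unadjusted p-value is multiplied by the number of comparisons (10 for 5 months) and capped at 1. The sketch below illustrates this by requesting unadjusted p-values and applying the correction by hand; the names raw, n_tests and adj are illustrative, and suppressWarnings is used only to hide the warnings about ties that wilcox.test may emit.

```r
# Unadjusted pairwise Wilcoxon rank-sum p-values (lower triangle; the rest is NA)
raw <- suppressWarnings(
  pairwise.wilcox.test(airquality$Ozone, airquality$Month,
                       p.adjust.method = "none")
)$p.value

# Number of pairwise comparisons actually performed
n_tests <- sum(!is.na(raw))

# Bonferroni: multiply each p-value by the number of tests, capping at 1
adj <- pmin(raw * n_tests, 1)
round(adj, 4)
```

The resulting matrix should match the Bonferroni-adjusted table shown above.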

The post-hoc tests indicate that, after Bonferroni correction for multiple testing, the following differences are significant (adjusted p < 0.05):

  • Month 5 (May) vs Months 7 (July) and 8 (August)
  • Month 9 (September) vs Months 7 (July) and 8 (August)
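The significant pairs can also be read off the result programmatically rather than by eye. This is a minimal sketch; the names res, sig and sig_pairs and the 0.05 threshold variable alpha are illustrative assumptions.

```r
# Bonferroni-adjusted pairwise tests (warnings about ties are hidden)
res <- suppressWarnings(
  pairwise.wilcox.test(airquality$Ozone, airquality$Month,
                       p.adjust.method = "bonferroni")
)

alpha <- 0.05
pm <- res$p.value

# Row/column positions of adjusted p-values below alpha (NA entries are skipped)
sig <- which(pm < alpha, arr.ind = TRUE)

# Translate positions back into month labels
sig_pairs <- data.frame(
  month_a = colnames(pm)[sig[, "col"]],
  month_b = rownames(pm)[sig[, "row"]],
  p_adj   = pm[sig]
)
sig_pairs
```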

1. John M. Chambers, William S. Cleveland, Beat Kleiner, and Paul A. Tukey (1983). Graphical Methods for Data Analysis. Belmont, Calif.: Wadsworth International Group, Duxbury Press. ISBN 053498052X.