Talk:Theil–Sen estimator

From Wikipedia, the free encyclopedia

tau

Quote: "As Sen observed, this estimator is the value that makes the Kendall tau rank correlation coefficient comparing the sample data values y_i with their estimated values mx_i + b become approximately zero."

Really? Then the method gives an estimate (mx_i + b) that is completely uncorrelated with the variable being estimated (y_i)? Olaf (talk) 00:49, 27 April 2014 (UTC)

No, it means that roughly half the y_i are greater than the corresponding mx_i + b, and roughly half are less. Deltahedron (talk) 19:49, 27 April 2014 (UTC)
No, it is not the median error that is supposed to equal zero, as it would under your interpretation; it is Kendall's tau rank correlation. Counterexample: if y_i = x_i, then the estimator is mx_i + b = 1x_i + 0 = x_i = y_i, and thus the tau correlation between the estimator mx_i + b and the original value y_i equals one, not zero. Olaf (talk) 20:07, 27 April 2014 (UTC)
That's not a particularly good counterexample, since the number of concordant pairs and the number of discordant pairs are both zero, and hence tau = 0. Deltahedron (talk) 20:11, 27 April 2014 (UTC)
Let's check: y_1 = 1, y_2 = 2, y_3 = 3.
Estimates: Y_1 = 1, Y_2 = 2, Y_3 = 3.
Concordant pairs:
y_1 < y_2 and Y_1 < Y_2
y_1 < y_3 and Y_1 < Y_3
y_2 < y_3 and Y_2 < Y_3
Tied pairs: none.
Discordant pairs: none.
Tau = 1.
In the absence of tied ranks, the tau correlation has the same property as Pearson's correlation: tau(A, A) = 1. There are no tied ranks when a_i ≠ a_j for all i ≠ j.
Olaf (talk) 20:23, 27 April 2014 (UTC)
No, it's the residuals that are all equal and hence uncorrelated. Deltahedron (talk) 20:37, 27 April 2014 (UTC)
Yes, but the article said it was the estimated values, not their residuals. Now it's fixed ([1]). Thank you for the references. Olaf (talk) 20:43, 27 April 2014 (UTC)
However, what's important is what independent reliable sources say. Searching for "Theil Sen" "Kendall tau" in Google Books gave me [2], [3], [4], which support the assertion in the text (unlike the reference to Rousseeuw & Leroy (2003), pp. 67, 164, which did not). Deltahedron (talk) 20:19, 27 April 2014 (UTC)
OK, so it is the tau correlation between the estimation error and the x value that equals zero, not the one between the estimator and the estimated value (per the second reference)! Olaf (talk) 20:26, 27 April 2014 (UTC)
Thanks for clearing this up. --David Eppstein (talk) 22:36, 27 April 2014 (UTC)
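
For anyone who wants to check the two readings discussed in this thread numerically, here is a minimal sketch. It assumes the tau-a convention (tied pairs contribute zero to the numerator and all pairs are counted in the denominator) and takes the intercept as the median of y_i - mx_i; the function names theil_sen and kendall_tau are made up for this illustration and do not come from the article or any library. The second data set is an arbitrary toy example.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Slope = median of pairwise slopes; intercept = median of y_i - m*x_i."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2)
              if x[i] != x[j]]
    m = median(slopes)
    b = median(yi - m * xi for xi, yi in zip(x, y))
    return m, b


def kendall_tau(a, b):
    """Kendall tau-a: (concordant - discordant) / (number of pairs).

    Assumption: tied pairs simply count as zero in the numerator.
    """
    score = 0
    pairs = 0
    for i, j in combinations(range(len(a)), 2):
        prod = (a[j] - a[i]) * (b[j] - b[i])
        score += (prod > 0) - (prod < 0)
        pairs += 1
    return score / pairs


# Olaf's example: y_i = x_i, so the fit is exact.
x = [1, 2, 3]
y = [1, 2, 3]
m, b = theil_sen(x, y)
fitted = [m * xi + b for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
tau_fitted = kendall_tau(y, fitted)    # tau between y_i and m*x_i + b
tau_resid = kendall_tau(x, residuals)  # tau between x_i and the residuals

# A small non-collinear toy data set (hypothetical values).
x2 = [1, 2, 3, 4]
y2 = [1, 3, 2, 6]
m2, b2 = theil_sen(x2, y2)
fitted2 = [m2 * xi + b2 for xi in x2]
resid2 = [yi - fi for yi, fi in zip(y2, fitted2)]

print(tau_fitted, tau_resid)
print(kendall_tau(y2, fitted2), kendall_tau(x2, resid2))
```

In Olaf's example the residuals are all tied, so under this convention the residual tau is zero while the tau between y_i and the fitted values is one, which matches both sides of the exchange above. In the toy data set, half of the pairwise slopes lie above the median slope and half below, so the residual tau is zero there as well.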

Bias

The statement on unbiasedness,

The Theil–Sen estimator is an unbiased estimator of the true slope in simple linear regression

is unfounded. The corresponding source explicitly states that Sen's claim to that effect is incorrect. It should be removed. Muhali (talk) 08:38, 14 February 2017 (UTC)

I just dug a little deeper. Their counterexample is built on asymmetric noise, which is somewhat rare, so maybe we should just keep the statement as it is now. Muhali (talk) 09:04, 14 February 2017 (UTC)

Accuracy of the estimated slope

The description seems to be of a kind of percentile bootstrap, but as far as I can see, it is incorrect. The procedure described here would yield a 95% interval for the sampled slopes, not (as it should) for their median. A reference for the described procedure is missing. Maybe someone has a good reference for a sound way of doing this? (I don't have one handy now.) --Han691 (talk) 17:23, 19 August 2019 (UTC)
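
The distinction raised here can be illustrated with a rough sketch. It contrasts (a) the middle 95% of the pairwise slopes themselves with (b) a percentile bootstrap in which the data points are resampled and the median slope is recomputed each time. The simulated data, the seed, the number of resamples, and the helper names pairwise_slopes and percentile_interval are all assumptions made for this illustration; the percentile rule is a crude nearest-rank one, and nothing here is taken from the article or from a cited source.

```python
import random
from itertools import combinations
from statistics import median


def pairwise_slopes(x, y):
    """Slopes between all pairs of points with distinct x values."""
    return [(y[j] - y[i]) / (x[j] - x[i])
            for i, j in combinations(range(len(x)), 2)
            if x[i] != x[j]]


def percentile_interval(values, level=0.95):
    """Crude nearest-rank interval covering the middle `level` of values."""
    s = sorted(values)
    lo = s[int((1 - level) / 2 * (len(s) - 1))]
    hi = s[int((1 + level) / 2 * (len(s) - 1))]
    return lo, hi


# Hypothetical data: true slope 2, Gaussian noise with unit standard deviation.
rng = random.Random(0)
n = 30
x = list(range(n))
y = [2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

slopes = pairwise_slopes(x, y)
estimate = median(slopes)  # Theil-Sen slope estimate

# (a) Spread of the pairwise slopes themselves.
spread_interval = percentile_interval(slopes)

# (b) Percentile bootstrap of the median slope: resample points with
# replacement and recompute the median of the pairwise slopes each time.
boot_estimates = []
for _ in range(300):
    idx = [rng.randrange(n) for _ in range(n)]
    bx = [x[i] for i in idx]
    by = [y[i] for i in idx]
    boot_estimates.append(median(pairwise_slopes(bx, by)))
median_interval = percentile_interval(boot_estimates)

print("estimate:", estimate)
print("middle 95% of pairwise slopes:", spread_interval)
print("bootstrap interval for the median slope:", median_interval)
```

On data like this, interval (a) comes out much wider than interval (b), because it describes how individual pairwise slopes scatter rather than how uncertain their median is. Whether (b) is the procedure the article should describe still depends on finding a reliable source for it.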