
Talk:Ratio estimator

From Wikipedia, the free encyclopedia

The first variance formula after ‘The variance of the sample ratio is approximately:’ cannot be correct: it fails a simple dimensional analysis. The term m_y in the last parentheses must have dimension y^2, not y. Possibly this is a typo for m_y^2, but I do not have the original source, so I cannot verify it. 2001:67C:1220:6096:22CF:30FF:FEBD:AEC2 (talk) 11:57, 20 June 2017 (UTC)

Even the first correction formula fails basic dimensional analysis. The second term has the units of y, but the ratio has the units of y/x, so the formula cannot be correct when x and y are not unitless. It was probably derived for unitless quantities, for instance counts with a Poisson distribution. Unfortunately, there is no citation, so it is hard to tell. 2001:67C:1220:6099:777E:4BCB:B4CA:6DEE (talk) 14:59, 20 March 2023 (UTC)
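The dimensional argument in the two comments above can be written out as a scaling check. This is only a sketch: the scale factors a and b and the symbol R for the population ratio are notation introduced here, not taken from the article, and r denotes the sample ratio of means.

```latex
\begin{align*}
x_i' &= a\,x_i, \quad y_i' = b\,y_i, \qquad a, b > 0,\\
r' &= \frac{\bar{y}'}{\bar{x}'} = \frac{b}{a}\, r,\\
\operatorname{E}(r') - R' &= \frac{b}{a}\,\bigl(\operatorname{E}(r) - R\bigr),\\
\operatorname{Var}(r') &= \frac{b^2}{a^2}\,\operatorname{Var}(r).
\end{align*}
```

Any bias-correction term therefore has to scale like b/a and any variance expression like (b/a)^2. One possible explanation for the mismatch, offered as a guess and not checked against a source: for a Poisson count the variance equals the mean, so a variance term may have been written as m_y.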

This article is very poorly written, and no assumptions are stated to justify the following claims:

  • "θy is known to be asymptotically normally distributed." - an asymptotic claim requires some parameter going to infinity, and none is stated. Is the normality a result of the Central Limit Theorem? If so, an independence assumption must be made, as well as a finite-variance assumption.
  • "E(x*1/y) = E(x)*E(1/y)" - this requires independence of x and y, which is never stated.

--65.209.72.194 (talk) 15:02, 25 July 2014 (UTC)
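The second point can be checked with a tiny discrete example. This is a minimal sketch, assuming a hypothetical two-point distribution in which x is completely dependent on y (x = y); the variable names are illustrative only and do not come from the article.

```r
# Hypothetical joint distribution: y takes the values 1 and 2 with equal
# probability, and x = y (complete dependence).
y <- c(1, 2)
p <- c(0.5, 0.5)
x <- y

lhs <- sum(p * x / y)            # E(x * 1/y)
rhs <- sum(p * x) * sum(p / y)   # E(x) * E(1/y)

cat(sprintf("E(x/y) = %.3f, E(x)E(1/y) = %.3f\n", lhs, rhs))
```

Here E(x/y) is 1 while E(x)*E(1/y) is 1.125, so the factorisation fails without independence.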

I bet the description of Lahiri's method in this article is wrong. I don't know Lahiri's method, but I'm guessing it's just the Midzuno-Sen method using rejection sampling. If that's correct, I would move the description of Midzuno-Sen before the description of Lahiri and replace the description of the Lahiri method with a brief statement that it's Midzuno-Sen using rejection sampling to pick the first item. 2620:0:1003:1019:24E6:C515:AC70:A1BB (talk) 18:30, 3 September 2015 (UTC)

I have corrected the application of Lahiri's method and fixed poor citations of a couple of other references. The description of Lahiri's method is based on the textbook by Lohr, which is cited. Incidentally, Lahiri's method is not limited to ratio estimators but is a general sampling technique.

empirical_bayesian@ieee.org

This user is a member of WikiProject Statistics.

19:42, 1 July 2016 (UTC)


I indicated that the Lahiri estimator is biased and recommended that the Midzuno-Sen technique be used exclusively. See code below.

# Lahiri algorithm, own implementation. Jan Galkowski.
# empirical_bayesian@ieee.org, 3rd July 2016
# Last changed 3rd July 2016

is.natural<- function(x)
{
  x<- (0 < x) & (x == floor(x))
  return(x)
}

lahiri.sampling<- function(x, n, per=10)
{
  stopifnot(is.natural(per))
  stopifnot(all(is.natural(x)))
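  # Upper bound for the rejection step. max(x) suffices; sum(x) is also a
  # valid bound but rejects far more draws.
  M<- max(x)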
  stopifnot( is.natural(n) )
  N<- length(x)
  y.i<- rep(NA,n)
  y<- rep(NA,n)
  for (k in (1:n))
  {
    j<- sample(N, 1)
    z<- sample(M, 1)
    while( z > x[j] )
    {
      j<- sample(N, 1)
      z<- sample(M,1)
    }
    y.i[k]<- j
    y[k]<- x[j]
    if (0 == k%%per)
    {
      cat(sprintf("Lahiri sampling: Did %.0f\n", k))
    }
  }
  return(list(indices=y.i, sizes=y))
}

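lahiri.Midzuno.Sen.sampling<- function(x, n)
{
  # Called this by Särndal, Swensson, and Wretman
  stopifnot(all(is.natural(x)))
  stopifnot( is.natural(n) )
  N<- length(x)
  stopifnot( n <= N )
  y.i<- rep(NA,n)
  p<- x/sum(x)
  # First unit: drawn with probability proportional to size.
  y.i[1]<- sample.int(N, 1, prob=p)
  # Remaining n-1 units: simple random sample without replacement from
  # the other N-1 units (the original drew N-1 units, returning all N).
  if (n > 1)
  {
    rest<- (1:N)[-y.i[1]]
    y.i[2:n]<- rest[sample.int(N-1, n-1)]
  }
  y<- x[y.i]
  return(list(indices=y.i, sizes=y))
}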

# Test.

# Generate a sample from a Gamma distribution with shape 2 and scale 10,
# meaning it has a mean of 20, and make sure it consists of positive
# integers.

X<- ceiling(rgamma(10000, shape=2, scale=10))

# Empirical mean and median:

cat(sprintf("Mean[X]: %.3f, Median[X]: %.3f\n", mean(X), median(X)))

# Lahiri (runs for a while):

L<- lahiri.sampling(X, 100, per=20)


# Lahiri-Midzuno-Sen:

LMS<- lahiri.Midzuno.Sen.sampling(X, 100)

cat(sprintf("Lahiri Mean[X]: %.3f, Lahiri Median[X]: %.3f\n", mean(L$sizes), median(L$sizes)))

cat(sprintf("Lahiri-Midzuno-Sen Mean[X]: %.3f, Lahiri-Midzuno-Sen Median[X]: %.3f\n", mean(LMS$sizes), median(LMS$sizes)))

empirical_bayesian@ieee.org

This user is a member of WikiProject Statistics.

15:39, 3 July 2016 (UTC)
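When reading the output of the test above, it may help to know what a probability-proportional-to-size draw is expected to return. If unit j is selected with probability x[j]/sum(x), the expected size of a drawn unit is sum(x^2)/sum(x), which exceeds mean(x) whenever the sizes vary. A minimal sketch with a hypothetical four-unit population follows; the helper name size.biased.mean and the vector x.small are introduced here and are not part of the code above.

```r
size.biased.mean <- function(x)
{
  # Expected size of one unit drawn with probability x[j] / sum(x).
  sum(x^2) / sum(x)
}

# Hypothetical population of four sizes.
x.small <- c(1, 2, 3, 4)

cat(sprintf("Mean: %.3f, size-biased mean: %.3f\n",
            mean(x.small), size.biased.mean(x.small)))
```

For a continuous Gamma distribution with shape 2 and scale 10 the same ratio is (200 + 400)/20 = 30. A mean of the Lahiri sizes near 30 rather than 20 would therefore reflect the selection probabilities and would not by itself show a fault in the sampler. Whether an estimator built on the sample is biased depends on how the draws are weighted.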

Ogliore's formulas


Ratio Estimation in SIMS Analysis, by R. C. Ogliore, G. R. Huss, and K. Nagashima, deals with Poisson distributions, and its variables are unitless counts. All formulas taken from the paper hold only under these assumptions. It is evident that the formulas are *not* invariant under a change of units and cannot be used for quantities that are not unitless counts. Please stop reverting edits that try to note the actual assumptions. 46.39.166.145 (talk) 08:22, 20 January 2022 (UTC)