Hello:

I found the "hubers" function in MASS library is NOT working on the following 
data:

> a <- 
>c(7.19,7.19,7.19,9.41,6.79,9.41,7.19,9.41,1.64,7.19,7.19,7.19,7.19,1.422,7.19,6.79,7.19,6.79,7.19,7.19,4.44,6.55,6.79,7.19,9.41,9.41,7.19,7.19,7.19,7.19,1.64,1.597,1.64,7.19,1.422,7.19,6.79,9.38,7.19,1.64,7.19,7.19,7.19,7.19,7.19,1.64,7.19,6.79,6.79,1.649,1.64,7.19,1.597,1.64,6.55,7.19,7.19,1.64,7.19,7.19,1.407,1.672,1.672,7.19,6.55,7.19,7.19,9.41,1.407,7.19,7.19,9.41,7.19,9.41,7.19,7.19,7.19,7.19,7.19,7.19,7.19,7.19,7.19,9.41,7.19,6.79,7.19,6.79,1.64,1.422,7.19,7.19,1.67,1.64,1.64,1.64,1.64,1.787,7.19,7.19,7.19,6.79,7.19,7.19,7.19,7.19,7.19,7.19,7.19,7.19,7.19,1.64,1.64,1.64,1.422,9.41,1.64,7.19,7.19,7.19,7.19,7.19,7.19,7.19,6.79,6.79,7.19,6.79,7.19,7.19,1.407,7.19,4.42,9,1.64,1.64,6.79,1.664,1.664)
>

> library(MASS)
> hubers(a)
## NO response!

I think it is due to the infinite loop caused by the following line in the code 
of "hubers" (around Line 30):

if ((abs(mu0 - mu1) < tol * s0) && 
     abs(s0 - s1) < tol * s0)  break

where "s0" evaluates to ZERO initially (due to more than 50% of the number 
7.19). 


I propose to change the "<" sign to "<=":

if ((abs(mu0 - mu1) <= tol * s0) && 
     abs(s0 - s1) <= tol * s0)  break

which will break the infinite loop.    However, the new result is:

> hubers(a)
$mu
[1] 7.19

$s
[1] 0

which gives 0 standard deviation.  Actually the data does vary and it is not 
true all values other than 7.19 are outliers.   Take a look at:

> plot(a)

I think this is because we allow initial SD to equal to zero instead of a 
POSITIVE value.   See Line 15 of the "hubers" function:

    if (missing(s))
        s0 <- mad(y)

I propose setting "s0" to "mad(y)" or a small positive number, whichever is 
larger.  For example:

    if (missing(s))
        s0 <- max(mad(y), tol)

where tol=1e-6. With this change,  the result is more sensible:

> hubers(a)
$mu
[1] 5.88

$s
[1] 2.937

Could anyone take a look at this and decide if the above modifications to the 
"hubers" function are warranted?Thanks a lot! 


Sincerely, 
Feiming Chen

 Read more >>   Options >>   
        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to