According to the help file for 'outlier' (quoting):

x a data sample, vector in most cases. If argument is a dataframe, then outlier is
calculated for each column by sapply. The same behavior is applied by apply
when the matrix is given.  (endquote)

It looks like you could build an "upper triangular" matrix, padded with NA, such as

1       1       1
NA      2       2
NA      NA      3

and pass it to 'outlier' to see the per-column results. However, since 'outlier' just returns the value furthest from the mean, this doesn't really provide much information. If I were to write a function to find "genuine" outliers, I would do something like

x[abs(x - mean(x)) > 3 * sd(x)]

which returns all values more than 3 sigma from the mean.
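
To make that concrete, here is a minimal sketch of the 3-sigma idea wrapped in a function and applied to the example from the question. The name sigma_outliers and its argument k are my own inventions for illustration (they are not part of the outliers package), and the seed is arbitrary.

```r
# Hypothetical helper (not part of the outliers package): return every value
# lying more than k standard deviations from the sample mean.
sigma_outliers <- function(x, k = 3) {
  x <- x[!is.na(x)]                  # drop missing values before computing
  x[abs(x - mean(x)) > k * sd(x)]    # keep only the values beyond k sigma
}

set.seed(1)                          # arbitrary seed, for reproducibility only
x <- c(1:3, rnorm(1000, mean = 100000, sd = 300))
flagged <- sigma_outliers(x)
print(flagged)                       # [1] 1 2 3
```

Bear in mind that the mean and sd are themselves pulled around by the outliers, so with small samples or many extreme values this rule can miss points; replacing mean() and sd() with median() and mad() is one common way to make it less sensitive.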



<quote>

I would like to find data points that should at least be checked one more time before I process them further. I've had a look at the outliers package for this, and the outlier function in that package, but it appears to return only one value.

An example:

> outlier(c(1:3,rnorm(1000,mean=100000,sd=300)))
[1] 1

I think at least 1, 2 and 3 should be checked in this case.

Any ideas on how to achieve this in R?

Actually, the real data I will be investigating consist of vector norms and angles (in an attempt to identify either very short vectors, very long vectors, or vectors pointing at an odd angle for the category to which they have been assigned), so a 2D method would be even better.

I would much appreciate any help I can get on this,


--

Sent from my Cray XK6
"Pendeo-navem mei anguillae plena est."

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
