It might work better to determine the top and bottom cutoffs for each column with quantile(), using an appropriate quantile option, and then process each variable "in place" with your ifelse logic, so that nothing needs to be sorted.
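A minimal sketch of that idea follows. The name winsorize_quantile, its arguments, and the example data frame dat are hypothetical, and the cutoffs returned by quantile() will generally not match the order-statistic cutoffs in your win() function, so check which `type` suits your definition.

```r
# Hypothetical helper: clamp a vector to its lower/upper quantiles
# without sorting, so every observation stays on its original row.
winsorize_quantile <- function(x, tr = 0.2, na.rm = FALSE, type = 7) {
  # Cutoffs come from quantile(); 'type' selects the quantile algorithm.
  # With na.rm = FALSE, quantile() stops with an error if x contains NAs.
  lims <- quantile(x, probs = c(tr, 1 - tr), na.rm = na.rm,
                   type = type, names = FALSE)
  # Replace values outside the cutoffs "in place"; NAs stay NA.
  x <- ifelse(x < lims[1], lims[1], x)
  x <- ifelse(x > lims[2], lims[2], x)
  x
}

# Example data (made up for illustration): ss is the observation id.
set.seed(1)
dat <- data.frame(ss = 1:5, var1 = rnorm(5), var2 = rnorm(5))

# Winsorize only the measurement columns and keep the id column as is.
vars <- c("var1", "var2")
dat_w <- dat
dat_w[vars] <- lapply(dat[vars], winsorize_quantile)
```

Assigning the lapply() result back into dat_w[vars] keeps the result a data frame, and the rows still line up because no column is ever reordered.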

I did find a somewhat different definition of winsorization, with no sorting, in this code copied from a Patrick Burns posting from earlier this year on R-SIG-Finance:

function(x, winsorize=5) {
           s <- mad(x) * winsorize
           top <- median(x) + s
           bot <- median(x) - s
           x[x > top] <- top
           x[x < bot] <- bot
           x
}
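As a rough illustration of how a function like this could be applied column by column, here is a sketch. The name winsorize_mad, the example data frame dat, and the choice of winsorize = 2 are assumptions for the example, not part of the original posting, and it assumes the columns contain no missing values.

```r
# Same logic as the function quoted above, given a hypothetical name.
winsorize_mad <- function(x, winsorize = 5) {
  s <- mad(x) * winsorize      # allowed distance from the median
  top <- median(x) + s
  bot <- median(x) - s
  x[x > top] <- top            # clamp large values
  x[x < bot] <- bot            # clamp small values
  x                            # order of observations is unchanged
}

# Example data (made up for illustration): ss is the observation id.
set.seed(42)
dat <- data.frame(ss = 1:5, var1 = rnorm(5), var2 = rnorm(5),
                  var3 = rnorm(5), var4 = rnorm(5))

# Apply to every column except the id, keeping the data frame structure.
vars <- setdiff(names(dat), "ss")
dat_w <- dat
dat_w[vars] <- lapply(dat[vars], winsorize_mad, winsorize = 2)

# The winsorized variables can then be correlated as usual.
cor(dat_w[vars])
```

Because the values are replaced by logical indexing rather than taken from a sorted copy, each observation stays on its original row.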

--
David Winsemius

On Jan 16, 2009, at 3:50 PM, Karl Healey wrote:

Hi All,

I want to take a matrix (or data frame) and winsorize each variable. So I can, for example, correlate the winsorized variables.

The code below will winsorize a single vector, but when applied to several vectors, each ends up sorted independently in ascending order so that a given observation is no longer on the same row for each vector.

So I need to winsorize each variable but then return it to its original order, or find another solution that takes a data frame, winsorizes each variable, and returns a new data frame with all the variables in their original order.

Thanks for any help!

-Karl


#The function I'm working from

win<-function(x,tr=.2,na.rm=F){

  if(na.rm)x<-x[!is.na(x)]
  y<-sort(x)
  n<-length(x)
  ibot<-floor(tr*n)+1
  itop<-length(x)-ibot+1
  xbot<-y[ibot]
  xtop<-y[itop]
  y<-ifelse(y<=xbot,xbot,y)
  y<-ifelse(y>=xtop,xtop,y)
  win<-y
  win
}

#Produces an example data frame; ss is the observation id, and var1-var4 are the variables I want to winsorize.

ss = c(1:5); var1 = rnorm(5); var2 = rnorm(5); var3 = rnorm(5); var4 = rnorm(5); as.data.frame(cbind(ss, var1, var2, var3, var4)) -> data
data

#Winsorizes each variable, but sorts them independently so the observations no longer line up.

sapply(data,win)


___________________________
M. Karl Healey
Ph.D. Student

Department of Psychology
University of Toronto
Sidney Smith Hall
100 St. George Street
Toronto, ON
M5S 3G3

k...@psych.utoronto.ca

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
