On 29-07-2013, at 18:39, "iza.ch1" <iza....@op.pl> wrote:

> Hi everyone
>
> I have a problem with replacing the NA values with the mean of the column
> which contains them. If I replace Na with the means of the rest values in the
> column, the mean of the whole column will be still the same as if I would
> have omitted NA values. I have the following data
>
> de
>              [,1]        [,2]       [,3]
>  [1,]          NA -0.26928087 -0.1192078
>  [2,]          NA  1.20925752  0.9325334
>  [3,]          NA  0.38012008 -1.8927164
>  [4,]          NA -0.41778861  1.4330507
>  [5,]          NA -0.49677462  0.2892706
>  [6,]          NA -0.13248754  1.3976522
>  [7,]          NA -0.54179054  0.2295291
>  [8,]          NA  0.35788624 -0.5009389
>  [9,]  0.27500571 -0.41467591 -0.3426560
> [10,] -3.07568579 -0.59234248 -0.8439027
> [11,] -0.42240954  0.73642396 -0.4971999
> [12,] -0.26901731 -0.06768044 -1.6127122
> [13,]  0.01766284 -0.40321968 -0.6508823
> [14,] -0.80999580 -1.52283305  1.4729576
> [15,]  0.20805934  0.25974308 -1.6093478
> [16,]  0.03036708 -0.04013730  0.1686006
>
> and I wrote the code
> de[which(is.na(de))]<-sapply(seq_len(ncol(de)),function(i)
> {mean(de[,i],na.rm=TRUE)})
>
> I get as the result
>              [,1]        [,2]       [,3]
>  [1,] -0.50575168 -0.26928087 -0.1192078
>  [2,] -0.12222376  1.20925752  0.9325334
>  [3,] -0.13412312  0.38012008 -1.8927164
>  [4,] -0.50575168 -0.41778861  1.4330507
>  [5,] -0.12222376 -0.49677462  0.2892706
>  [6,] -0.13412312 -0.13248754  1.3976522
>  [7,] -0.50575168 -0.54179054  0.2295291
>  [8,] -0.12222376  0.35788624 -0.5009389
>  [9,]  0.27500571 -0.41467591 -0.3426560
> [10,] -3.07568579 -0.59234248 -0.8439027
> [11,] -0.42240954  0.73642396 -0.4971999
> [12,] -0.26901731 -0.06768044 -1.6127122
> [13,]  0.01766284 -0.40321968 -0.6508823
> [14,] -0.80999580 -1.52283305  1.4729576
> [15,]  0.20805934  0.25974308 -1.6093478
> [16,]  0.03036708 -0.04013730  0.1686006
>
> It has replaced the NA values in first column with mean of first column
> -0.505... and second cell with mean of second column etc.
> I want to have the result like this:
>              [,1]        [,2]       [,3]
>  [1,] -0.50575168 -0.26928087 -0.1192078
>  [2,] -0.50575168  1.20925752  0.9325334
>  [3,] -0.50575168  0.38012008 -1.8927164
>  [4,] -0.50575168 -0.41778861  1.4330507
>  [5,] -0.50575168 -0.49677462  0.2892706
>  [6,] -0.50575168 -0.13248754  1.3976522
>  [7,] -0.50575168 -0.54179054  0.2295291
>  [8,] -0.50575168  0.35788624 -0.5009389
>  [9,]  0.27500571 -0.41467591 -0.3426560
> [10,] -3.07568579 -0.59234248 -0.8439027
> [11,] -0.42240954  0.73642396 -0.4971999
> [12,] -0.26901731 -0.06768044 -1.6127122
> [13,]  0.01766284 -0.40321968 -0.6508823
> [14,] -0.80999580 -1.52283305  1.4729576
> [15,]  0.20805934  0.25974308 -1.6093478
> [16,]  0.03036708 -0.04013730  0.1686006
>
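
A possible explanation, offered as a sketch rather than as something stated in the thread: the right-hand side of the posted assignment holds only three values (one mean per column), while which(is.na(de)) selects eight cells, so R recycles the three means down the first column. The code below rebuilds de from the posted values and fills each NA with the mean of its own column via row/column indices. The names col_means, recycled, na_pos and filled are made up for illustration.

```r
# Rebuild the posted matrix column by column (values copied from the message).
de <- matrix(c(
  rep(NA, 8), 0.27500571, -3.07568579, -0.42240954, -0.26901731,
  0.01766284, -0.80999580, 0.20805934, 0.03036708,
  -0.26928087, 1.20925752, 0.38012008, -0.41778861, -0.49677462,
  -0.13248754, -0.54179054, 0.35788624, -0.41467591, -0.59234248,
  0.73642396, -0.06768044, -0.40321968, -1.52283305, 0.25974308,
  -0.04013730,
  -0.1192078, 0.9325334, -1.8927164, 1.4330507, 0.2892706, 1.3976522,
  0.2295291, -0.5009389, -0.3426560, -0.8439027, -0.4971999, -1.6127122,
  -0.6508823, 1.4729576, -1.6093478, 0.1686006
), ncol = 3)

# One mean per column, ignoring NAs.
col_means <- colMeans(de, na.rm = TRUE)

# What the posted assignment effectively used: the three means repeated
# in order until all eight NA cells were filled.
recycled <- rep_len(col_means, sum(is.na(de)))

# Row/column position of every NA; the "col" entry says which mean to use.
na_pos <- which(is.na(de), arr.ind = TRUE)

# Fill a copy so the original matrix stays available for comparison.
filled <- de
filled[na_pos] <- col_means[na_pos[, "col"]]
```

With this, colMeans(filled) should match colMeans(de, na.rm = TRUE), which is the property asked about in the question.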
or this:

apply(de,2, function(x) {x[which(is.na(x))] <- mean(x,na.rm=TRUE);x})

Berend

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.