On Jan 27, 2011, at 7:16 PM, Ernest Adrogué i Calveras wrote:

Hi,

I have this data.frame with two variables in it,

z
 V1 V2
1 10  8
2 NA 18
3  9  7
4  3 NA
5 NA 10
6 11 12
7 13  9
8 12 11

and a vector of means,

means <- apply(z, 2, function (col) mean(na.omit(col)))
means
      V1        V2
9.666667 10.714286

Two methods:

A) use sweep (whose default FUN is "-", so it takes the difference)

> sweep(z, 2, means)
          V1         V2
1  0.3333333 -2.7142857
2         NA  7.2857143
3 -0.6666667 -3.7142857
4 -6.6666667         NA
5         NA -0.7142857
6  1.3333333  1.2857143
7  3.3333333 -1.7142857
8  2.3333333  0.2857143


B) use the scale function (whose "whole purpose in life" is to subtract the mean and possibly divide by the standard deviation, which we suppressed in this case with the scale=FALSE argument)

> scale(z, scale=FALSE)
          V1         V2
1  0.3333333 -2.7142857
2         NA  7.2857143
3 -0.6666667 -3.7142857
4 -6.6666667         NA
5         NA -0.7142857
6  1.3333333  1.2857143
7  3.3333333 -1.7142857
8  2.3333333  0.2857143
attr(,"scaled:center")
       V1        V2
 9.666667 10.714286
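
For anyone who wants to check that the two methods agree, here is a minimal sketch. It assumes z can be rebuilt by hand from the table in the question, and it uses colMeans(z, na.rm = TRUE) as a shorter stand-in for the apply() call; the names a, b and same are only illustrative.

```r
# Rebuild the data frame from the values shown in the question
z <- data.frame(V1 = c(10, NA, 9, 3, NA, 11, 13, 12),
                V2 = c(8, 18, 7, NA, 10, 12, 9, 11))

# Column means ignoring NAs (stand-in for the apply() call in the question)
means <- colMeans(z, na.rm = TRUE)

# Method A: sweep() subtracts the statistic along margin 2 (columns)
a <- sweep(z, 2, means)

# Method B: scale() centers each column; scale = FALSE skips the division
b <- scale(z, scale = FALSE)

# sweep() returns a data.frame and scale() a matrix with extra attributes,
# so compare the bare values only
same <- all.equal(as.vector(as.matrix(a)), as.vector(b))
print(same)
```

The main practical difference is the return type: sweep() hands back a data.frame here, while scale() returns a matrix that carries the centers in its "scaled:center" attribute.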

--
David.


My intention was subtracting means from z, so instinctively I tried

z-means
         V1         V2
1  0.3333333 -1.6666667
2         NA  7.2857143
3 -0.6666667 -2.6666667
4 -7.7142857         NA
5         NA  0.3333333
6  0.2857143  1.2857143
7  3.3333333 -0.6666667
8  1.2857143  0.2857143

But this is completely wrong. sapply() gives the same result:

sapply(z, function(row) row - means)
            V1         V2
[1,]  0.3333333 -1.6666667
[2,]         NA  7.2857143
[3,] -0.6666667 -2.6666667
[4,] -7.7142857         NA
[5,]         NA  0.3333333
[6,]  0.2857143  1.2857143
[7,]  3.3333333 -0.6666667
[8,]  1.2857143  0.2857143

So, what is going on here?
The following appears to work

z-matrix(means,ncol=2)[rep(1, dim(z)[1]),]
         V1         V2
1  0.3333333 -2.7142857
2         NA  7.2857143
3 -0.6666667 -3.7142857
4 -6.6666667         NA
5         NA -0.7142857
6  1.3333333  1.2857143
7  3.3333333 -1.7142857
8  2.3333333  0.2857143

but I think it's rather cumbersome; surely there must be a cleaner way
to do it.
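
The odd numbers from z-means are consistent with R's recycling rule: a shorter vector operand is recycled down the columns (column-major order), so the two means alternate by row instead of lining up with the columns. The sapply() call behaves the same way, because each length-8 column is paired with the length-2 means vector. A rough sketch of this, assuming z is rebuilt by hand from the table above; recycled and centered are illustrative names, not anything from the thread.

```r
# Rebuild the data frame from the values shown in the question
z <- data.frame(V1 = c(10, NA, 9, 3, NA, 11, 13, 12),
                V2 = c(8, 18, 7, NA, 10, 12, 9, 11))
means <- colMeans(z, na.rm = TRUE)

# What z - means effectively subtracts: means recycled in column-major order,
# so odd rows get the V1 mean and even rows the V2 mean, in both columns
recycled <- matrix(rep(means, length.out = nrow(z) * ncol(z)),
                   nrow = nrow(z))
print(recycled[1:2, ])

# Filling the matrix row-wise lines each column up with its own mean
centered <- z - matrix(means, nrow = nrow(z), ncol = ncol(z), byrow = TRUE)
print(centered)
```

The byrow = TRUE version is only a shorter spelling of the row-replication workaround above; sweep() and scale() remain the more idiomatic choices.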

--
Ernest

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
