[R] sapply puzzlement
David Winsemius
dwinsemius at comcast.net
Fri Jan 28 02:05:17 CET 2011
On Jan 27, 2011, at 7:16 PM, Ernest Adrogué i Calveras wrote:
> Hi,
>
> I have this data.frame with two variables in it,
>
>> z
> V1 V2
> 1 10 8
> 2 NA 18
> 3 9 7
> 4 3 NA
> 5 NA 10
> 6 11 12
> 7 13 9
> 8 12 11
>
> and a vector of means,
>
>> means <- apply(z, 2, function (col) mean(na.omit(col)))
>> means
> V1 V2
> 9.666667 10.714286
Two methods:
A) Use sweep(). Its default FUN is "-", so with MARGIN=2 it subtracts
means[j] from column j:
> sweep(z, 2, means)
V1 V2
1 0.3333333 -2.7142857
2 NA 7.2857143
3 -0.6666667 -3.7142857
4 -6.6666667 NA
5 NA -0.7142857
6 1.3333333 1.2857143
7 3.3333333 -1.7142857
8 2.3333333 0.2857143
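For anyone who wants to reproduce this, here is a minimal sketch. It rebuilds z from the values printed above and uses colMeans(z, na.rm=TRUE) in place of the apply() call; that substitution and the name `centered` are illustrative choices, not part of the original exchange.

```r
# Rebuild the poster's data frame from the values shown in the thread.
z <- data.frame(V1 = c(10, NA, 9, 3, NA, 11, 13, 12),
                V2 = c(8, 18, 7, NA, 10, 12, 9, 11))

# Column means that ignore NA; same values as the apply()/na.omit() call.
means <- colMeans(z, na.rm = TRUE)

# MARGIN = 2 lines the vector up with the columns. FUN = "-" is the
# default and is written out here only to make the operation visible.
centered <- sweep(z, 2, means, FUN = "-")

# Sanity check: each centered column should average to zero (up to rounding).
colMeans(centered, na.rm = TRUE)
```

The same call with FUN = "/" would divide each column by its statistic instead, which is why sweep() is the more general of the two methods.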
B) Use the scale() function, whose "whole purpose in life" is to
subtract the column means and optionally divide by the column standard
deviations. The division is suppressed here with scale=FALSE:
> scale(z, scale=FALSE)
V1 V2
1 0.3333333 -2.7142857
2 NA 7.2857143
3 -0.6666667 -3.7142857
4 -6.6666667 NA
5 NA -0.7142857
6 1.3333333 1.2857143
7 3.3333333 -1.7142857
8 2.3333333 0.2857143
attr(,"scaled:center")
V1 V2
9.666667 10.714286
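One thing worth knowing about method B is that scale() returns a numeric matrix carrying the centers as an attribute, not a data frame. A small sketch, assuming the same z as above (the names `centered` and `centered_df` are only for illustration):

```r
# Rebuild the poster's data frame from the values shown in the thread.
z <- data.frame(V1 = c(10, NA, 9, 3, NA, 11, 13, 12),
                V2 = c(8, 18, 7, NA, 10, 12, 9, 11))

# Center only; scale = FALSE skips the division by the standard deviation.
centered <- scale(z, scale = FALSE)

# The result is a matrix; the means that were subtracted (computed with
# NAs removed) are stored in the "scaled:center" attribute.
is.matrix(centered)
attr(centered, "scaled:center")

# Convert back if a data frame is needed downstream.
centered_df <- as.data.frame(centered)
```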
--
David.
>
> My intention was subtracting means from z, so instinctively I tried
>
>> z-means
> V1 V2
> 1 0.3333333 -1.6666667
> 2 NA 7.2857143
> 3 -0.6666667 -2.6666667
> 4 -7.7142857 NA
> 5 NA 0.3333333
> 6 0.2857143 1.2857143
> 7 3.3333333 -0.6666667
> 8 1.2857143 0.2857143
>
> But this is completely wrong. sapply() gives the same result:
>
>> sapply(z, function(row) row - means)
> V1 V2
> [1,] 0.3333333 -1.6666667
> [2,] NA 7.2857143
> [3,] -0.6666667 -2.6666667
> [4,] -7.7142857 NA
> [5,] NA 0.3333333
> [6,] 0.2857143 1.2857143
> [7,] 3.3333333 -0.6666667
> [8,] 1.2857143 0.2857143
>
> So, what is going on here?
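A likely explanation, offered as a sketch and not as part of the original reply: in both z - means and the sapply() call, the length-2 vector is recycled element by element down each column of length 8. The two means therefore alternate along the rows, and each column never gets its own mean applied throughout. The snippet below rebuilds z from the printed values; `recycled`, `wrong_V1` and `right` are hypothetical names used only here.

```r
# Rebuild the poster's data frame from the values shown in the thread.
z <- data.frame(V1 = c(10, NA, 9, 3, NA, 11, 13, 12),
                V2 = c(8, 18, 7, NA, 10, 12, 9, 11))
means <- colMeans(z, na.rm = TRUE)

# Within a column of length 8 the means are recycled as
# means[1], means[2], means[1], ... so odd rows get the V1 mean and
# even rows get the V2 mean.
recycled <- rep(unname(means), length.out = nrow(z))
wrong_V1 <- z$V1 - recycled

# This should reproduce the first column of z - means.
isTRUE(all.equal(wrong_V1, as.numeric((z - means)$V1)))

# Pairing each column with its own mean gives the intended result;
# mapply() walks the columns of z and the elements of means in parallel.
right <- mapply(function(col, m) col - m, z, means)
right
```

Note that sapply() over a data frame iterates over columns, so the argument named `row` in the call above actually receives a whole column each time.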
> The following appears to work
>
>> z-matrix(means,ncol=2)[rep(1, dim(z)[1]),]
> V1 V2
> 1 0.3333333 -2.7142857
> 2 NA 7.2857143
> 3 -0.6666667 -3.7142857
> 4 -6.6666667 NA
> 5 NA -0.7142857
> 6 1.3333333 1.2857143
> 7 3.3333333 -1.7142857
> 8 2.3333333 0.2857143
>
> but I think it's rather cumbersome; surely there must be a cleaner way
> to do it.
>
> --
> Ernest
David Winsemius, MD
West Hartford, CT