[R] tagging results of "apply"

Sun Jul 22 11:40:26 CEST 2007

Dear Bruce,
Inside your anonymous function [function(x)], you need to pass the bound
variable 'x' [not mat1] as the first argument to cor().

For instance, you wrote:
apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
apply(mat1, 1, function(x) cor(mat1, mat2))

They should be
apply(mat1, 1, function(x) cor(x, mat2[1,]))
apply(mat1, 1, function(x) cor(x, mat2))

or
f <- function(x,y) cor(x, y)
apply(mat1, 1, f, y=mat2[1,])
apply(mat1, 1, f, y=mat2)
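
To see why the original calls produced identical columns, here is a small sketch. The set.seed() call is my own assumption, added only to make the run reproducible; it is not in your code.

```r
set.seed(1)  # assumed seed, only for reproducibility
mat1 <- matrix(sample(1:500, 25), ncol = 5)
mat2 <- matrix(sample(501:1000, 25), ncol = 5)

# 'x' is never used, so every call returns the same cor(mat1, mat2[1, ])
bad <- apply(mat1, 1, function(x) cor(mat1, mat2[1, ]))

# 'x' is the current row of mat1, so each call returns its own value
good <- apply(mat1, 1, function(x) cor(x, mat2[1, ]))

dim(bad)      # 5 5, with five identical columns
length(good)  # 5, one correlation per row of mat1
```

In 'bad', apply() still loops over the rows of mat1, but the loop variable is ignored, which is why the same five numbers are repeated in every column.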

The 'Value' section of the ?apply documentation then has the following
statement, which will help you predict its behavior in this case:
"If each call to FUN returns a vector of length n, then apply returns an
array of dimension c(n, dim(X)[MARGIN]) if n > 1."
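
A quick check of that rule, as a sketch using the same kind of matrices (the seed is again my assumption). Here n = 5 for cor(x, mat2) and n = 1 for cor(x, mat2[1,]):

```r
set.seed(1)  # assumed seed, only for reproducibility
mat1 <- matrix(sample(1:500, 25), ncol = 5)
mat2 <- matrix(sample(501:1000, 25), ncol = 5)

# Each call returns 5 values, so the result has dimension c(5, 5)
res <- apply(mat1, 1, function(x) cor(x, mat2))
dim(res)  # 5 5

# Column i holds the values returned for row i of mat1
all.equal(res[, 2], as.vector(cor(mat1[2, ], mat2)))  # TRUE

# cor(x, mat2) pairs x with the COLUMNS of mat2
all.equal(res[4, 2], cor(mat1[2, ], mat2[, 4]))       # TRUE

# With n = 1 a plain vector comes back instead of an array
res1 <- apply(mat1, 1, function(x) cor(x, mat2[1, ]))
is.null(dim(res1))  # TRUE
```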

In the corrected Scenario 2, each column of your output is therefore the
output from cor(mat1[i,], mat2). Note that cor() pairs the vector with the
columns of mat2, not its rows. As for tagging, you can try adding dimension
labels [to the object which is passed as the 'X' argument to apply()]:

mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames=list(paste("row",1:5,sep=""),
                 paste("col",1:5,sep="")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5)

> apply(mat1, 1, function(x,y) cor(x, y), y=mat2)
            row1       row2       row3        row4        row5
[1,]  0.39412464 -0.6241649  0.7423724  0.48391875  0.27085386
[2,] -0.22912466 -0.4123714  0.2857004 -0.52447327  0.06971423
[3,] -0.51027247  0.3256587 -0.6195050 -0.48309737  0.01699978
[4,]  0.26353316 -0.1873564  0.2121154  0.88784766 -0.02257890
[5,] -0.03771225 -0.4250040  0.3795558 -0.03372794 -0.05874675
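
If what you are after is every row of mat1 against every row of mat2, with both sides labelled, one possible sketch is to skip apply() and hand cor() the transposed matrices, since cor() works column by column. The "m2row" labels, the seed, and the name 'pairings' are my own choices for illustration.

```r
set.seed(1)  # assumed seed, only for reproducibility
mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames = list(paste("row", 1:5, sep = ""),
                               paste("col", 1:5, sep = "")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5,
               dimnames = list(paste("m2row", 1:5, sep = ""), NULL))

# Transpose so that the original rows become the columns cor() compares
cc <- cor(t(mat1), t(mat2))

# cc is 5 x 5: rows are labelled by mat1's rows, columns by mat2's rows
cc["row2", "m2row3"]  # same value as cor(mat1["row2", ], mat2["m2row3", ])

# One line per pairing, with both labels kept next to the correlation
pairings <- as.data.frame(as.table(cc), responseName = "cor")
head(pairings)
```

The long form in 'pairings' has 25 lines, one for each combination of a mat1 row and a mat2 row, so each correlation can be traced back to its pair.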

Hope this helps,

Stephen

--- "Bernzweig, Bruce (Consultant)" <bbernzwe at bear.com> wrote:

> In trying to get a better understanding of vectorization I wrote the
> following code:
> 
> My objective is to take two sets of time series and calculate the
> correlations for each combination of time series.
> 
> mat1 <- matrix(sample(1:500, 25), ncol = 5)
> mat2 <- matrix(sample(501:1000, 25), ncol = 5)
> 
> Scenario 1:
> apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
> 
> Scenario 2:
> apply(mat1, 1, function(x) cor(mat1, mat2))
> 
> Using scenario 1, (output below) I can see that correlations are
> calculated for just the first row of mat2 against each individual row of
> mat1.
> 
> Using scenario 2, (output below) I can see that correlations are
> calculated for each row of mat2 against each individual row of mat1.  
> 
> Q1: The output of scenario 2 consists of 25 rows of data.  Are the first
> five rows mat1 against mat2[1,], the next five rows mat1 against
> mat2[2,], ... last five rows mat1 against mat2[5,]?
> 
> Q2: I assign the output of scenario 2 to a new matrix
> 
> 	matC <- apply(mat1, 1, function(x) cor(mat1, mat2))
> 
>     However, I need a way to identify each row in matC as a pairing of
> rows from mat1 and mat2.  Is there a parameter I can add to apply to do
> this?
> 
> Scenario 1:
> > apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
>            [,1]       [,2]       [,3]       [,4]       [,5]
> [1,] -0.4626122 -0.4626122 -0.4626122 -0.4626122 -0.4626122
> [2,] -0.9031543 -0.9031543 -0.9031543 -0.9031543 -0.9031543
> [3,]  0.0735273  0.0735273  0.0735273  0.0735273  0.0735273
> [4,]  0.7401259  0.7401259  0.7401259  0.7401259  0.7401259
> [5,] -0.4548582 -0.4548582 -0.4548582 -0.4548582 -0.4548582
> 
> Scenario 2:
> > apply(mat1, 1, function(x) cor(mat1, mat2))
>              [,1]        [,2]        [,3]        [,4]        [,5]
>  [1,]  0.19394126  0.19394126  0.19394126  0.19394126  0.19394126
>  [2,]  0.26402400  0.26402400  0.26402400  0.26402400  0.26402400
>  [3,]  0.12923842  0.12923842  0.12923842  0.12923842  0.12923842
>  [4,] -0.74549676 -0.74549676 -0.74549676 -0.74549676 -0.74549676
>  [5,]  0.64074122  0.64074122  0.64074122  0.64074122  0.64074122
>  [6,]  0.26931986  0.26931986  0.26931986  0.26931986  0.26931986
>  [7,]  0.08527921  0.08527921  0.08527921  0.08527921  0.08527921
>  [8,] -0.28034079 -0.28034079 -0.28034079 -0.28034079 -0.28034079
>  [9,] -0.15251915 -0.15251915 -0.15251915 -0.15251915 -0.15251915
> [10,]  0.19542415  0.19542415  0.19542415  0.19542415  0.19542415
> [11,]  0.75107032  0.75107032  0.75107032  0.75107032  0.75107032
> [12,]  0.53042767  0.53042767  0.53042767  0.53042767  0.53042767
> [13,] -0.51163612 -0.51163612 -0.51163612 -0.51163612 -0.51163612
> [14,] -0.44396048 -0.44396048 -0.44396048 -0.44396048 -0.44396048
> [15,]  0.57018745  0.57018745  0.57018745  0.57018745  0.57018745
> [16,]  0.70480284  0.70480284  0.70480284  0.70480284  0.70480284
> [17,] -0.36674283 -0.36674283 -0.36674283 -0.36674283 -0.36674283
> [18,] -0.81826607 -0.81826607 -0.81826607 -0.81826607 -0.81826607
> [19,]  0.53145184  0.53145184  0.53145184  0.53145184  0.53145184
> [20,]  0.24568385  0.24568385  0.24568385  0.24568385  0.24568385
> [21,] -0.10610402 -0.10610402 -0.10610402 -0.10610402 -0.10610402
> [22,] -0.78650748 -0.78650748 -0.78650748 -0.78650748 -0.78650748
> [23,]  0.04269423  0.04269423  0.04269423  0.04269423  0.04269423
> [24,]  0.14704698  0.14704698  0.14704698  0.14704698  0.14704698
> [25,]  0.28340166  0.28340166  0.28340166  0.28340166  0.28340166