[R] tagging results of "apply"
Stephen Tucker
brown_emu at yahoo.com
Sun Jul 22 12:08:21 CEST 2007
Actually, if you want to tag both the columns and the rows, this might also help:
## Give dimension labels to both matrices
mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames=list(paste("mat1row",1:5,sep=""),
                             paste("mat1col",1:5,sep="")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5,
               dimnames=list(paste("mat2row",1:5,sep=""),
                             paste("mat2col",1:5,sep="")))
> cor(mat1[1,],mat2)
        mat2col1   mat2col2   mat2col3  mat2col4     mat2col5
[1,] -0.06313535 -0.4679927 -0.5147084 -0.797748 -0.001457972
The column labels are there but are lost when returned from apply(), as it
says in ?apply:
"In all cases the result is coerced by as.vector to one of the basic vector
types before the dimensions are set"
> as.vector(cor(mat1[1,],mat2))
[1] -0.063135353 -0.467992672 -0.514708392 -0.797748010 -0.001457972
You lose the dimension labels in this case, so one option is to guard against
this in the following way:
> as.vector(as.data.frame(cor(mat1[1,],mat2)))
     mat2col1   mat2col2   mat2col3  mat2col4     mat2col5
1 -0.06313535 -0.4679927 -0.5147084 -0.797748 -0.001457972
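As a quick check of the point above, here is a small sketch showing where the labels actually live. The seed and the names r, v, and d are only for illustration and are not part of the original session: cor() returns a 1 x 5 matrix whose labels are column dimnames, as.vector() strips them, and as.data.frame() carries them over as column names.

```r
set.seed(1)  # arbitrary seed, only so the example is repeatable
mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames = list(paste("mat1row", 1:5, sep = ""),
                               paste("mat1col", 1:5, sep = "")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5,
               dimnames = list(paste("mat2row", 1:5, sep = ""),
                               paste("mat2col", 1:5, sep = "")))

r <- cor(mat1[1, ], mat2)   # 1 x 5 matrix; labels are stored in dimnames
v <- as.vector(r)           # plain numeric vector; dimnames are dropped
d <- as.data.frame(r)       # 1-row data frame; column names are kept

dim(r)        # 1 5
colnames(r)   # "mat2col1" ... "mat2col5"
names(v)      # NULL
names(d)      # "mat2col1" ... "mat2col5"
```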
Unfortunately, if you use 'as.data.frame()' in 'function(x)', apply will
return a list - but you can bind the rows of the output:
> f <- function(x,y) as.data.frame(cor(x,y))
> do.call(rbind, apply(mat1,1,f,y=mat2))
            mat2col1   mat2col2    mat2col3   mat2col4     mat2col5
mat1row1 -0.06313535 -0.4679927 -0.51470839 -0.7977480 -0.001457972
mat1row2 -0.28750363  0.1681777  0.14671484  0.8139768  0.039982028
mat1row3 -0.62017387 -0.6932731 -0.72263865 -0.7929604  0.427366680
mat1row4  0.06441894  0.1707946 -0.11444747 -0.8213577  0.526239013
mat1row5 -0.09849051  0.7024540 -0.01997228  0.3712480  0.439037838
The result is a data frame, not a matrix. Note also that its rows and columns
are transposed relative to the matrix that apply() returns when the function
gives back a plain vector, as in
apply(mat1, 1, function(x) cor(x, mat2))
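If a plain numeric matrix is preferred over a data frame, one possible sketch (not something from the original exchange) is to have the function return a named vector. apply() uses the names of the first result as row labels when they are the same for every call, so both sets of labels survive. The helper g and the seed below are hypothetical, chosen only for illustration.

```r
set.seed(1)  # arbitrary seed, only so the example is repeatable
mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames = list(paste("mat1row", 1:5, sep = ""),
                               paste("mat1col", 1:5, sep = "")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5,
               dimnames = list(paste("mat2row", 1:5, sep = ""),
                               paste("mat2col", 1:5, sep = "")))

# g is a hypothetical helper: indexing with [1, ] turns the 1 x 5 result
# of cor() into a named vector of length 5
g <- function(x, y) cor(x, y)[1, ]

# apply() puts each call's result in a column, so transpose to get
# mat1 rows as rows and mat2 columns as columns
res <- t(apply(mat1, 1, g, y = mat2))

is.matrix(res)                # TRUE
rownames(res)                 # "mat1row1" ... "mat1row5"
colnames(res)                 # "mat2col1" ... "mat2col5"
res["mat1row2", "mat2col3"]   # same value as cor(mat1[2, ], mat2[, 3])
```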
An alternative is to convert each row of mat1 into a list element [by
transposing it with t() and then feeding it to as.data.frame()] and then use
sapply():
> sapply(as.data.frame(t(mat1)),f,y=mat2)
         mat1row1     mat1row2   mat1row3   mat1row4   mat1row5
mat2col1 -0.06313535  -0.2875036 -0.6201739 0.06441894 -0.0984905
mat2col2 -0.4679927   0.1681777  -0.6932731 0.1707946  0.702454
mat2col3 -0.5147084   0.1467148  -0.7226387 -0.1144475 -0.01997228
mat2col4 -0.797748    0.8139768  -0.7929604 -0.8213577 0.371248
mat2col5 -0.001457972 0.03998203 0.4273667  0.526239   0.4390378
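Because f returns a data frame here, the sapply() result above appears to be a matrix whose cells are list elements rather than plain numbers, which would explain the uneven digits in the printout. For what it is worth, the same labelled table can also be obtained without apply() or sapply(): when cor() is given two matrices it correlates every column of the first with every column of the second. The sketch below assumes that is the quantity wanted; the name cmat and the seed are illustrative only.

```r
set.seed(1)  # arbitrary seed, only so the example is repeatable
mat1 <- matrix(sample(1:500, 25), ncol = 5,
               dimnames = list(paste("mat1row", 1:5, sep = ""),
                               paste("mat1col", 1:5, sep = "")))
mat2 <- matrix(sample(501:1000, 25), ncol = 5,
               dimnames = list(paste("mat2row", 1:5, sep = ""),
                               paste("mat2col", 1:5, sep = "")))

# Transposing mat1 makes its rows play the role of columns, so entry
# [i, j] is the correlation of mat1[i, ] with mat2[, j]
cmat <- cor(t(mat1), mat2)

dim(cmat)        # 5 5
rownames(cmat)   # "mat1row1" ... "mat1row5"
colnames(cmat)   # "mat2col1" ... "mat2col5"

# Spot check against the row-by-row computation used earlier
all.equal(cmat["mat1row1", ], cor(mat1[1, ], mat2)[1, ])   # TRUE
```

This gives a numeric matrix with the same orientation as the do.call(rbind, ...) result: rows of mat1 down the side, columns of mat2 across the top.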
--- Stephen Tucker <brown_emu at yahoo.com> wrote:
> Dear Bruce,
> In your functions, you need to use your bound variable, 'x' [not mat1] in
> your anonymous function [function(x)] as the argument to cor().
>
> For instance, you wrote:
> apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
> apply(mat1, 1, function(x) cor(mat1, mat2))
>
> They should be
> apply(mat1, 1, function(x) cor(x, mat2[1,]))
> apply(mat1, 1, function(x) cor(x, mat2))
>
> or
> f <- function(x,y) cor(x, y)
> apply(mat1, 1, f, y=mat2[1,])
> apply(mat1, 1, f, y=mat2)
>
> Then from the ?apply documentation - under section, 'Value' - the following
> statement will help you predict its behavior in this case:
> "If each call to FUN returns a vector of length n, then apply returns an
> array of dimension c(n, dim(X)[MARGIN]) if n > 1."
>
> [each column of your output is the output from cor(mat1[i,],mat2) in
> Scenario 2]. As for tagging, you can try adding dimension labels [to the
> object which is passed as the 'X' argument to apply()]:
>
> mat1 <- matrix(sample(1:500, 25), ncol = 5,
>                dimnames=list(paste("row",1:5,sep=""),
>                              paste("col",1:5,sep="")))
> mat2 <- matrix(sample(501:1000, 25), ncol = 5)
>
> > apply(mat1, 1, function(x,y) cor(x, y), y=mat2)
>             row1       row2       row3        row4        row5
> [1,]  0.39412464 -0.6241649  0.7423724  0.48391875  0.27085386
> [2,] -0.22912466 -0.4123714  0.2857004 -0.52447327  0.06971423
> [3,] -0.51027247  0.3256587 -0.6195050 -0.48309737  0.01699978
> [4,]  0.26353316 -0.1873564  0.2121154  0.88784766 -0.02257890
> [5,] -0.03771225 -0.4250040  0.3795558 -0.03372794 -0.05874675
>
> Hope this helps,
>
> Stephen
>
> --- "Bernzweig, Bruce (Consultant)" <bbernzwe at bear.com> wrote:
>
> > In trying to get a better understanding of vectorization I wrote the
> > following code:
> >
> > My objective is to take two sets of time series and calculate the
> > correlations for each combination of time series.
> >
> > mat1 <- matrix(sample(1:500, 25), ncol = 5)
> > mat2 <- matrix(sample(501:1000, 25), ncol = 5)
> >
> > Scenario 1:
> > apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
> >
> > Scenario 2:
> > apply(mat1, 1, function(x) cor(mat1, mat2))
> >
> > Using scenario 1, (output below) I can see that correlations are
> > calculated for just the first row of mat2 against each individual row of
> > mat1.
> >
> > Using scenario 2, (output below) I can see that correlations are
> > calculated for each row of mat2 against each individual row of mat1.
> >
> > Q1: The output of scenario2 consists of 25 rows of data. Are the first
> > five rows mat1 against mat2[1,], the next five rows mat1 against
> > mat2[2,], ... last five rows mat1 against mat2[5,]?
> >
> > Q2: I assign the output of scenario 2 to a new matrix
> >
> > matC <- apply(mat1, 1, function(x) cor(mat1, mat2))
> >
> > However, I need a way to identify each row in matC as a pairing of
> > rows from mat1 and mat2. Is there a parameter I can add to apply to do
> > this?
> >
> > Scenario 1:
> > > apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
> >            [,1]       [,2]       [,3]       [,4]       [,5]
> > [1,] -0.4626122 -0.4626122 -0.4626122 -0.4626122 -0.4626122
> > [2,] -0.9031543 -0.9031543 -0.9031543 -0.9031543 -0.9031543
> > [3,]  0.0735273  0.0735273  0.0735273  0.0735273  0.0735273
> > [4,]  0.7401259  0.7401259  0.7401259  0.7401259  0.7401259
> > [5,] -0.4548582 -0.4548582 -0.4548582 -0.4548582 -0.4548582
> >
> > Scenario 2:
> > > apply(mat1, 1, function(x) cor(mat1, mat2))
> >              [,1]        [,2]        [,3]        [,4]        [,5]
> >  [1,]  0.19394126  0.19394126  0.19394126  0.19394126  0.19394126
> >  [2,]  0.26402400  0.26402400  0.26402400  0.26402400  0.26402400
> >  [3,]  0.12923842  0.12923842  0.12923842  0.12923842  0.12923842
> >  [4,] -0.74549676 -0.74549676 -0.74549676 -0.74549676 -0.74549676
> >  [5,]  0.64074122  0.64074122  0.64074122  0.64074122  0.64074122
> >  [6,]  0.26931986  0.26931986  0.26931986  0.26931986  0.26931986
> >  [7,]  0.08527921  0.08527921  0.08527921  0.08527921  0.08527921
> >  [8,] -0.28034079 -0.28034079 -0.28034079 -0.28034079 -0.28034079
> >  [9,] -0.15251915 -0.15251915 -0.15251915 -0.15251915 -0.15251915
> > [10,]  0.19542415  0.19542415  0.19542415  0.19542415  0.19542415
> > [11,]  0.75107032  0.75107032  0.75107032  0.75107032  0.75107032
> > [12,]  0.53042767  0.53042767  0.53042767  0.53042767  0.53042767
> > [13,] -0.51163612 -0.51163612 -0.51163612 -0.51163612 -0.51163612
> > [14,] -0.44396048 -0.44396048 -0.44396048 -0.44396048 -0.44396048
> > [15,]  0.57018745  0.57018745  0.57018745  0.57018745  0.57018745
> > [16,]  0.70480284  0.70480284  0.70480284  0.70480284  0.70480284
> > [17,] -0.36674283 -0.36674283 -0.36674283 -0.36674283 -0.36674283
> > [18,] -0.81826607 -0.81826607 -0.81826607 -0.81826607 -0.81826607
> > [19,]  0.53145184  0.53145184  0.53145184  0.53145184  0.53145184
> > [20,]  0.24568385  0.24568385  0.24568385  0.24568385  0.24568385
> > [21,] -0.10610402 -0.10610402 -0.10610402 -0.10610402 -0.10610402
> > [22,] -0.78650748 -0.78650748 -0.78650748 -0.78650748 -0.78650748
> > [23,]  0.04269423  0.04269423  0.04269423  0.04269423  0.04269423
> > [24,]  0.14704698  0.14704698  0.14704698  0.14704698  0.14704698
> > [25,]  0.28340166  0.28340166  0.28340166  0.28340166  0.28340166
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >