[R] functions on rows or columns of two (or more) arrays
Dennis Murphy
djmuser at gmail.com
Fri Aug 5 07:19:08 CEST 2011
Hi:
Here's one approach:
a=matrix(1:50,nrow=10)
a2=floor(jitter(a,amount=50))
# Write a function that combines row k of each matrix
# into a data frame and fits a linear model
regfn <- function(k) {
    rdf <- data.frame(x = a[k, ], y = a2[k, ])
    lm(y ~ x, data = rdf)
}
# Use lapply() to run regfn() once for each row index
# of a and a2, collecting the fits in a list:
modlist <- lapply(seq_len(nrow(a)), regfn)
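If you would rather pass the rows themselves than a row index, here is a rough sketch of an alternative using base R's Map(). It is not what the code above does; fit_row and modlist2 are made-up names for illustration, and the set.seed() call is only there so the jittered data are repeatable.

```r
# Reproducible stand-in data (set.seed() is an addition, not in the original post)
set.seed(1)
a  <- matrix(1:50, nrow = 10)
a2 <- floor(jitter(a, amount = 50))

# Split each matrix into a list with one numeric vector per row
rows_a  <- split(a,  row(a))
rows_a2 <- split(a2, row(a2))

# fit_row() is a hypothetical helper: regress one row of a2 on the matching row of a
fit_row <- function(x, y) lm(y ~ x)

# Map() calls fit_row() once per pair of rows and returns a list of lm objects
modlist2 <- Map(fit_row, rows_a, rows_a2)
length(modlist2)
```

The index-based lapply() version is arguably simpler; Map() mainly helps when the inputs are already lists of equal length.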
# I prefer plyr for extraction of output from a list of models.
# Here are a few examples:
library('plyr')
# Extract the R^2 values
ldply(modlist, function(m) summary(m)$r.squared)
# Extract the residuals
laply(modlist, function(m) resid(m))
# Extract the estimated model coefficients
ldply(modlist, function(m) coef(m))
# Extract the coefficient summary tables as a list
llply(modlist, function(m) summary(m)$coefficients)
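If plyr is not available, roughly the same extractions can be sketched in base R. This is an assumed equivalent rather than a drop-in replacement: the results come back as vectors and matrices instead of the data frames that ldply() returns. The object names (r2, coefs, res, coef_tables) and the set.seed() call are illustrative additions.

```r
# Rebuild the example so this block runs on its own
set.seed(1)
a  <- matrix(1:50, nrow = 10)
a2 <- floor(jitter(a, amount = 50))
modlist <- lapply(seq_len(nrow(a)), function(k) {
  rdf <- data.frame(x = a[k, ], y = a2[k, ])
  lm(y ~ x, data = rdf)
})

# R^2 values as a numeric vector, one per model
r2 <- vapply(modlist, function(m) summary(m)$r.squared, numeric(1))

# Coefficients as a matrix: one row per model, columns (Intercept) and x
coefs <- t(sapply(modlist, coef))

# Residuals as a matrix: one row per model, one column per observation
res <- t(sapply(modlist, resid))

# Coefficient summary tables as a list
coef_tables <- lapply(modlist, function(m) summary(m)$coefficients)
```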
In the anonymous functions, the argument m refers to an arbitrary lm
object, so you can do to it what you would with any given lm object;
all you're doing is abstracting the process.
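To illustrate that point, here is a small sketch in which other standard lm accessors are dropped into the same pattern. It uses base R only; the names aic_vals, slope_ci and fitted_vals, and the set.seed() call, are assumptions made for the example.

```r
# Rebuild the example so this block runs on its own
set.seed(1)
a  <- matrix(1:50, nrow = 10)
a2 <- floor(jitter(a, amount = 50))
modlist <- lapply(seq_len(nrow(a)), function(k) {
  rdf <- data.frame(x = a[k, ], y = a2[k, ])
  lm(y ~ x, data = rdf)
})

# Any function that accepts a single lm object can be used per model
aic_vals <- vapply(modlist, AIC, numeric(1))

# 95% confidence interval for the slope of each model
slope_ci <- lapply(modlist, function(m) confint(m)["x", ])

# Fitted values, one row per model
fitted_vals <- t(sapply(modlist, fitted))
```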
HTH,
Dennis
On Thu, Aug 4, 2011 at 2:17 PM, Jim Bouldin <bouldinjr at gmail.com> wrote:
> I realize this should be simple, but even after reading the relevant
> help pages several times, I still cannot decide which of the many "apply"
> functions addresses it. I simply want to apply a function to all the rows
> (or columns) of the same index from two (or more) identically sized arrays
> (or data frames).
>
> For example:
>
>> a=matrix(1:50,nrow=10)
>> a2=floor(jitter(a,amount=50))
>> a
>       [,1] [,2] [,3] [,4] [,5]
>  [1,]    1   11   21   31   41
>  [2,]    2   12   22   32   42
>  [3,]    3   13   23   33   43
>  [4,]    4   14   24   34   44
>  [5,]    5   15   25   35   45
>  [6,]    6   16   26   36   46
>  [7,]    7   17   27   37   47
>  [8,]    8   18   28   38   48
>  [9,]    9   19   29   39   49
> [10,]   10   20   30   40   50
>> a2
>       [,1] [,2] [,3] [,4] [,5]
>  [1,]   31   56  -29  -13   10
>  [2,]   38   61   71   55    9
>  [3,]  -29   38   47   12   38
>  [4,]   12    2   43   39   93
>  [5,]  -43   23  -23   62    1
>  [6,]  -13   61   55   11    2
>  [7,]  -42    1   38   12    8
>  [8,]  -13   -6  -18   16   95
>  [9,]  -19   -2   78   33    1
> [10,]   20  -16  -11   19   17
>
> if I try the following for example:
> apply(a,1,function(x) lm(a~a2))
>
> I get 10 identical repeats (except for the list indexer) of the following:
>
> [[1]]
>
> Call:
> lm(formula = a ~ a2)
>
> Coefficients:
>              [,1]       [,2]       [,3]       [,4]       [,5]
> (Intercept)   8.372135  18.372135  28.372135  38.372135  48.372135
> a21          -0.006163  -0.006163  -0.006163  -0.006163  -0.006163
> a22          -0.093390  -0.093390  -0.093390  -0.093390  -0.093390
> a23           0.009315   0.009315   0.009315   0.009315   0.009315
> a24          -0.015143  -0.015143  -0.015143  -0.015143  -0.015143
> a25          -0.026761  -0.026761  -0.026761  -0.026761  -0.026761
>
> ...Which is clearly very wrong, in a number of ways. If I try by columns:
> apply(a,2,function(x) lm(a~a2))
> ...I get exactly the same result.
>
> So, which is the appropriate apply-type function when two arrays (or
> d.f.'s?) are involved like this? Or none of them and some other approach
> (other than looping which I can do but which I assume is not optimal)?
> Thanks for any help.
> --
> Jim Bouldin, PhD
> Research Ecologist