[R] applying lm on an array of observations with common design matrix
Ranjan Maitra
maitra at iastate.edu
Thu Feb 22 20:21:32 CET 2007
On Thu, 22 Feb 2007 08:17:38 +0000 (GMT) Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> On Thu, 22 Feb 2007, Petr Klasterecky wrote:
>
> > Ranjan Maitra napsal(a):
> >> On Sun, 18 Feb 2007 07:46:56 +0000 (GMT) Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> >>
> >>> On Sat, 17 Feb 2007, Ranjan Maitra wrote:
> >>>
> >>>> Dear list,
> >>>>
> >>>> I have a 4-dimensional array Y of dimension 330 x 67 x 35 x 51. I have a
> >>>> design matrix X of dimension 330 x 4. I want to fit a linear regression
> >>>> of each
> >>>>
> >>>> lm( Y[, i, j, k] ~ X). for each i, j, k.
> >>>>
> >>>> Can I do it in one shot without a loop?
> >>> Yes.
> >>>
> >>> YY <- YY
> >>> dim(YY) <- c(330, 67*35*51)
> >>> fit <- lm(YY ~ X)
> >>>
> >>>> Actually, I am also interested in getting the p-values of some of the
> >>>> coefficients -- lets say the coefficient corresponding to the second
> >>>> column of the design matrix. Can the same be done using array-based
> >>>> operations?
> >>> Use lapply(summary(fit), function(x) coef(x)[3,4]) (since there is a
> >>> intercept, you want the third coefficient).
> >>
> >> In this context, can one also get the variance-covariance matrix of the
> >> coefficients?
> >
> > Sure:
> >
> > lapply(summary(fit), function(x) {"$"(x,cov.unscaled)})
>
> But that is not the variance-covariance matrix (and it is an unusual way
> to write x$cov.unscaled)!
>
> > Add indexing if you do not want the whole matrix. You can extract
> > whatever you want, just take a look at ?summary.lm, section Value.
>
> It is unclear to me what the questioner expects: the estimated
> coefficients for different responses are independent. For a list of
> matrices applying to each response one could mimic vcov.lm and do
>
> lapply(summary(fit, corr=FALSE),
> function(so) so$sigma^2 * so$cov.unscaled)
Thanks! Actually, I am really looking to compare the coefficients (let us say second and the third) beta2 - beta4 = 0 for each regression. Basically, get the two-sided p-value for the test statistic for each regression.
One way of doing that is to get the dispersion matrix of each regression and then to compute the t-statistic and the p-value. That is the genesis of the question above. Is there a better way?
Many thanks and best wishes,
Ranjan
More information about the R-help
mailing list