[R] how to get "lsmeans"?
Chuck Cleland
ccleland at optonline.net
Wed Mar 21 14:28:55 CET 2007
Liaw, Andy wrote:
> I verified the result from the following with output from JMP 6 on the
> same data (don't have SAS: don't need it):
>
> set.seed(631)
> n <- 100
> dat <- data.frame(y=rnorm(n), A=factor(sample(1:2, n, replace=TRUE)),
> B=factor(sample(1:2, n, replace=TRUE)),
> C=factor(sample(1:2, n, replace=TRUE)),
> d=rnorm(n))
> fm <- lm(y ~ A + B + C + d, dat)
> ## Form a data frame of points to predict: all combinations of the
> ## three factors and the mean of the covariate.
> p <- data.frame(expand.grid(A=1:2, B=1:2, C=1:2))
> p[] <- lapply(p, factor)
> p <- cbind(p, d=mean(dat$d))
> p <- cbind(yhat=predict(fm, p), p)
> ## lsmeans for the three factors:
> with(p, tapply(yhat, A, mean))
> with(p, tapply(yhat, B, mean))
> with(p, tapply(yhat, C, mean))
Using Andy's example data, these are the LSMEANS and intervals I get
from SAS:
A y LSMEAN 95% Confidence Limits
1 -0.071847 -0.387507 0.243813
2 -0.029621 -0.342358 0.283117
B y LSMEAN 95% Confidence Limits
1 -0.104859 -0.397935 0.188216
2 0.003391 -0.333476 0.340258
C y LSMEAN 95% Confidence Limits
1 -0.084679 -0.392343 0.222986
2 -0.016789 -0.336374 0.302795
One way of reproducing the LSMEANS and intervals from SAS using
predict() seems to be the following:
> dat.lm <- lm(y ~ A + as.numeric(B) + as.numeric(C) + d, data = dat)
> newdat <- expand.grid(A=factor(c(1,2)),B=1.5,C=1.5,d=mean(dat$d))
> cbind(newdat, predict(dat.lm, newdat, interval="confidence"))
A B C d fit lwr upr
1 1 1.5 1.5 0.09838595 -0.07184709 -0.3875070 0.2438128
2 2 1.5 1.5 0.09838595 -0.02962086 -0.3423582 0.2831165
However, another possibility seems to be:
> dat.lm <- lm(y ~ A + as.numeric(B) + as.numeric(C) + d, data = dat)
> newdat <-
expand.grid(A=factor(c(1,2)),B=mean(as.numeric(dat$B)),C=mean(as.numeric(dat$C)),d=mean(dat$d))
> cbind(newdat, predict(dat.lm, newdat, interval="confidence"))
A B C d fit lwr upr
1 1 1.43 1.48 0.09838595 -0.08078243 -0.3964661 0.2349012
2 2 1.43 1.48 0.09838595 -0.03855619 -0.3479589 0.2708465
The predictions directly above match what effect() in the effects
package by John Fox returns:
library(effects)
> effect("A", fm, xlevels=list(d = mean(dat$D)))
A effect
A
1 2
-0.08078243 -0.03855619
But for some reason the predict() and effect() intervals are a little
different:
> effect("A", fm, xlevels=list(d = mean(dat$D)))$lower
[,1]
101 -0.3924451
102 -0.3440179
> effect("A", fm, xlevels=list(d = mean(dat$D)))$upper
[,1]
101 0.2308802
102 0.2669055
I would be interested in any comments on these different approaches
and on the difference in intervals returned by predict() and effect().
hope this helps,
Chuck
> Andy
>
> From: Xingwang Ye
>> Dear all,
>>
>> I search the mail list about this topic and learn that no
>> simple way is available to get "lsmeans" in R as in SAS.
>> Dr.John Fox and Dr.Frank E Harrell have given very useful
>> information about "lsmeans" topic.
>> Dr. Frank E Harrell suggests not to think about lsmeans,
>> just to think about what predicted values wanted
>> and to use the predict function. However, after reading
>> the R help file for a whole day, I am still unclear how to do it.
>> Could some one give me a hand?
>>
>> for example:
>>
>> A,B and C are binomial variables(factors); d is a continuous
>> variable ; The response variable Y is a continuous variable too.
>>
>> To get lsmeans of Y according to A,B and C, respectively, in
>> SAS, I tried proc glm data=a; class A B C; model Y=A B C d;
>> lsmeans A B C/cl; run;
>>
>> In R, I tried this:
>> library(Design)
>> ddist<-datadist(a)
>> options(datadist="ddist")
>> f<-ols(Y~A+B+C+D,data=a,x=TRUE,y=TRUE,se.fit=TRUE)
>>
>> then how to get the "lsmeans" for A, B, and C, respectively
>> with predict function?
>>
>>
>>
>> Best wishes
>> yours, sincerely
>> Xingwang Ye
>> PhD candidate
>> Research Group of Nutrition Related Cancers and Other Chronic
>> Diseases
>> Institute for Nutritional Sciences,
>> Shanghai Institutes of Biological Sciences,
>> Chinese Academy of Sciences
>> P.O.Box 32
>> 294 Taiyuan Road
>> Shanghai 200031
>> P.R.CHINA
>>
>>
>
>
> ------------------------------------------------------------------------------
> Notice: This e-mail message, together with any attachments,...{{dropped}}
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
More information about the R-help
mailing list