[R] R/S-Plus equivalent to Genstat "predict"

Fri Oct 7 12:44:14 CEST 2005

As an alternative to the effects package, try predict() with
type="terms".
JM
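
A minimal sketch of the predict() suggestion, for illustration only. The
data are simulated, the Poisson family is an assumption (the original post
elides the family), and the object names (dat, fit, nd, x1.means, f1.means)
are hypothetical. predict(..., type="terms") returns each term's
contribution on the link scale, centred at the column means of the model
matrix, with the centring value stored in the "constant" attribute. Adding
the constant to one term's column therefore gives a prediction in which
every other column sits at its sample mean; for factor dummies that mean is
the proportion of observations at each level, which resembles the weighted
averaging described in the quoted message. Whether the numbers match Genstat
exactly has not been checked; in particular, the poly() columns are averaged
as basis columns rather than evaluated at the mean of X2.

```r
# Simulated stand-in data: the names F1, F2, X1, X2 follow the model in the
# quoted message, but the values and the Poisson family are made up.
set.seed(1)
n <- 200
dat <- data.frame(
  F1 = factor(sample(c("level1", "level2", "level3"), n, replace = TRUE)),
  F2 = factor(sample(c("level1", "level2"), n, replace = TRUE)),
  X1 = runif(n, 0, 3),
  X2 = runif(n, 0, 3)
)
dat$y <- rpois(n, exp(2 + 0.1 * dat$X1 + 0.05 * dat$X2^2))

fit <- glm(y ~ F1 + F2 + X1 + poly(X2, 2), family = poisson, data = dat)

# Grid for X1; the other columns only need valid values, because each
# term's contribution depends on its own variable alone.
nd <- data.frame(X1 = seq(0, 3, by = 0.5), X2 = mean(dat$X2),
                 F1 = dat$F1[1], F2 = dat$F2[1])

# Term contributions on the link scale, centred at the column means of the
# fitted model matrix; the "constant" attribute holds the centring value.
tt <- predict(fit, newdata = nd, type = "terms")
const <- attr(tt, "constant")

# Link-scale prediction with every other column at its sample mean,
# back-transformed to the response scale.
x1.means <- family(fit)$linkinv(const + tt[, "X1"])
print(data.frame(X1 = nd$X1, predicted = round(x1.means, 2)))

# Same idea for the levels of F1.
ndF <- data.frame(F1 = factor(levels(dat$F1), levels = levels(dat$F1)),
                  F2 = dat$F2[1], X1 = mean(dat$X1), X2 = mean(dat$X2))
ttF <- predict(fit, newdata = ndF, type = "terms")
f1.means <- family(fit)$linkinv(attr(ttF, "constant") + ttF[, "F1"])
print(data.frame(F1 = ndF$F1, predicted = round(f1.means, 2)))
```

Wrapping the last few lines in a function of the term name would make it
easy to loop over sites and species.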

On 7 Oct 2005, at 8:00 PM, Peter Dunn wrote:

> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Peter Dunn
> Sent: Wednesday, October 05, 2005 9:06 PM
> To: R-help mailing list
> Subject: [R] R/S-Plus equivalent to Genstat "predict":
> predictions over "averages" of covariates
>
> Hi all
>
> I'm doing some things with a colleague comparing different
> sorts of models.  My colleague has fitted a number of glms in
> Genstat (which I have never used), while the glm I have been
> using is only available for R.
>
> He has a spreadsheet of fitted means from each of his models
> obtained from using the Genstat "predict" function.  For
> example, suppose we fit the model of the type
>     glm.out <- glm( y ~ factor(F1) + factor(F2) + X1 + poly(X2,2) +
>        poly(X3,2), family=...)
>
> Then he produces a table like this (made up, but similar):
>
> F1(level1)    12.2
> F1(level2)    14.2
> F1(level3)    15.3
> F2(level1)    10.3
> F2(level2)     9.1
> X1=0          10.2
> X1=0.5        10.4
> X1=1          10.4
> X1=1.5        10.5
> X1=2          10.9
> X1=2.5        11.9
> X1=3          11.8
> X2=0          12.0
> X2=0.5        12.2
> X2=1          12.5
> X2=1.5        12.9
> X2=2          13.0
> X2=2.5        13.1
> X2=3          13.5
>
> Each of the numbers is a predicted mean.  So when X1=0, on
> average we predict an outcome of 10.2.
>
> To obtain these figures in Genstat, he uses the Genstat "predict"
> function.  When I asked for an explanation of how it was done
> (i.e., to make the "predictions", what values of the other
> covariates were used) I was told:
>
>> So, for a one-dimensional table of fitted means for any factor (or
>> variate), all other variates are set to their average values; and the
>> factor constants (including the first, at zero) are given a weighted
>> average depending on their respective numbers of observations.
>
> So for quantitative variables (such as pH), one uses the mean
> pH in the data set when making the predictions.  Reasonable and easy.
>
> But for categorical variables (like Month), he implies we use
> a weighted average of the fitted coefficients for all the
> months, depending on the proportion of times those factor
> levels appear in the data.
>
> (I hope I explained that OK...)
>
> Is there an equivalent way in R or S-Plus of doing this?  I
> have to do it for a number of sites and species, so an
> automated way would be useful.  I have tried searching to no
> avail (but may not be searching on the correct terms), and
> tried hard-coding something myself, as yet unsuccessfully:
> the poly terms and the use of the weighted averaging over
> the factor levels are proving a bit too much for my limited skills.
>
> Any assistance appreciated.  (Any clarification of what I
> mean can be provided if I have not been clear.)
>
> Thanks, as always.
>
> P.