[R] Adding a subset to a glm messes up factors?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Dec 7 15:03:28 CET 2007
First, 'subset' is an argument to glm(), but for some reason you did not
use it. Your subject line is quite misleading, and had it been the more
accurate
Adding a 'data' argument to glm messes up factors?
you might have realised the problem.
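A minimal sketch of using that argument follows. The data frame here is simulated and only stands in for data.all; the column names cpue, year and species come from the post, but every value is invented for illustration.

```r
# Hypothetical stand-in for data.all; values are simulated, not the poster's data.
set.seed(1)
data.all <- data.frame(
  year    = rep(1989:1993, each = 20),
  species = rep(c("n", "m"), times = 50),
  cpue    = rexp(100)
)

# Pass the data frame via 'data' and the row selection via 'subset'.
fit_arg <- glm(log(cpue) ~ year, family = gaussian,
               data = data.all, subset = species == "n")

# The same rows selected with the subset() function, as in the post.
fit_fun <- glm(log(cpue) ~ year, family = gaussian,
               data = subset(data.all, species == "n"))

# Both routes fit the same rows, so the coefficients should agree.
all.equal(coef(fit_arg), coef(fit_fun))
```

Either way, every variable in the formula is looked up in the data frame first, which is the point of the 'data' argument.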
Second, your models are fitted to different datasets: the first to objects
in your workspace, and the second to columns of data.all. Since you have
not (as we asked) given a reproducible example we cannot know what those
differences are, but differences in the datasets will be the key.
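One way to look for such a difference is to compare the class of each variable in the two places. The sketch below assumes, purely for illustration, that the workspace object year is a factor while the column data.all$year is numeric. That would be consistent with the two summaries quoted further down, but nothing in the post confirms it, and the data are simulated.

```r
# Hypothetical data: a numeric 'year' column in the data frame ...
set.seed(1)
data.all <- data.frame(
  year    = rep(1989:1993, each = 20),
  species = rep(c("n", "m"), times = 50),
  cpue    = rexp(100)
)
# ... and a separate workspace object of the same name stored as a factor.
year <- factor(data.all$year)

class(year)           # the workspace object
class(data.all$year)  # the column of the data frame

# With data = data.all, 'year' in the formula refers to the column.
# Wrapping it in factor() makes the fit treat it as categorical,
# giving one coefficient per level after the first.
fit <- glm(log(cpue) ~ factor(year), family = gaussian,
           data = data.all, subset = species == "n")
names(coef(fit))
```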
Third, the best way to fit linear models is lm(), not
glm(family=gaussian).
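For a gaussian family with the default identity link the two calls estimate the same coefficients; lm() gets there directly by least squares, while glm() goes through its iterative fitting routine. A small sketch on simulated data (the names d, x and y are invented for this example):

```r
# Simulated data for illustration only.
set.seed(1)
d <- data.frame(x = rep(1:5, each = 10))
d$y <- 1 + 0.5 * d$x + rnorm(50)

fit_lm  <- lm(y ~ x, data = d)
fit_glm <- glm(y ~ x, family = gaussian, data = d)

# The coefficient estimates should agree up to numerical tolerance.
all.equal(coef(fit_lm), coef(fit_glm))
```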
On Fri, 7 Dec 2007, Muri Soares wrote:
> I have a problem with running a glm using a subset of my data. Whenever
> I choose a subset, in the summary the factors aren't shown (as if the
> variable was a continuous variable). If I don't use subsets then all the
> factors are shown. I have copied the output from summary for both cases.
>
> Thanks for the help,
> Muri
>
>> model<-glm(log(cpue)~year,family=gaussian)
> Call:
> glm(formula = log(cpue) ~ year, family = gaussian)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -2.0962  -0.5851  -0.1241   0.4805   3.9236
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)   0.8899     0.1844   4.825 1.42e-06 ***
> year1990     -0.6107     0.1925  -3.173  0.00152 **
> year1991     -1.7466     0.1902  -9.184  < 2e-16 ***
> year1992     -1.4061     0.1864  -7.544 5.07e-14 ***
> year1993     -1.4069     0.1860  -7.565 4.31e-14 ***
> ...
>
>> model<-glm(log(cpue)~year,family=gaussian,subset(data.all,species=="n"))
> Call:
> glm(formula = log(cpue) ~ year, family = gaussian, data = subset(data.all,
> species == "n"))
>
> Deviance Residuals:
>      Min        1Q    Median        3Q       Max
> -1.64577  -0.61671  -0.08972   0.55792   2.73737
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept) 32.446570  10.076895   3.220  0.00135 **
> year        -0.016345   0.005037  -3.245  0.00123 **
> ---
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
PLEASE do!
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595