[R] Error of Stepwise Regression with number of rows in use has changed: remove missing values?
Greg Snow
Greg.Snow at imail.org
Fri Feb 19 21:57:29 CET 2010
Have you considered the implications of that solution?
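
To make the question concrete, here is a minimal sketch with made-up data. The data frame `toy`, its columns, and the vector `model_vars` are hypothetical and are not the poster's data. Calling na.omit() on the whole data frame drops every row that has an NA in any column, including columns the model never uses, so the fit may rest on far fewer cases than necessary. And if the values are not missing at random, the remaining complete cases may not be representative.

```r
# Hypothetical toy data: 'unused' is a column that is NOT in the model.
set.seed(1)
toy <- data.frame(
  y      = rnorm(10),
  x1     = c(NA, rnorm(9)),          # NA in row 1
  x2     = c(rnorm(8), NA, NA),      # NA in rows 9 and 10
  unused = c(rep(NA, 5), rnorm(5))   # NA in rows 1 to 5
)

# na.omit() on the full data frame removes a row if ANY column is NA,
# so the NAs in 'unused' cost us rows 2 to 5 for no reason.
n_all <- nrow(na.omit(toy))          # 3 rows survive

# Restricting to the model variables first keeps more cases.
model_vars <- c("y", "x1", "x2")
toy_model  <- na.omit(toy[, model_vars])
n_model    <- nrow(toy_model)        # 7 rows survive

# Every model that step() visits is now fitted to the same rows.
fit <- lm(y ~ x1 + x2, data = toy_model)
sel <- step(fit, trace = 0)
```

Subsetting to the model variables before na.omit() is only one option. Whether a complete-case analysis is appropriate at all depends on why the values are missing.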
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Kum-Hoe Hwang
> Sent: Wednesday, February 17, 2010 1:41 AM
> To: r-help at r-project.org
> Subject: Re: [R] Error of Stepwise Regression with number of rows in
> use has changed: remove missing values?
>
> Thank you to everyone who helped me solve an error in stepwise
> regression with missing values.
>
>
> Kum
>
>
> The solution that worked for me was Andreas's advice:
>
> =====================================================================
>
> Try
>
> data <- na.omit(original_data)
>
> before you run step() or stepAIC(), where original_data is your data frame.
>
> On Tue, Feb 16, 2010 at 8:09 PM, Peter Ehlers <ehlers at ucalgary.ca>
> wrote:
>
> > On 2010-02-16 1:24, Kum-Hoe Hwang wrote:
> >
> >> Howdy, R Gurus
> >>
> >> I have enjoyed R, but I cannot solve one problem easily. Please help
> >> me with it.
> >> When I ran the R script, I got the following error. This error
> >> results from an input data file exported from Excel spreadsheet
> >> software.
> >>
> >> Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
> >> as.numeric(nation.grant) + :
> >> number of rows in use has changed: remove missing values?
> >>
> >> Could you tell me how to resolve this error?
> >> Thanks in advance,
> >>
> >
> > This is a common situation when you use step() on data where
> > the predictors have missing values.
> >
> > A case (row) is included in the model only if all the
> > predictors for that model are non-missing for the case.
> >
> > As you vary which predictors are to be in the model, the
> > included cases will vary, resulting in models based on
> > different data. (Think of your cases as subjects; you want
> > all your models to be based on the same set of subjects.)
> >
> > Finally: (Re-)read the help page and note the 'warning'.
> >
> > -Peter Ehlers
> >
> >
> >
> >>
> >> ############### outputs from R console ###############
> >> > pop <- step(
> >> +   lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
> >> +        as.numeric(nation.grant) + as.numeric(do.grant) +
> >> +        as.numeric(city.grant) + as.numeric(DMZ.dist) +
> >> +        as.numeric(Seoul.dist),
> >> +      data = borderI.data, na.action = na.omit)
> >> + )
> >> Start:  AIC=494.27
> >> pop.rate ~ as.numeric(year) + as.factor(policy) +
> >>     as.numeric(nation.grant) + as.numeric(do.grant) +
> >>     as.numeric(city.grant) + as.numeric(DMZ.dist) +
> >>     as.numeric(Seoul.dist)
> >>
> >>                            Df Sum of Sq    RSS    AIC
> >> - as.numeric(do.grant)      1      0.71 6622.9 492.28
> >> - as.factor(policy)         1      1.21 6623.4 492.29
> >> - as.numeric(DMZ.dist)      1      1.91 6624.1 492.30
> >> - as.numeric(city.grant)    1      5.07 6627.3 492.36
> >> - as.numeric(nation.grant)  1     11.51 6633.7 492.47
> >> - as.numeric(year)          1     29.58 6651.8 492.80
> >> <none>                                  6622.2 494.27
> >> - as.numeric(Seoul.dist)    1    673.22 7295.4 503.79
> >>
> >> Step:  AIC=492.28
> >> pop.rate ~ as.numeric(year) + as.factor(policy) +
> >>     as.numeric(nation.grant) + as.numeric(city.grant) +
> >>     as.numeric(DMZ.dist) + as.numeric(Seoul.dist)
> >>
> >>                            Df Sum of Sq    RSS    AIC
> >> - as.factor(policy)         1      1.99 6624.9 490.32
> >> - as.numeric(DMZ.dist)      1      2.09 6625.0 490.32
> >> - as.numeric(city.grant)    1      7.18 6630.1 490.41
> >> - as.numeric(nation.grant)  1     20.08 6643.0 490.64
> >> - as.numeric(year)          1     28.89 6651.8 490.80
> >> <none>                                  6622.9 492.28
> >> - as.numeric(Seoul.dist)    1    697.46 7320.4 502.20
> >>
> >> Step:  AIC=490.32
> >> pop.rate ~ as.numeric(year) + as.numeric(nation.grant) +
> >>     as.numeric(city.grant) + as.numeric(DMZ.dist) +
> >>     as.numeric(Seoul.dist)
> >>
> >>                            Df Sum of Sq    RSS    AIC
> >> - as.numeric(DMZ.dist)      1      2.08 6627.0 488.35
> >> - as.numeric(city.grant)    1     10.65 6635.6 488.51
> >> - as.numeric(nation.grant)  1     31.30 6656.2 488.88
> >> - as.numeric(year)          1     31.44 6656.4 488.88
> >> <none>                                  6624.9 490.32
> >> - as.numeric(Seoul.dist)    1    732.88 7357.8 500.80
> >>
> >> Step:  AIC=488.35
> >> pop.rate ~ as.numeric(year) + as.numeric(nation.grant) +
> >>     as.numeric(city.grant) + as.numeric(Seoul.dist)
> >>
> >>                            Df Sum of Sq    RSS    AIC
> >> - as.numeric(city.grant)    1      9.86 6636.9 486.53
> >> - as.numeric(year)          1     31.42 6658.4 486.92
> >> - as.numeric(nation.grant)  1     33.33 6660.3 486.95
> >> <none>                                  6627.0 488.35
> >> - as.numeric(Seoul.dist)    1    754.40 7381.4 499.18
> >>
> >> Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
> >> as.numeric(nation.grant) + :
> >> number of rows in use has changed: remove missing values?
> >>
> >>
> >>
> >>
> >> --
> >> Kum-Hoe Hwang, Ph.D.
> >>
> >> Phone : 82-31-250-3516
> >> Email : phdhwang at gmail.com
> >>
> >>
> > --
> > Peter Ehlers
> > University of Calgary
> >
>
>
>
> --
> Kum-Hoe Hwang, Ph.D.
>
> Phone : 82-31-250-3516
> Email : phdhwang at gmail.com
>
>