[R] problem with lm and predict - no predictions made

Keld Jørn Simonsen keld at dkuug.dk
Thu Jul 3 20:58:58 CEST 2008


On Thu, Jul 03, 2008 at 12:47:26PM +0200, ONKELINX, Thierry wrote:
> As long as you don't supply future values of usa.p to predict, it can't make any predictions for future years. Also note that lm probably doesn't take the time series information (the time part of it) into account. You're just regressing usa on usa.p. gls() is probably a better approach:
> 
> library(nlme)  # provides gls() and corAR1()
> gls(log(usa) ~ log(usa.p) + year, correlation = corAR1(form = ~year))
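
The point above about supplying future values of usa.p can be sketched as follows. This is only an illustration: the `future` data frame and its increment of 2.9 per year are hypothetical values made up for the example, not data from the thread, and a real forecast would need an actual population projection.

```r
# Observed series from the thread (1980-2007).
usa <- c(2789.53, 3128.43, 3255.03, 3536.68, 3933.18, 4220.25, 4462.83,
         4739.48, 5103.75, 5484.35, 5803.08, 5995.93, 6337.75, 6657.40,
         7072.23, 7397.65, 7816.83, 8304.33, 8746.98, 9268.43, 9816.98,
         10127.95, 10469.60, 10960.75, 11685.93, 12433.93, 13194.70,
         13843.83)
usa.p <- c(227.62, 229.92, 232.13, 234.25, 236.31, 238.42, 240.59, 242.75,
           244.97, 247.29, 250.05, 253.39, 256.78, 260.15, 263.33, 266.46,
           269.58, 272.82, 276.02, 279.20, 282.31, 285.25, 288.10, 290.85,
           293.53, 296.26, 299.08, 301.97)

# Same kind of model as in the original question.
usa.m <- lm(log(usa) ~ log(usa.p))

# ASSUMPTION: hypothetical population values for 2008-2012, growing by
# 2.9 per year. The column name must match the predictor in the formula.
future <- data.frame(usa.p = 301.97 + 2.9 * (1:5))

# predict.lm evaluates the fitted formula on newdata; exp() undoes the log.
usa.f <- exp(predict(usa.m, newdata = future))
print(usa.f)
```

Rows of newdata where usa.p is NA give NA predictions, which is consistent with only the first 28 values being predicted from the NA-padded series.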

Thanks to all for your comments. What worked for me in the end was
arima:

usa.m = arima(log(usa.p), order = c(1, 1, 2))
usa.lp = predict(usa.m, n.ahead = 10)$pred

This predicts the next 10 years of population growth in the USA
(on the log scale, since the model is fitted to log(usa.p)).
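
For readers who want to reproduce this, here is a self-contained sketch of the arima approach using the 28 observed population values quoted further down in this thread. The start year passed to ts() and the exp() back-transformation are additions for illustration; whether an ARIMA(1,1,2) is a good model for this series is not examined here.

```r
# Observed population series from the thread, as an annual time series
# starting in 1980 (no NA padding needed for arima forecasting).
usa.p <- ts(c(227.62, 229.92, 232.13, 234.25, 236.31, 238.42, 240.59,
              242.75, 244.97, 247.29, 250.05, 253.39, 256.78, 260.15,
              263.33, 266.46, 269.58, 272.82, 276.02, 279.20, 282.31,
              285.25, 288.10, 290.85, 293.53, 296.26, 299.08, 301.97),
            start = 1980)

# ARIMA(1,1,2) on the log of the series, as in the message above.
usa.m <- arima(log(usa.p), order = c(1, 1, 2))

# predict() on an arima fit returns a list with $pred and $se;
# n.ahead is the number of steps to forecast beyond the last observation.
usa.f <- predict(usa.m, n.ahead = 10)
usa.lp <- usa.f$pred   # forecasts on the log scale, 2008-2017
usa.pp <- exp(usa.lp)  # back-transformed to the original units

print(round(usa.pp, 2))
```

Note that exp() of a log-scale forecast is, under the usual normality assumption, a median rather than a mean forecast; usa.f$se holds the log-scale standard errors if intervals are wanted.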

best regards
keld

> Thierry
> 
> ----------------------------------------------------------------------------
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium 
> tel. + 32 54/436 185
> Thierry.Onkelinx at inbo.be 
> www.inbo.be 
> 
> To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
> ~ Sir Ronald Aylmer Fisher
> 
> The plural of anecdote is not data.
> ~ Roger Brinner
> 
> The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
> 
> -----Original message-----
> From: Keld Jørn Simonsen [mailto:keld at dkuug.dk] 
> Sent: Thursday 3 July 2008 12:28
> To: ONKELINX, Thierry
> Subject: Re: [R] problem with lm and predict - no predictions made
> 
> On Thu, Jul 03, 2008 at 10:21:27AM +0200, ONKELINX, Thierry wrote:
> > You'll need to specify newdata, which is a data frame with the values of the independent variables at the points where you want predictions. In your case it will be a data frame with information on us.p.
> 
> I did so, and I also tried a number of other things, such as storing the
> data as time series and trying various values for na.action
> (na.exclude and such). Still, predict.lm would only give me predictions for
> the first 28 observations, while I also wanted predictions for the
> future values. 
> 
> What I then did was use time series, with NA for the values I wanted
> predicted:
> 
>  usa
> Time Series:
> Start = 1980 
> End = 2030 
> Frequency = 1 
>  [1]  2789.53  3128.43  3255.03  3536.68  3933.18  4220.25  4462.83 4739.48
>  [9]  5103.75  5484.35  5803.08  5995.93  6337.75  6657.40  7072.23 7397.65
> [17]  7816.83  8304.33  8746.98  9268.43  9816.98 10127.95 10469.60 10960.75
> [25] 11685.93 12433.93 13194.70 13843.83       NA       NA       NA NA
> [33]       NA       NA       NA       NA       NA       NA       NA NA
> [41]       NA       NA       NA       NA       NA       NA       NA NA
> [49]       NA       NA       NA
> 
>  usa.p
> Time Series:
> Start = 1980 
> End = 2030 
> Frequency = 1 
>  [1] 227.62 229.92 232.13 234.25 236.31 238.42 240.59 242.75 244.97 247.29
> [11] 250.05 253.39 256.78 260.15 263.33 266.46 269.58 272.82 276.02 279.20
> [21] 282.31 285.25 288.10 290.85 293.53 296.26 299.08 301.97     NA NA
> [31]     NA     NA     NA     NA     NA     NA     NA     NA     NA NA
> [41]     NA     NA     NA     NA     NA     NA     NA     NA     NA NA
> [51]     NA
> 
> 
> usa.d = data.frame(usa.p)
> 
> usa.m = lm(usa ~ usa.p)
>  predict(usa.m,usa.d,n.count=5)
> 
> This only gave predictions for the first 28 observations.
> 
> I wanted it to also give predictions for the next 5.
> 
> best regards
> keld
> 
> > HTH,
> > 
> > Thierry
> > 
> > 
> > -----Original message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On behalf of Keld Jørn Simonsen
> > Sent: Thursday 3 July 2008 9:41
> > To: Erik Iverson
> > CC: r-help at r-project.org
> > Subject: Re: [R] problem with lm and predict - no predictions made
> > 
> > On Wed, Jul 02, 2008 at 08:55:54PM -0500, Erik Iverson wrote:
> > > Hello -
> > > 
> > > Keld Jørn Simonsen wrote:
> > > >Hi 
> > > >
> > > >I have a problem with lm and predict
> > > >
> > > >I have 
> > > >
> > > >us
> > > > [1]  2789.53  3128.43  3255.03  3536.68  3933.18  4220.25  4462.83 4739.48
> > > > [9]  5103.75  5484.35  5803.08  5995.93  6337.75  6657.40  7072.23 7397.65
> > > >[17]  7816.83  8304.33  8746.98  9268.43  9816.98 10127.95 10469.60 10960.75
> > > >[25] 11685.93 12433.93 13194.70 13843.83
> > > >
> > > >
> > > > us.p
> > > > [1] 227.62 229.92 232.13 234.25 236.31 238.42 240.59 242.75 244.97 247.29
> > > >[11] 250.05 253.39 256.78 260.15 263.33 266.46 269.58 272.82 276.02 279.20
> > > >[21] 282.31 285.25 288.10 290.85 293.53 296.26 299.08 301.97
> > > >
> > > > us.l = lm(log(us) ~ log(us.p))
> > > >>predict(us.l,n.ahead=5)
> > > >       1        2        3        4        5        6        7        8 
> > > >8.079754 8.131908 8.181531 8.228692 8.274111 8.320224 8.367224 8.413588 
> > > >       9       10       11       12       13       14       15       16 
> > > >8.460813 8.509709 8.567285 8.636117 8.705057 8.772694 8.835719 8.897015 
> > > >      17       18       19       20       21       22       23       24 
> > > >8.957402 9.019376 9.079867 9.139289 9.196752 9.250495 9.302067 9.351347 
> > > >      25       26       27       28 
> > > >9.398927 9.446950 9.496094 9.545979
> > > >
> > > >
> > > >Why does predict not give me any predictions? The result of predict() has
> > > >the same length (28) as the us and us.p variables. 
> > > 
> > > The version of 'predict' being called on 'us.l' (i.e., predict.lm) is 
> > > doing predictions, and it should be giving you a result of identical 
> > > length as your original vectors.  What are you expecting here?  Your 
> > > usage of the 'n.ahead' parameter suggests to me you might be wanting to 
> > > fit your model using a different function than 'lm', and use its 
> > > corresponding prediction function.
> > 
> > Yes, what I have is actually some time series since 1980 on various
> > countries - here USA production and population. I would like to estimate
> > a model and then extrapolate for future years. Maybe predict.lm() is not
> > the right function for that. What would be an adequate function then?
> > 
> > best regards
> > keld
> > 
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.


