[R] Predict Function

Bill.Venables at csiro.au
Sun Apr 13 02:21:50 CEST 2008


The problem comes from fitting the model using a formula like this:

mlogit <- vglm(bcsse$Active ~ bcsse$Impinteg + bcsse$Hsgradyr,
	family = multinomial(), na.action=na.pass)

If you write the formula in that form, prediction will be virtually
impossible, because predict() will look for variables named
"bcsse$Impinteg" but you will be supplying (I presume) variables with
names like "Impinteg".

Using "$" in formulae makes no sense at all if you want to predict
with new data: none at all, nichts, nil, nada.  Do not do it.  I hope
that's clear.

You might be better off if you fit the model in the sensible form

mlogit <- vglm(Active ~ Impinteg + Hsgradyr,
		data = bcsse, family = multinomial(), 
		na.action = na.pass)

Then, at least, when it comes to prediction the predict function knows
the names of the variables it has to look for.
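
To see the mechanism on a small scale, here is a minimal sketch that
uses lm() and the built-in mtcars data in place of vglm() and bcsse.
Both substitutions are assumptions made purely for illustration, and
the names new, bad and good are hypothetical.  With "$" in the
formula, predict() cannot find the term in newdata, picks up the
original data through the formula's environment instead, and returns
one value per original row:

```r
# Illustration only: lm() and mtcars stand in for vglm() and bcsse.
new <- data.frame(wt = c(2, 3, 4))

# "$" in the formula: the term is called "mtcars$wt", which is not a
# column of 'new', so it is evaluated against the full mtcars data
# (R issues a warning about the row mismatch).
bad <- lm(mtcars$mpg ~ mtcars$wt)
p_bad <- suppressWarnings(predict(bad, newdata = new))
length(p_bad)    # one prediction per row of mtcars, not per row of 'new'

# Plain names plus data=: predict() looks up "wt" in 'new'.
good <- lm(mpg ~ wt, data = mtcars)
p_good <- predict(good, newdata = new)
length(p_good)   # one prediction per row of 'new'
```

This is the same mismatch as the "810 rows" versus "6" in the error
message quoted below.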

There is something really strange about your example below, though.
The variable Hsgradyr is constant, so it will be confounded with the
intercept term.  You still have issues to iron out here, but they are
statistical issues, not R issues.  Ask your tutor, I suggest.
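
As a rough illustration of what "confounded with the intercept"
means, the sketch below adds a hypothetical constant column k to the
built-in mtcars data and fits an ordinary lm().  This is an assumed
stand-in, not the bcsse model, and vglm() may report the situation
differently:

```r
# Illustration only: 'k' is a made-up column that never varies.
d <- transform(mtcars, k = 5)
fit <- lm(mpg ~ wt + k, data = d)
coef(fit)   # the coefficient for k is NA: it is aliased with the intercept
```

A column that never varies carries the same information as the
column of ones behind the intercept, so its coefficient cannot be
estimated separately.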

Here is a worked example from the one presented (also rather poorly,
unfortunately - avoid using attach()!) in the tutorial you are
following:

> mydata <- read.csv(url("http://www.ats.ucla.edu/stat/r/dae/mlogit.csv"))
> 
> library(VGAM)
> fm <- vglm(brand ~ female + age,
                family = multinomial(),
                data = mydata)
> coef(fm)
(Intercept):1 (Intercept):2      female:1      female:2
  22.72139607   10.94674104   -0.46594140    0.05787294

         age:1         age:2 
   -0.68590824   -0.31770176 

>   ### now for prediction
> newdata <- expand.grid(female = 0:1, age = 20:40)
> 
> newdata <- cbind(newdata, predict(fm, newdata, type = "response"))
> 
> names(newdata)
[1] "female" "age"    "1"      "2"      "3"     

> ### predictions will give a 3-column matrix of probs.
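
The prediction grid used above can be inspected on its own, without
the fitted model.  A small sketch (the name grid is hypothetical):

```r
# expand.grid() builds every combination of the supplied values.
grid <- expand.grid(female = 0:1, age = 20:40)
dim(grid)       # 2 values of female x 21 ages = 42 rows, 2 columns
head(grid, 3)   # the first factor, female, varies fastest
```

Since predict() returns one row of probabilities for each row of the
new data, the cbind() call lines the two up row by row.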

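> ### we can compare this with the model fit using the 
> ### multinom() function in the nnet package, but first we
> ### need to reorder the response levels: multinomial() in
> ### VGAM takes the last level as the baseline by default,
> ### whereas multinom() takes the first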

> mydata <- transform(mydata,
        br = factor(brand, levels = c(3,1,2)))
>     
> library(nnet)
> gm <- multinom(br ~ female + age, mydata)
# weights:  12 (6 variable)
initial  value 807.480032 
iter  10 value 702.971567
final  value 702.970704 
converged
> coef(gm)
  (Intercept)      female        age
1    22.72150 -0.46576724 -0.6859142
2    10.94688  0.05798805 -0.3177081

> ### compare this with what we found earlier:
> matrix(coef(fm), nrow=2)
         [,1]        [,2]       [,3]
[1,] 22.72140 -0.46594140 -0.6859082
[2,] 10.94674  0.05787294 -0.3177018
> 
> ### this is as close as you would expect.
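
To put a number on "as close as you would expect", the two sets of
coefficients can be compared directly.  The sketch below simply
retypes the values printed above (the names vg and mn are
hypothetical) and does not need either package:

```r
# Coefficients copied from the two printouts above.
vg <- matrix(c(22.72139607, 10.94674104, -0.46594140, 0.05787294,
               -0.68590824, -0.31770176), nrow = 2)  # filled column by column
mn <- rbind(c(22.72150, -0.46576724, -0.6859142),
            c(10.94688,  0.05798805, -0.3177081))
max(abs(vg - mn))   # largest absolute discrepancy, below 2e-4
```

The matrix() call works because coef(fm) lists the two linear
predictors for each term in turn, which matches R's column-by-column
filling.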



Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile:                         +61 4 8819 4402
Home Phone:                     +61 7 3286 7700
mailto:Bill.Venables at csiro.au
http://www.cmis.csiro.au/bill.venables/ 

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Biago
Sent: Sunday, 13 April 2008 4:06 AM
To: r-help at r-project.org
Subject: [R] Predict Function


Hi all - this is my first time here and I am having an issue with
the predict function.

I am using a tutorial as a guide, located here:
http://www.ats.ucla.edu/STAT/R/dae/mlogit.htm

My code gives this error 

> newdata1$predicted <- predict(mlogit,newdata=newdata1,type="response")
Error in `$<-.data.frame`(`*tmp*`, "predicted", value = c(0.332822934960197, :
  replacement has 810 rows, data has 6

How can I resolve this problem so I can just predict values for the
supplied matrix (newdata1) instead of it trying to use my full dataset?

Here is the full code up to this point.


library(VGAM)
mlogit<- vglm(bcsse$Active~bcsse$Impinteg+bcsse$Hsgradyr,
family=multinomial(), na.action=na.pass)
summary(mlogit)

Impinteg<-c(1,2,3,4,5,6)
Hsgradyr<-c(mean(bcsse$Hsgradyr))
newdata1<-data.frame(Impinteg,Hsgradyr)

newdata1$predicted <- predict(mlogit,newdata=newdata1,type="response")
newdata1


I appreciate all help in advance!