[R] troubles with logistic regression
gked
grigoriy.lyukshin at gmail.com
Sun Mar 13 19:33:25 CET 2011
Hello everyone,
I am working on a dataset for my class project and got stuck trying to
run a logistic regression. Here is my code:
data <- read.csv(file="C:/Users/fieder.data.2000.csv")
# creating subset of men
fieder.male<-subset(data,data[,8]==1)
unmarried.male<-subset(data,data[,8]==1&data[,6]==1)
# glm fit
agesq.male<-(unmarried.male[,5])^2
male.sqrtincome<-sqrt(unmarried.male[,9])
fieder.male.mar.glm<-glm(as.factor(unmarried.male[,6])~
factor(fieder.male[,7])+fieder.male[,5]+agesq.male+
male.sqrtincome,binomial(link="logit") )
par(mfrow=c(1,1))
plot(c(0,300),c(0,1),pch=" ",
xlab="sqrt income, truncated at 90000",
ylab="modeled probability of being never-married")
junk<- lowess(male.sqrtincome,
log(fieder.male.mar.glm$fitted.values/
(1-fieder.male.mar.glm$fitted.values)))
lines(junk$x,exp(junk$y)/(1+exp(junk$y)))
title(main="probability of never marrying\n males, by sqrt(income)")
points(male.sqrtincome[unmarried.male==0],
fieder.male.mar.glm$fitted.values[unmarried.male==0],pch=16)
points(male.sqrtincome[unmarried.male==1],
fieder.male.mar.glm$fitted.values[unmarried.male==1],pch=1)
The error says:
Error in model.frame.default(formula = as.factor(unmarried.male[, 6]) ~ :
variable lengths differ (found for 'factor(fieder.male[, 7])')
What does it mean? Where am I making a mistake?
Thank you
P.S. I am also attaching the data file in .csv format:
http://r.789695.n4.nabble.com/file/n3352356/fieder.data.2000.csv
fieder.data.2000.csv
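
For reference, here is a minimal sketch of how this message can arise. It uses made-up data, not the attached file, and the data frame and column names (d, men, some_men, age, married, educ, sex, income) are hypothetical. The message appears when the variables in a formula do not all have the same length. In the code above, fieder.male and unmarried.male are built with different subset conditions, so they may well have different numbers of rows, and the formula mixes columns from both. Separately, unmarried.male keeps only rows where column 6 equals 1, so using that column as the response would presumably leave it with a single value.

```r
# Made-up data standing in for the real file; all names here are hypothetical.
set.seed(1)
n <- 200
d <- data.frame(
  age     = sample(20:60, n, replace = TRUE),
  married = rbinom(n, 1, 0.5),              # 0/1 response
  educ    = sample(1:3, n, replace = TRUE),
  sex     = rbinom(n, 1, 0.5),
  income  = runif(n, 0, 90000)
)

men      <- subset(d, sex == 1)
some_men <- subset(men, educ > 1)           # fewer rows than `men`

# Mixing vectors taken from data frames with different row counts
# triggers the error; tryCatch captures the message instead of stopping.
err <- tryCatch(
  glm(some_men$married ~ men$age, family = binomial(link = "logit")),
  error = function(e) conditionMessage(e)
)
print(err)

# Taking every variable from one data frame via `data =` keeps the
# lengths equal, so the model frame can be built.
fit <- glm(married ~ factor(educ) + age + I(age^2) + sqrt(income),
           data = men, family = binomial(link = "logit"))
```

Referring to columns by name through the data argument, rather than by position across several subsets, is one way to make this kind of mismatch harder to introduce.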
--
View this message in context: http://r.789695.n4.nabble.com/troubles-with-logistic-regression-tp3352356p3352356.html
Sent from the R help mailing list archive at Nabble.com.