I'm a student working on a research project using the statistical program
"R 2.15.1".
Here's my problem: how can I run a regression considering only values over
a certain limit?
For example, considering the dataset "Workinghours" from the "Ecdat"
package, is it possible to build a predictive model that expresses the
probability that a wife works more than 8 hours per day?
The dataset includes 3382 observations on the number of hours worked per
year by wives in the USA.
hoursday=hours/240
index<-which(hoursday>=8)
hoursday[index]
As you can see, I'm able to extract the values of 'hoursday' (which is
hours divided by 240 working days per year) that are 8 or more, but
obviously I can't run a regression on them, because the extracted data are
a subset of the entire dataset (955 observations), while the other
variables, like age, occupation, income, etc., still have their full
length (3382).
So I can't do:
lm = lm(hoursday[index] ~
income+age+education+unemp+child5+child13+child17+nonwhite+owned+mortgage+occupation)
In fact "R" gives me: Error in model.frame.default(formula =
hoursday[index] ~ income, drop.unused.levels = TRUE) : variable lengths
differ (found for 'income').
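
To make the length mismatch concrete, here is a minimal sketch on simulated
data. The data frame 'd', its values, and the reduced set of predictors are
hypothetical stand-ins, not the real Ecdat data. It also shows two things I
assume might work: the 'subset' argument of lm(), and a logistic regression
with glm() on a 0/1 indicator, which seems closer to the "probability" I'm
after. I'm not sure either is the right approach.

```r
# Simulated stand-in for the real data (hypothetical values, not Ecdat)
set.seed(1)
n <- 200
d <- data.frame(
  hours  = round(runif(n, 0, 3000)),
  income = round(runif(n, 50, 800)),
  age    = sample(18:64, n, replace = TRUE)
)
d$hoursday <- d$hours / 240

# Why the original call fails: the subsetted response is shorter
# than the predictors, which still have one value per row
index <- which(d$hoursday >= 8)
length(d$hoursday[index]) == nrow(d)

# Option 1 (assumption): fit only on rows at or above the limit,
# letting lm() subset every variable consistently
fit_sub <- lm(hoursday ~ income + age, data = d, subset = hoursday >= 8)

# Option 2 (assumption): keep all rows and model the probability of
# exceeding the limit with a logistic regression on a 0/1 indicator
d$over8 <- as.integer(d$hoursday > 8)
fit_prob <- glm(over8 ~ income + age, data = d, family = binomial)
p_hat <- predict(fit_prob, type = "response")  # fitted probabilities
```

With the real data, 'd' would be the Workinghours data frame and the
formula would list all the predictors above.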
Can you help me?
Thank you.
Giorgio