[R] MSE Cross-validation with factor interactions terms MARS regression
peter dalgaard
pdalgd using gmail.com
Tue Oct 30 00:30:06 CET 2018
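The two lines did the same thing, so it is little wonder that removing one of them changed nothing.
More likely, the culprit is that a is assigned in the global environment, with one entry per row of the full Wage data, and is then used in a prediction on a subset (Testing).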
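To illustrate the point about a, here is a minimal sketch using base R only. It uses lm() on simulated data as a stand-in for earth() on Wage, so the data frame d, its columns and the object names are all made up for illustration; earth may report the problem differently, but the variable lookup (newdata first, then the environment of the formula) is the same idea.

```r
# Simulated data, for illustration only (not the Wage data)
set.seed(1)
d <- data.frame(x = rnorm(30), g = rep(c("u", "v", "w"), 10))
d$y <- d$x + rnorm(30)

# 'a' lives in the global environment and has 30 entries
a <- as.factor(d$g)
fit_global <- lm(y ~ x + a, data = d)

# A 10-row subset: 'x' is taken from 'test', but 'a' is not in 'test',
# so it is looked up globally and still has 30 entries
test <- d[1:10, ]
res <- tryCatch(predict(fit_global, newdata = test),
                error = function(e) e)
inherits(res, "error")   # the lengths do not match, so this is expected to fail

# Keeping the factor inside the data frame means subsetting carries it along
d$g <- as.factor(d$g)
fit_local <- lm(y ~ x + g, data = d)
test <- d[1:10, ]
p_local <- predict(fit_local, newdata = test)
length(p_local)          # one prediction per row of 'test'
```

The practical upshot is to refer only to columns of the data frame in the formula, so that Training and Testing each carry their own rows of every predictor.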
Also,
- you are defining Training, but as far as I can tell, you're not using it. Not likely to be an issue in itself, but wouldn't you want to fit on the Training set and evaluate on the Testing?
- your model de facto contains education as a numeric predictor, as.factor(education), and the interaction term age:as.factor(education). Does that make sense modelling-wise?
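Putting these points together, a hedged sketch of the loop could look like the following. It assumes the ISLR and earth packages are installed. It fits on Training, predicts on Testing, and uses education once, straight from the data frame. Instead of the explicit age*a term it passes degree = 2 so that earth may pick up an age-by-education interaction itself; that is a substitution on my part, not what the original code did. The seed and the reduced number of repetitions (n_rep) are arbitrary choices. Note also that mean(y-ypred)^2 in the original squares the mean error; the mean squared error is mean((y-ypred)^2).

```r
library(ISLR)   # provides the Wage data
library(earth)  # MARS implementation

set.seed(2018)  # arbitrary seed, only for reproducibility
n_rep <- 20     # fewer repetitions than the original 200, to keep it quick
p     <- 0.667
n     <- nrow(Wage)
mse   <- numeric(n_rep)

for (i in seq_len(n_rep)) {
  sam      <- sample(seq_len(n), floor(p * n), replace = FALSE)
  Training <- Wage[sam, ]
  Testing  <- Wage[-sam, ]

  # Fit on the training rows only; every predictor is a column of the data frame.
  # degree = 2 allows two-way interactions (assumed substitute for age*a).
  fit <- earth(wage ~ age + education + year, data = Training, degree = 2)

  # Evaluate on the held-out rows
  ypred  <- as.numeric(predict(fit, newdata = Testing))
  mse[i] <- mean((Testing$wage - ypred)^2)  # mean of the squared errors
}

mean(mse)       # average test MSE over the random splits
```

Storing the results in a numeric vector rather than a list avoids the unlist() step at the end.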
-pd
> On 29 Oct 2018, at 23:50 , varin sacha via R-help <r-help using r-project.org> wrote:
>
> Hi Bert,
>
> Many thanks, I have fixed it but it still doesn't work.
> Best,
>
> On Monday, 29 October 2018 at 22:07:26 UTC+1, Bert Gunter <bgunter.4567 using gmail.com> wrote:
>
> I did no analysis of your code or thought process, but noticed that you had the following two successive lines in your code:
>
>
> y=Testing$wage
>
> y=Wage[-sam,]$wage
>
> This obviously makes no sense, so maybe you should fix this first and then proceed.
>
> -- Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Oct 29, 2018 at 1:46 PM varin sacha via R-help <r-help using r-project.org> wrote:
>>
>> Dear R-experts,
>> I am having trouble doing cross-validation with a MARS regression that includes an interaction term between a factor variable (education) and one continuous variable (age). How can I solve this problem?
>>
>> My reproducible example is below.
>>
>> #######
>> install.packages("ISLR")
>> library(ISLR)
>> install.packages("earth")
>> library(earth)
>>
>> a<-as.factor(Wage$education)
>>
>> # Create a list to store the results
>> lst<-list()
>>
>> # This statement does the repetitions (looping)
>> for(i in 1 :200) {
>>   n=dim(Wage)[1]
>>   p=0.667
>>   sam=sample(1 :n,floor(p*n),replace=FALSE)
>>   Training =Wage [sam,]
>>   Testing = Wage [-sam,]
>>   mars5<-earth(wage~age+education+year+age*a, data=Wage)
>>   ypred=predict(mars5,newdata=Testing)
>>   y=Testing$wage
>>   y=Wage[-sam,]$wage
>>   MSE = mean(y-ypred)^2
>>   MSE
>>   lst[i]<-MSE
>> }
>>
>> mean(unlist(lst))
>> summary(mars5)
>> #######
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk Priv: PDalgd using gmail.com