[R] Regression Tree Questions

Gary Black gwblack001 at sbcglobal.net
Sat Feb 24 20:16:27 CET 2018


Hi All,

I'm a newbie and have two questions.  Please pardon me if they are very basic.


1.  I'm using a regression tree to predict the selling prices of 10 new records (homes).  The following call results in an error:

    pred <- predict(model, newdata = outOfSample[, -6])

The error message is:

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = attr(object,  : 
factor Sq. Feet has new levels 1375, 1421, 1547, 1621, 1868, 2211, 2265, 2530, 2672, 3365


Does anybody know what is causing this?  I've pasted a snippet of my original dataset (Crankshaw) and my out-of-sample dataset below, followed by all the code I entered up to that point.  The error message appears at the end of that code.


2.  How can I get the regression tree to display in a more "friendly" way?  Unfortunately I cannot paste a picture of it in this email, but each node shows the values of individual records instead of a decision rule (e.g., Age >= 28).  I'm calling fancyRpartPlot(model) to display the tree.
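
A minimal sketch of one way node labels can end up listing individual values, assuming (the post does not confirm this) that the split variable is stored as text and is therefore treated as a factor. The toy data and the names sqft, price, as_text, and as_number are hypothetical, not the Crankshaw data. The sketch prints rpart's own split labels rather than calling fancyRpartPlot, so how the plot renders them is also an assumption.

```r
library(rpart)

# Hypothetical toy data (not the Crankshaw data).
sqft  <- seq(1500, 1890, by = 10)
price <- seq(185000, 204500, by = 500)

# Same values twice: once stored as text, once stored as numbers.
as_text   <- data.frame(SqFeet = as.character(sqft), Price = price,
                        stringsAsFactors = FALSE)
as_number <- data.frame(SqFeet = sqft, Price = price)

fit_chr <- rpart(Price ~ SqFeet, data = as_text,   method = "anova")
fit_num <- rpart(Price ~ SqFeet, data = as_number, method = "anova")

# labels() returns the split label of every node in the tree.
# minlength = 0 keeps factor levels unabbreviated.
print(labels(fit_chr, minlength = 0))  # splits enumerate individual values
print(labels(fit_num))                 # splits are threshold rules
```

If the first print lists sets of values and the second lists thresholds, the column type rather than the plotting function is the likely place to look.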


Thank you!
Gary

------------------------------------------------------------------------


Original Data (Crankshaw):

Sq. Feet		Age	Bedrm	Bathrm	Garage	Sell Price ($)
1620		17	3	2	2	185500
1864		28	3	2	2	195250
1628		15	3	2	2	190750
1670		1	4	3	2	195750
1762		23	3	4	2	197250
1520		1	3	3	2	192900


Out-of-Sample Data:

NEW RECORDS:					
Sq. Feet		Age	Bedrm	Bathrm	Garage	Sell Price ($)
3365		8	4	4	3	
1547		28	3	2	2	
1375		36	2	1	1	
1621		53	3	1	2	
2530		23	4	3	2	
1868		42	3	2	2	
2211		23	3	2	2	
1421		39	2	1	1	
2672		3	4	2	3	
2265		7	3	2	2	


All Code Entered:

> Crankshaw <- read_excel("C:/Data/Excel/Crankshaw.xlsx")
> View(Crankshaw)
> outOfSample <- Crankshaw[305:nrow(Crankshaw), ]
> Crankshaw <- Crankshaw[1:300, ]
> install.packages("caret")
Installing package into ‘C:/Users/Jason/Documents/R/win-library/3.4’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.4/caret_6.0-78.zip'
Content type 'application/zip' length 5155836 bytes (4.9 MB)
downloaded 4.9 MB

package ‘caret’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
	C:\Users\Jason\AppData\Local\Temp\RtmpmAxrJR\downloaded_packages
> install.packages("rattle")
Installing package into ‘C:/Users/Jason/Documents/R/win-library/3.4’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.4/rattle_5.1.0.zip'
Content type 'application/zip' length 1287407 bytes (1.2 MB)
downloaded 1.2 MB

package ‘rattle’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
	C:\Users\Jason\AppData\Local\Temp\RtmpmAxrJR\downloaded_packages
> library(rpart)
> library(caret)
Loading required package: lattice
Loading required package: ggplot2
Warning messages:
1: package ‘caret’ was built under R version 3.4.3 
2: package ‘ggplot2’ was built under R version 3.4.3 
> library(rattle)
> n <- nrow(Crankshaw)
> train <- sample(1:n, size = 0.5 * n, replace = FALSE)
> CrankshawTrain <- Crankshaw[train, ]
> temp <- (1:n)[-train]
> val <- sample(temp, size = (0.3 / 0.5) * length(temp), replace = FALSE)
> CrankshawVal <- Crankshaw[val, ]
> test <- (1:n)[-c(train, val)]
> CrankshawTest <- Crankshaw[test, ]
> model <- rpart(`Selling Price ($)` ~ ., method = "anova", data = CrankshawTrain)
> fancyRpartPlot(model)
> pred <- predict(model, newdata = outOfSample[, -6])
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = attr(object,  : 
  factor Sq. Feet has new levels 1375, 1421, 1547, 1621, 1868, 2211, 2265, 2530, 2672, 3365
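
A minimal sketch of how an error of this kind can arise, assuming (nothing in the transcript confirms it) that Sq. Feet was imported as text rather than as numbers, for example because the sheet contains non-numeric rows such as the "NEW RECORDS:" line. The toy data and the names train, newdata, SqFeet, Price, fit_chr, and fit_num are hypothetical stand-ins, not the Crankshaw data.

```r
library(rpart)

# Hypothetical toy data (not the Crankshaw data). SqFeet is deliberately
# stored as character to mimic a numeric column that was imported as text.
train <- data.frame(
  SqFeet = as.character(seq(1500, 1890, by = 10)),
  Age    = rep(c(1, 15, 17, 23, 28), times = 8),
  Price  = seq(185000, 204500, by = 500),
  stringsAsFactors = FALSE
)
newdata <- data.frame(
  SqFeet = c("3365", "1547"),
  Age    = c(8, 28),
  stringsAsFactors = FALSE
)

# The character column is handled as a factor: one level per distinct value.
fit_chr <- rpart(Price ~ ., data = train, method = "anova")

# Predicting on values never seen in training fails; capture the message.
err <- tryCatch(
  predict(fit_chr, newdata = newdata),
  error = function(e) conditionMessage(e)
)
print(err)

# Assumed fix: convert to numeric in BOTH data sets, then refit.
train$SqFeet   <- as.numeric(train$SqFeet)
newdata$SqFeet <- as.numeric(newdata$SqFeet)
fit_num <- rpart(Price ~ ., data = train, method = "anova")
pred    <- predict(fit_num, newdata = newdata)
print(pred)
```

Running str(Crankshaw) on the real data would show whether Sq. Feet is chr or num, which is the quickest check of whether this assumption applies.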




