[R-sig-eco] no splits possible - in mvpart
Mike Marsh
swamp at blarg.net
Sat Jan 29 02:31:12 CET 2011
I am clustering vegetation richness (0 or 1) data that is segregated by
growth form, i.e. Shrub, Annual Grass, Perennial Grass, etc., using
mvpart for comparison with clustering by hclust.
The environmental file has four variables, Slope, Elevation, heatload,
and Ecological Site (a measure of soil and land form type).
For four of the six data files, a split is found when the raw data are
analyzed, but the message
"No splits possible -- try decreasing cp"
appears when data standardized by "scaler" are submitted.
My question: What does the message mean? How would I decrease cp?
I have re-read De'Ath, 2002 (Ecology 83:1105) regarding
cross-validation, and I assume that xerror in the table produced by
printcp is that quantity. In the present instance, there are only two
leaves to the tree, and further reduction of cp would seem impossible.
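As a rough illustration of what "decreasing cp" could look like, the sketch below lowers the complexity parameter through rpart.control() and compares the resulting cp tables. It uses plain rpart and the built-in mtcars data rather than mvpart and the vegetation data, so it shows only the general mechanism. Whether mvpart forwards a control argument in the same way is an assumption here.

```r
library(rpart)  # recommended package, included with standard R installations

# Fit with the default complexity parameter (cp = 0.01).
fit_default <- rpart(mpg ~ ., data = mtcars)

# Fit again with a smaller cp, so splits that give a smaller relative
# improvement are kept in the cp table.
fit_low <- rpart(mpg ~ ., data = mtcars,
                 control = rpart.control(cp = 0.001))

printcp(fit_default)
printcp(fit_low)

# Assumption, not verified here: mvpart passes extra arguments on to its
# tree-fitting routine, so the analogous call might be
#   mvpart(Shrub ~ ., Qenv, control = rpart.control(cp = 0.001))
```

The cp table of the second fit should contain at least as many rows as the first, since a lower threshold can only retain more candidate splits.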
A further puzzle is that when the smallest dataset (not included in this
analysis), with only 6 columns, is analyzed, a result is obtained for
standardized data. The Shrub data, presented here as an example, have 27
columns; the Annual.Forb data, 35 columns.
Here is my script, with output:
> set.seed(1)
> Shrub.mrt<-mvpart(Shrub~.,Qenv)
> printcp(Shrub.mrt)
mvpart(form = Shrub ~ ., data = Qenv)
Variables actually used in tree construction:
[1] Alt.E
Root node error: 69.727/22 = 3.1694
n= 22
       CP nsplit rel error xerror    xstd
1 0.23477      0   1.00000 1.1064 0.09480
2 0.12882      1   0.76523 1.0372 0.10470
> Shrub.std<- scaler(Shrub, col="mean1", row="mean1")
> Shrub.std.mrt<-mvpart(Shrub.std~.,Qenv)
No splits possible -- try decreasing cp
> printcp(Shrub.std.mrt)
rpart(formula = form, data = data)
Variables actually used in tree construction:
character(0)
Root node error: 0/0 = NaN
n=0 (22 observations deleted due to missingness)
   CP nsplit rel error
1 NaN      0       NaN
>
> set.seed(1)
> Annual.Forb.mrt<-mvpart(Annual.Forb~.,Qenv)
> printcp(Annual.Forb.mrt)
mvpart(form = Annual.Forb ~ ., data = Qenv)
Variables actually used in tree construction:
[1] Slope
Root node error: 105.27/22 = 4.7851
n= 22
        CP nsplit rel error xerror     xstd
1 0.135579      0   1.00000 1.1085 0.081214
2 0.096179      1   0.86442 1.0827 0.079488
> Annual.Forb.std<- scaler(Annual.Forb, col="mean1", row="mean1")
> Annual.Forb.std.mrt<-mvpart(Annual.Forb.std~.,Qenv)
> printcp(Annual.Forb.std.mrt)
mvpart(form = Annual.Forb.std ~ ., data = Qenv)
Variables actually used in tree construction:
[1] Elev
Root node error: 4282.1/22 = 194.64
n= 22
       CP nsplit rel error xerror    xstd
1 0.15587      0   1.00000 1.1015 0.12860
2 0.10174      1   0.84413 1.0949 0.12898
> printcp(Annual.Grass.std.mrt)
mvpart(form = Annual.Grass.std ~ ., data = Qenv)
Variables actually used in tree construction:
[1] heatld
Root node error: 219.76/22 = 9.989
n= 22
       CP nsplit rel error xerror    xstd
1 0.12602      0   1.00000 1.1179 0.43984
2 0.11866      1   0.87398 1.4865 0.51020
>
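The line "n=0 (22 observations deleted due to missingness)" in the Shrub.std output suggests that the standardized response contains missing values in every row. One possible cause, offered only as a guess, is a species column (or plot row) that sums to zero, so that scaling to mean 1 divides 0 by 0. The sketch below reproduces that effect on a made-up matrix. It assumes that the "mean1" option amounts to dividing by the column mean, and it uses sweep() as a stand-in for scaler; the toy data and object names are hypothetical.

```r
# Hypothetical presence/absence matrix: 4 plots (rows) x 3 species (columns).
# The third species is absent from every plot, so its column is all zeros.
toy <- matrix(c(1, 0, 0,
                0, 1, 0,
                1, 1, 0,
                1, 0, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("plot", 1:4), c("sp1", "sp2", "sp3")))

# Assumed stand-in for scaler(..., col = "mean1"): divide each column by its mean.
toy_std <- sweep(toy, 2, colMeans(toy), "/")

# Columns that now contain missing values, and the number of affected rows.
bad_cols <- colnames(toy_std)[colSums(is.na(toy_std)) > 0]
bad_rows <- sum(rowSums(is.na(toy_std)) > 0)
print(bad_cols)
print(bad_rows)

# Dropping all-zero columns before standardizing avoids the 0/0 division.
toy_ok <- toy[, colSums(toy) > 0, drop = FALSE]
toy_ok_std <- sweep(toy_ok, 2, colMeans(toy_ok), "/")
print(anyNA(toy_ok_std))
```

On the real data, colSums(is.na(Shrub.std)) and which(colSums(Shrub) == 0) would show whether something similar is happening there.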
While output for the standardized data for annual forb is the same as
with raw data, this is often not the case in my larger dataset.
Data files are appended, and will be provided separately on request.
Thanks very much for looking at this.
Mike Marsh
Washington Native Plant Society
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Shrub.txt
URL: <https://stat.ethz.ch/pipermail/r-sig-ecology/attachments/20110128/7075a82f/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Annual.Grass.txt
URL: <https://stat.ethz.ch/pipermail/r-sig-ecology/attachments/20110128/7075a82f/attachment-0001.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Qenv.txt
URL: <https://stat.ethz.ch/pipermail/r-sig-ecology/attachments/20110128/7075a82f/attachment-0002.txt>