[R] package 'gradientForest' and 'extendedForest'
Kulupp
kulupp at online.de
Tue Aug 26 15:03:20 CEST 2014
Dear experts,
I have 5 environmental predictors and abundance data (300 samples, 60
species; transformation: log(x + min(x, x > 0))) and use the function
'gradientForest' to estimate the R²-weighted predictor importance
(regression trees). The resulting predictor importance in decreasing
order is as follows: pred1, pred2, pred3, pred4, pred5. The three species
with the highest R² (goodness-of-fit; output value 'result' of function
'gradientForest') are species 1 (R²=0.76), species 2 (R²=0.74), and
species 3 (R²=0.72). As I understand it, this means that the model (i.e.
the predictor importance ranking) fits species 1, 2, and 3 best, in
decreasing order. As a further step, I want to know which predictors are
the most important for selected species. I therefore ran separate forests
using the 'extendedForest' function with the same parameter settings
(and the same set.seed()) as in the function call of 'gradientForest'
for species 1, 2, and 3 (and others). Now the resulting predictor
importance is (in decreasing order): species1: pred1, pred2, pred4,
pred3, pred5; species2: pred1, pred4, pred2, pred5, pred3; species3:
pred2, pred4, pred5, pred1, pred3. This seems strange to me, because I
expected 'extendedForest' to give predictor importance rankings similar
to the 'gradientForest' ranking for the species with the highest R²
values obtained by 'gradientForest'. I'd be grateful for any help.
Thanks a lot in advance.
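
The disagreement between the rankings can be quantified with a small
base-R sketch. It uses only the importance orderings reported in this
message; the object names (orders, ranks, rho) are illustrative and are
not part of 'gradientForest' or 'extendedForest'. It converts each
ordering to ranks and computes the Spearman correlation of each
single-species ranking with the overall ranking.

```r
preds <- paste0("pred", 1:5)

# Importance orderings as reported above (most to least important)
orders <- list(
  overall  = c("pred1", "pred2", "pred3", "pred4", "pred5"),
  species1 = c("pred1", "pred2", "pred4", "pred3", "pred5"),
  species2 = c("pred1", "pred4", "pred2", "pred5", "pred3"),
  species3 = c("pred2", "pred4", "pred5", "pred1", "pred3")
)

# Rank of each predictor within each ordering (1 = most important)
ranks <- sapply(orders, function(o) match(preds, o))
rownames(ranks) <- preds

# Spearman correlation of each species ranking with the overall ranking
rho <- cor(ranks, method = "spearman")["overall", -1]

print(ranks)
print(round(rho, 2))
```

With these orderings the correlations are 0.9 for species 1, 0.5 for
species 2, and -0.1 for species 3, so only species 1 closely follows the
overall ranking.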
Best regards
Thomas