[R] why results from regression tree (rpart) are totally inconsistent with ordinary regression
Bert Gunter
gunter.berton at gene.com
Tue Feb 24 00:14:26 CET 2009
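You did not read the tree graph correctly. Mortality is **not** "positively
related" to Incidence. You're reading the tree backwards: in an rpart plot
the condition printed at a split describes the branch going left, not the
one going right. Read the output of summary() on your rpart fit object for
clarity.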
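A minimal sketch of that check, assuming the first 20 rows of the posted
table stand in for the full data set, and with minsplit lowered only because
that subset is small (both are assumptions made here, not part of the
original analysis). print() and summary() list each node with the condition
that leads to it and the mean Incidence in it, so the direction of a split
can be read without the plot:

```r
library(rpart)

# First 20 rows of the table posted in the original message (a subset used
# only to keep this sketch short; the full table is handled the same way).
dat <- read.table(header = TRUE, text = "
Deterrence Mortality Incidence
0.695 0.51 66
0.255 0.501 48
0.612 0.483 55
0.209 0.158 47
0.499 0.589 53
0.755 0.285 73
0.764 0.351 77
0.749 0.211 64
0.101 0.336 45
0.556 0.066 72
0.576 0.403 45
0.232 0.667 35
0.424 0.891 34
0.432 0.458 54
0.197 0.269 59
0.188 0.523 40
0.291 0.864 32
0.504 0.791 36
0.387 0.138 66
0.71 0.676 56
")

# minsplit is lowered from its default of 20 because the subset is small
# (an assumption for this illustration only).
fit <- rpart(Incidence ~ Mortality + Deterrence, data = dat,
             control = rpart.control(minsplit = 10))

print(fit)    # one line per node: split condition, n, deviance, mean Incidence
summary(fit)  # per-node detail, including how many observations go each way

# Each node next to the condition that leads to it and its mean response
node_table <- data.frame(node = rownames(fit$frame),
                         split = labels(fit),
                         n = fit$frame$n,
                         mean_incidence = fit$frame$yval)
print(node_table)
```

In node_table, compare mean_incidence for the two rows whose split labels
share a cutpoint (one with >=, one with <) to see which side of the split
has the higher incidence.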
-- Bert Gunter
Genentech
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Weidong Gu
Sent: Monday, February 23, 2009 2:39 PM
To: r-help at r-project.org
Subject: [R] why results from regression tree (rpart) are
totally inconsistent with ordinary regression
Hi,
In my analysis of the impact of insecticide-treated bednets on malaria, I
am looking at the relationship between malaria incidence and mosquito
behaviors. The condensed data set is copied below. Ordinary regression
(lm) shows that Incidence is negatively related to Mortality. This
makes sense because Mortality reflects how strongly the
insecticide-treated nets kill mosquitoes. Since the original data set has
a complex structure with more parameters and scenarios, I thought a tree
model would help explore the structure of the data. However, the regression
tree (rpart(Incidence~Mortality+Deterrence)) indicates that Mortality
is positively related to Incidence.
How can this counterintuitive result arise? Any advice is appreciated.
Weidong Gu,
Department of Medicine
University of Alabama, Birmingham
Deterrence Mortality Incidence
0.695 0.51 66
0.255 0.501 48
0.612 0.483 55
0.209 0.158 47
0.499 0.589 53
0.755 0.285 73
0.764 0.351 77
0.749 0.211 64
0.101 0.336 45
0.556 0.066 72
0.576 0.403 45
0.232 0.667 35
0.424 0.891 34
0.432 0.458 54
0.197 0.269 59
0.188 0.523 40
0.291 0.864 32
0.504 0.791 36
0.387 0.138 66
0.71 0.676 56
0.235 0.183 59
0.358 0.579 41
0.718 0.57 49
0.775 0.254 46
0.269 0.633 42
0.443 0.741 40
0.28 0.438 49
0.385 0.778 37
0.539 0.653 37
0.73 0.094 84
0.489 0.611 40
0.595 0.431 39
0.305 0.003 69
0.511 0.595 37
0.394 0.798 37
0.369 0.541 47
0.414 0.552 51
0.468 0.858 34
0.311 0.201 59
0.142 0.36 43
0.514 0.195 46
0.365 0.325 48
0.608 0.224 67
0.177 0.04 62
0.475 0.146 65
0.526 0.702 46
0.735 0.372 43
0.172 0.66 36
0.622 0.531 53
0.651 0.055 76
0.223 0.296 54
0.783 0.566 52
0.439 0.698 34
0.527 0.493 41
0.766 0.89 49
0.634 0.749 42
0.24 0.732 35
0.792 0.764 36
0.268 0.823 34
0.418 0.407 53
0.251 0.241 54
0.705 0.843 40
0.546 0.474 55
0.685 0.384 62
0.582 0.086 72
0.63 0.618 57
0.131 0.028 56
0.555 0.803 41
0.463 0.299 57
0.154 0.164 55
0.406 0.074 66
0.168 0.118 58
0.597 0.323 47
0.672 0.816 42
0.698 0.623 48
0.676 0.177 43
0.743 0.109 81
0.121 0.244 49
0.799 0.014 85
0.45 0.645 36
0.484 0.448 52
0.585 0.307 68
0.348 0.417 43
0.345 0.459 44
0.374 0.835 30
0.657 0.134 65
0.331 0.022 67
0.141 0.045 66
0.568 0.1 67
0.11 0.876 30
0.212 0.39 46
0.298 0.519 40
0.322 0.721 44
0.201 0.77 35
0.641 0.855 39
0.156 0.277 48
0.327 0.714 40
0.663 0.231 44
0.119 0.688 37
0.287 0.354 46
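The comparison described in the question can be sketched as follows. This is
an illustration under stated assumptions: it uses only the first 20 rows of
the table above, lowers minsplit because that subset is small, and restricts
the tree to a single split on Mortality (maxdepth = 1) so that its direction
can be set against the sign of the lm coefficient. The names ols, stump,
cutpoint and side_means are made up for this example.

```r
library(rpart)

# First 20 rows of the table above (a subset, to keep the example short)
dat <- read.table(header = TRUE, text = "
Deterrence Mortality Incidence
0.695 0.51 66
0.255 0.501 48
0.612 0.483 55
0.209 0.158 47
0.499 0.589 53
0.755 0.285 73
0.764 0.351 77
0.749 0.211 64
0.101 0.336 45
0.556 0.066 72
0.576 0.403 45
0.232 0.667 35
0.424 0.891 34
0.432 0.458 54
0.197 0.269 59
0.188 0.523 40
0.291 0.864 32
0.504 0.791 36
0.387 0.138 66
0.71 0.676 56
")

# Ordinary regression: the sign of the Mortality coefficient gives the
# direction of the linear relationship, adjusted for Deterrence.
ols <- lm(Incidence ~ Mortality + Deterrence, data = dat)
print(coef(ols))

# A one-split tree on Mortality alone (minsplit and maxdepth are choices
# made for this illustration, not settings from the original post).
stump <- rpart(Incidence ~ Mortality, data = dat,
               control = rpart.control(minsplit = 10, maxdepth = 1))
cutpoint <- stump$splits[1, "index"]

# Mean Incidence on each side of the tree's cutpoint, computed directly
# from the data rather than read off the plot.
side_means <- tapply(dat$Incidence, dat$Mortality >= cutpoint, mean)
print(cutpoint)
print(side_means)
```

If the mean under TRUE (Mortality at or above the cutpoint) is the lower of
the two, the tree and the regression agree on the direction.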
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.