[R] tree() producing NA's

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Feb 11 09:32:20 CET 2008


Take a look at the levels of 'owner'.
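
As a minimal sketch of what that inspection might look like (the data frame toy below is a made-up stand-in, since rr200 itself is not shown in this thread), base R can list a factor's level names and check whether any values are actually missing. The toy factor deliberately includes a level whose name is the string "NA", to show that such a level is distinct from a missing value; whether that is what happened in rr200 cannot be told from the post.

```r
# Hypothetical stand-in for rr200; all values here are invented for illustration.
toy <- data.frame(
  owner = factor(c("Cliveden Stud", "NA", "B E T Partnership",
                   "NA", "Cliveden Stud")),
  backprof = c(1.5, -1.0, 10.0, -1.0, 2.0)
)

levels(toy$owner)                  # level names, as character strings
nlevels(toy$owner)                 # number of distinct levels
anyNA(toy$owner)                   # TRUE only if some values are really missing
table(toy$owner, useNA = "ifany")  # counts per level, plus genuine NAs if any
```

With the real data, the corresponding calls would be levels(rr200$owner) and nlevels(rr200$owner).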

On Mon, 11 Feb 2008, Amnon Melzer wrote:

> Hi
>
> Hoping someone can help me (a newbie).
>
> I am trying to construct a tree using tree() in package tree. One of the
> fields is a factor field (owner), with many levels. In the resulting tree, I
> see many NA's (see below), yet in the actual data there are none.

You are misinterpreting this: those are level names.

Using a tree with a factor with many levels is a very bad idea: it takes a 
long time to compute (unless the response is binary) and almost surely 
overfits.
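
To illustrate the computational point with a little arithmetic: k levels can be divided into two non-empty groups in 2^(k-1) - 1 ways, so a search that examined every such grouping would face a rapidly growing number of candidates. The helper n_binary_splits below is a hypothetical name used only for this sketch; it counts groupings and says nothing about how tree() actually carries out its search.

```r
# Number of distinct ways to divide k factor levels into two non-empty groups.
# (Hypothetical helper for illustration; not part of package tree.)
n_binary_splits <- function(k) 2^(k - 1) - 1

k <- c(2, 5, 10, 20)
data.frame(levels = k, candidate_splits = n_binary_splits(k))
```

With only 200 observations spread over many owners, most levels will also have very few cases each, which is one reason such splits are prone to overfitting.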

>
>
>> rr200.tr <- tree(backprof ~ ., rr200)
>> rr200.tr
> 1) root 200 1826.00 -0.2332
> ...
> [snip]
> ...
>    5) owner: Cliveden Stud,NA,NA,NA,NA,NA,NA,NA,NA 10   14.25  1.5870 *
>  3) owner: B E T Partnership,Flaming Sambuca Syndicate,NA,NA,NA,NA,NA,NA,NA,NA 11  384.40 10.5900
>    6) decodds < 12 5   74.80  6.3000 *
>    7) decodds > 12 6  140.80 14.1700 *
>
> Can anyone tell me why this happens and what I can do about it?

Well, you could follow the request at the footer of this and every R-help 
message.

>
>
> Regards
>
> Amnon
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


