[R] Problems using rfImpute
James Reilly
reilly at stat.auckland.ac.nz
Mon May 5 15:31:21 CEST 2008
The values NA and "NA" are different. The first is treated as missing;
the second is not. For example,
> table(factor(c(NA,"0","1","NA","NA")))
 0  1 NA
 1  1  2
I suspect you have "NA" where you want NA, and this is causing your problem.
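
If that is the case, one way to repair it is to recode the "NA" level as a real missing value before imputing. The following is only a minimal sketch: the data frame df and its values are made up to stand in for your Subset5, and the helper fix_na_level is a hypothetical name, not part of randomForest.

```r
# Toy data standing in for Subset5 (hypothetical values, not the real data).
df <- data.frame(
  Sex = factor(c("0", "1", "1", "0")),
  InfSpath_caducuous = factor(c("0", "NA", "1", "0"))
)

# Count literal "NA" strings per column; is.na() does not see these.
sapply(df, function(x) sum(as.character(x) == "NA", na.rm = TRUE))

# Hypothetical helper: turn the "NA" level of a factor into a true NA.
fix_na_level <- function(x) {
  if (is.factor(x)) {
    x[which(x == "NA")] <- NA  # mark those entries as missing
    x <- droplevels(x)         # drop the now-unused "NA" level
  }
  x
}
df[] <- lapply(df, fix_na_level)

str(df)             # the factor should now have only levels "0" and "1"
colSums(is.na(df))  # number of true missing values per column

# With real NAs in place, the call from the original post can be rerun:
# library(randomForest)
# Subset5Imputed <- rfImpute(Sex ~ ., data = Subset5)
```

After this step the affected columns should show two levels rather than three in str(), and rfImpute has actual missing values to fill in.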
James
--
James Reilly
Department of Statistics, University of Auckland
Private Bag 92019, Auckland, New Zealand
On 6/5/08 1:04 AM, Birgit Lemcke wrote:
> Hello R-user!
>
> I am running R 2.7.0 on a PowerBook (Tiger). (I am still a beginner in
> R and statistics.)
>
> I tried rfImpute (randomForest), and as far as I understood it should
> replace NAs using a proximity matrix:
>
> > set.seed(100000)
> > Subset5Imputed<-rfImpute(Sex~., data=Subset5)
> ntree      OOB      1      2
>   300:  11.78% 12.36% 11.21%
> ntree      OOB      1      2
>   300:  12.07% 12.64% 11.49%
> ntree      OOB      1      2
>   300:  11.49% 11.21% 11.78%
> ntree      OOB      1      2
>   300:  12.50% 12.93% 12.07%
> ntree      OOB      1      2
>   300:  12.07% 12.36% 11.78%
> > str(Subset5Imputed)
>
> 'data.frame': 696 obs. of 24 variables:
>  $ Sex                        : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
>  $ InfSpath_caducuous         : Factor w/ 3 levels "0","1","NA": 1 1 1 1 1 1 1 1 1 1 ...
>  $ InfType_sparsely_paniculate: Factor w/ 3 levels "0","1","NA": 1 1 1 3 1 1 1 1 1 1 ...
>
> But there are still NAs in the data frame. Sorry if the reason is only
> my stupidity, and thanks in advance for answering.
>
> B.
>
>
> Birgit Lemcke
> Institut für Systematische Botanik
> Zollikerstrasse 107
> CH-8008 Zürich
> Switzerland
> Ph: +41 (0)44 634 8351
> birgit.lemcke at systbot.uzh.ch