[R] conflicting results on NA in a qda predicted object:
Agustin Lobo
alobo at ija.csic.es
Thu Dec 20 12:25:32 CET 2001
Using unclass I'm still very confused:
> unique(mod23S.qda.pred$class)
[1] 12 17 8 10 4 9 5 13 14 19 20 15 6 3 7 1 23 11 18 21 16 2 22 NA
Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
> unique(unclass(mod23S.qda.pred$class))
[1] 12 17 8 10 4 9 5 13 14 19 20 15 6 3 7 1 23 11 18
[20] 21 16 2 22 262
I think the NA corresponds to the code 262, since there should be
only 23 classes.
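A possible way to see how a factor can show NA while is.na() reports FALSE is sketched below. It builds a small, hypothetical factor (cls, standing in for mod23S.qda.pred$class) that has only three levels but carries the integer code 262, and then locates the out-of-range code by working on the codes alone. The object and its values are made up for illustration. How a malformed factor is printed or converted to character differs between R versions, so the sketch avoids both.

```r
# Hypothetical stand-in for mod23S.qda.pred$class: three levels, but one
# integer code (262) lies outside 1..nlevels, as in the unclass() output above.
cls <- structure(c(2L, 1L, 262L, 3L),
                 levels = c("1", "2", "3"),
                 class = "factor")

# is.na() only tests whether the stored integer code is NA, so the
# out-of-range code is not flagged.
has_na <- any(is.na(cls))

# Work on the bare integer codes and compare them with the number of levels.
codes <- as.integer(cls)
bad   <- which(!is.na(codes) & (codes < 1L | codes > nlevels(cls)))

print(has_na)      # FALSE
print(bad)         # position(s) of codes with no matching level
print(codes[bad])  # the offending code(s)
```

Applied to the real object, the same comparison against nlevels() would point at the individual(s) whose code has no matching level, without relying on is.na().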
The data used in predict.qda seem correct (only cols X2 to X5
are used; col X1 holds the individual labels):
> summary(liss.seg.medias)
       X1              X2               X3               X4
 Min.   :    1   Min.   : 57.80   Min.   : 17.00   Min.   : 34.94
 1st Qu.: 6594   1st Qu.: 78.50   1st Qu.: 26.50   1st Qu.: 83.50
 Median :13188   Median : 89.72   Median : 33.40   Median : 91.43
 Mean   :13188   Mean   : 95.18   Mean   : 37.01   Mean   : 92.47
 3rd Qu.:19782   3rd Qu.:106.47   3rd Qu.: 44.50   3rd Qu.:100.50
 Max.   :26375   Max.   :245.29   Max.   :125.25   Max.   :156.82
       X5
 Min.   : 65.0
 1st Qu.:108.4
 Median :128.4
 Mean   :134.2
 3rd Qu.:155.7
 Max.   :254.3
Also, the qda object seems correct:
> str(mod23.qda)
List of 8
 $ prior  : Named num [1:23] 0.0842 0.0485 0.0357 0.0332 0.0357 ...
  ..- attr(*, "names")= chr [1:23] "1" "2" "3" "4" ...
 $ counts : Named int [1:23] 33 19 14 13 14 41 33 8 11 14 ...
  ..- attr(*, "names")= chr [1:23] "1" "2" "3" "4" ...
 $ means  : num [1:23, 1:4] 71.4 68.9 72.9 81.5 92.6 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:23] "1" "2" "3" "4" ...
  .. ..$ : chr [1:4] "lissb2" "lissb3" "lissb4" "lissb5"
 $ scaling: num [1:4, 1:4, 1:23] 0.463 0.000 0.000 0.000 1.149 ...
  ..- attr(*, "dimnames")=List of 3
  .. ..$ : chr [1:4] "lissb2" "lissb3" "lissb4" "lissb5"
  .. ..$ : chr [1:4] "1" "2" "3" "4"
  .. ..$ : chr [1:23] "1" "2" "3" "4" ...
 $ ldet   : num [1:23] 4.38 4.28 7.03 5.77 10.48 ...
 $ lev    : chr [1:23] "1" "2" "3" "4" ...
 $ N      : int 392
 $ call   : language qda.matrix(x = mod23[, 2:5], grouping = mod23[, 6])
 - attr(*, "class")= chr "qda"
Finally, I can identify the individual, but I don't think
it is an unusual one:
> b <- unclass(mod23S.qda.pred$class)
> b[b==262]
[1] 262
> liss.seg.medias[b==262,1]
[1] 11385
> liss.seg.medias[liss.seg.medias[,1]==11385,]
[1] 11385.0000 70.7619 22.8095 78.0476 90.6667
11385 is actually similar to its neighbors:
> liss.seg.medias[liss.seg.medias[,1]==11384,]
[1] 11384.0000 74.8462 24.8462 89.3077 97.0000
> liss.seg.medias[liss.seg.medias[,1]==11386,]
[1] 11386.0000 71.2857 22.4286 88.8571 95.9286
Why does predict.qda assign a non-existent class (262 or NA)
to individual 11385?
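This does not answer the question, but one way to narrow it down might be to look at the posterior probabilities that predict.qda returns alongside the class. The sketch below uses the iris data as a stand-in (an assumption, since the original data are not available) and assumes the MASS package is installed. It collects rows whose class code falls outside the levels and rows whose posterior contains non-finite values. The names fit, pred, bad and nonfinite are illustrative only.

```r
library(MASS)  # provides qda() and its predict method; assumed to be installed

# Stand-in model and predictions; iris replaces the original mod23 data.
fit  <- qda(Species ~ ., data = iris)
pred <- predict(fit, iris)

# Rows whose class code is NA or has no matching level.
codes <- as.integer(pred$class)
bad   <- which(is.na(codes) | codes < 1L | codes > nlevels(pred$class))

# Rows whose posterior probabilities are not all finite: a row sum is
# non-finite whenever any entry in that row is NA, NaN or Inf.
nonfinite <- which(!is.finite(rowSums(pred$posterior)))

# Inspect the posterior for any suspicious row (empty for the iris stand-in).
print(pred$posterior[union(bad, nonfinite), , drop = FALSE])
```

With the real prediction object, the posterior row for individual 11385 could be inspected the same way to see whether it holds ordinary probabilities or something degenerate.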
Thanks for the help and sorry for the length
of the message.
Agus
On Thu, 20 Dec 2001, Prof Brian Ripley wrote:
> This is a factor. You have to be careful with NAs in factors (and 1.4.0
> is different there as it happens).
>
> Nevertheless, there is no way to reproduce this from what you have given.
> Check that the class really is "factor", and then unclass it to see what
> the codes actually are. One or more of them should be NA from what you
> have given.
>
>
> On Thu, 20 Dec 2001, Agustin Lobo wrote:
>
> >
> > Dear list,
> >
> > (I've not upgraded to R1.4 yet)
> >
> > I have the following $class component in a predict.qda object:
> > > unique(mod23S.qda.pred$class)
> > [1] 12 17 8 10 4 9 5 13 14 19 20 15 6 3 7 1 23 11 18 21 16 2 22 NA
> > Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
> >
> > Nevertheless, when I try to identify the individual(s) with NA, I get:
> > > any(is.na(mod23S.qda.pred$class))
> > [1] FALSE
> >
> > and
> >
> > > mod23S.qda.pred$class[is.na(mod23S.qda.pred$class)]
> > factor(0)
> > Levels: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
> >
> > So, actually, is there a NA value in mod23S.qda.pred$class or not?
> >
> > (screening by eye is impossible:
> > length(mod23S.qda.pred$class) is 26375 )
> >
> > Agus
> >
> > Dr. Agustin Lobo
> > Instituto de Ciencias de la Tierra (CSIC)
> > Lluis Sole Sabaris s/n
> > 08028 Barcelona SPAIN
> > tel 34 93409 5410
> > fax 34 93411 0012
> > alobo at ija.csic.es
> >
> >
> >
> > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> > Send "info", "help", or "[un]subscribe"
> > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> >
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272860 (secr)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>
>