[R] Lda and Qda

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Dec 28 08:59:13 CET 2007


?lda explains the object produced: please do study it.

Hint: you asked for leave-one-out cross-validation, and what is the output 
from cross-validation of a classifier?  The predicted class for each 
observation.  How many observations do you have?
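
A minimal sketch of that point, using the built-in iris data as a stand-in 
(an assumption for illustration; it is not the poster's data): with CV=TRUE, 
lda() returns one predicted class per observation and one row of posterior 
probabilities per observation, rather than a fitted model. For the poster's 
output this matches 30000 predicted classes and 30000 x 7 = 210000 posterior 
values.

```r
library(MASS)  # lda() is provided by the MASS package

# Stand-in data: iris has 150 observations, 4 predictors and 3 classes
fit_cv <- lda(x = iris[, 1:4], grouping = iris$Species, CV = TRUE)

# With CV = TRUE the result holds leave-one-out predictions, not a model
n_class  <- length(fit_cv$class)    # one predicted class per observation
post_dim <- dim(fit_cv$posterior)   # observations x classes

# Cross-validated confusion matrix and misclassification rate
confusion <- table(actual = iris$Species, predicted = fit_cv$class)
cv_error  <- mean(fit_cv$class != iris$Species)

print(n_class)
print(post_dim)
print(confusion)
print(cv_error)
```

Comparing the predicted classes with the known ones, as in the last two 
steps, gives a cross-validated error estimate without a separate test set.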

You are using software from a contributed package without credit, and that 
software is support for a book (see library(help=MASS) and the help page). 
Please consult the book for the background.

On Thu, 27 Dec 2007, pedrosmarques at portugalmail.pt wrote:

>
>
> Hi all,
>
> I'm working with some data: 54 variables and a column of classes; each 
> observation has one of seven possible classes:
>
>> var.can3<-lda(x=dados[,c(1:28,30:54)],grouping=dados[,55],CV=TRUE)
> Warning message:
> In lda.default(x, grouping, ...) : variables are collinear
>> summary(var.can3)
>          Length Class  Mode
> class      30000 factor numeric   ### why?? I don't understand it
> posterior 210000 -none- numeric
> call           4 -none- call    ## what's this?
>
>
>> var.can<-lda(dados[,c(1:28,30:54)],dados[,55]) # because variable 29 is constant
> Warning message:
> In lda.default(x, grouping, ...) : variables are collinear
>> summary(var.can)
>        Length Class  Mode
> prior     7    -none- numeric
> counts    7    -none- numeric
> means   371    -none- numeric
> scaling 318    -none- numeric
> lev       7    -none- character
> svd       6    -none- numeric
> N         1    -none- numeric
> call      3    -none- call
>> (normalizar<-function(matriz){ n<-dim(matriz)[1]; m<-dim(matriz)[2]; normas<-sqrt(colSums(matriz*matriz)); matriz.normalizada<-matriz/t(matrix(rep(normas,n),m,n));return(matriz.normalizada)})
> function(matriz){ n<-dim(matriz)[1]; m<-dim(matriz)[2]; normas<-sqrt(colSums(matriz*matriz)); matriz.normalizada<-matriz/t(matrix(rep(normas,n),m,n));return(matriz.normalizada)}
>> var.canonicas<-as.matrix(dados[,c(1:28,30:54)])%*%(normalizar(var.can$scaling))
>> summary(var.canonicas)
>      LD1               LD2              LD3               LD4
> Min.   :-21.942   Min.   :-6.820   Min.   :-10.138   Min.   :-6.584
> 1st Qu.:-20.014   1st Qu.:-5.480   1st Qu.: -8.280   1st Qu.: 0.872
> Median :-19.495   Median :-5.007   Median : -7.800   Median : 1.083
> Mean   :-18.827   Mean   :-4.760   Mean   : -7.803   Mean   : 1.134
> 3rd Qu.:-18.975   3rd Qu.:-4.456   3rd Qu.: -7.278   3rd Qu.: 1.311
> Max.   : -7.886   Max.   : 3.116   Max.   : -1.619   Max.   : 5.556
>      LD5               LD6
> Min.   :-11.083   Min.   :-4.4972
> 1st Qu.: -1.237   1st Qu.:-1.6497
> Median : -1.100   Median :-1.0909
> Mean   : -1.100   Mean   :-0.9808
> 3rd Qu.: -0.957   3rd Qu.:-0.4598
> Max.   :  4.712   Max.   : 7.5356
>>
>
>
> I don't know whether I need to specify a training set and a testing set. 
> I also don't know the error or the classifier; shouldn't the length of 
> class in var.can3 be 7, since I only have 7 different classes?
>
> Best regards,
>
> Pedro Marques
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

