[R] Lda and Qda
pedrosmarques at portugalmail.pt
Fri Dec 28 00:14:26 CET 2007
Hi all,
I'm working with some data: 54 variables and a column of classes, where each observation belongs to one of seven possible classes:
> var.can3<-lda(x=dados[,c(1:28,30:54)],grouping=dados[,55],CV=TRUE)
Warning message:
In lda.default(x, grouping, ...) : variables are collinear
> summary(var.can3)
          Length Class  Mode
class      30000 factor numeric   ### why?? I don't understand it
posterior 210000 -none- numeric
call           4 -none- call      ## what's this?
> var.can<-lda(dados[,c(1:28,30:54)],dados[,55]) # column 29 is left out because that variable is constant
Warning message:
In lda.default(x, grouping, ...) : variables are collinear
> summary(var.can)
        Length Class  Mode
prior        7 -none- numeric
counts       7 -none- numeric
means      371 -none- numeric
scaling    318 -none- numeric
lev          7 -none- character
svd          6 -none- numeric
N            1 -none- numeric
call         3 -none- call
> (normalizar<-function(matriz){ n<-dim(matriz)[1]; m<-dim(matriz)[2]; normas<-sqrt(colSums(matriz*matriz)); matriz.normalizada<-matriz/t(matrix(rep(normas,n),m,n));return(matriz.normalizada)})
function(matriz){ n<-dim(matriz)[1]; m<-dim(matriz)[2]; normas<-sqrt(colSums(matriz*matriz)); matriz.normalizada<-matriz/t(matrix(rep(normas,n),m,n));return(matriz.normalizada)}
> var.canonicas<-as.matrix(dados[,c(1:28,30:54)])%*%(normalizar(var.can$scaling))
> summary(var.canonicas)
LD1 LD2 LD3 LD4
Min. :-21.942 Min. :-6.820 Min. :-10.138 Min. :-6.584
1st Qu.:-20.014 1st Qu.:-5.480 1st Qu.: -8.280 1st Qu.: 0.872
Median :-19.495 Median :-5.007 Median : -7.800 Median : 1.083
Mean :-18.827 Mean :-4.760 Mean : -7.803 Mean : 1.134
3rd Qu.:-18.975 3rd Qu.:-4.456 3rd Qu.: -7.278 3rd Qu.: 1.311
Max. : -7.886 Max. : 3.116 Max. : -1.619 Max. : 5.556
LD5 LD6
Min. :-11.083 Min. :-4.4972
1st Qu.: -1.237 1st Qu.:-1.6497
Median : -1.100 Median :-1.0909
Mean : -1.100 Mean :-0.9808
3rd Qu.: -0.957 3rd Qu.:-0.4598
Max. : 4.712 Max. : 7.5356
>
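The normalizar helper above divides each column of a matrix by its Euclidean norm. As a sketch only, the same operation can be written more compactly with sweep(). The name normalize_cols and the small test matrix are hypothetical, introduced just for illustration:

```r
# Hypothetical helper: scale every column of a numeric matrix to unit Euclidean norm.
normalize_cols <- function(m) {
  norms <- sqrt(colSums(m^2))   # one norm per column
  sweep(m, 2, norms, "/")       # divide column j by norms[j]
}

# Small made-up matrix, used only to check the behaviour.
m <- matrix(c(3, 4, 0, 5, 12, 0), nrow = 3)
unit <- normalize_cols(m)
colSums(unit^2)   # every column should now have squared norm 1
```

sweep() makes the "divide column j by norm j" intent explicit and avoids building the intermediate n-by-m matrix of repeated norms.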
I don't know whether I need to specify a training set and a testing set. I also don't know how to obtain the error rate or the classifier. Shouldn't the length of the class component of var.can3 be 7, since I only have 7 different classes?
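A small reproducible sketch may help frame the question. It uses the built-in iris data (150 observations, 3 classes) as a stand-in, which is an assumption and not the data above. With CV=TRUE, lda() from MASS returns leave-one-out predictions, so the class component has one entry per observation rather than one per class. This matches the lengths in the output above (30000 for class, and 210000 = 30000 x 7 for posterior).

```r
library(MASS)  # provides lda()

# Stand-in data: iris has 150 observations and 3 classes (not the data from the post).
fit_cv <- lda(x = iris[, 1:4], grouping = iris$Species, CV = TRUE)

# With CV = TRUE the result holds leave-one-out predictions:
# 'class' has one predicted label per observation,
# 'posterior' has one row per observation and one column per class.
length(fit_cv$class)     # 150, the number of observations
dim(fit_cv$posterior)    # 150 3

# Cross-validated confusion table and misclassification rate.
confusion <- table(observed = iris$Species, predicted = fit_cv$class)
cv_error  <- 1 - sum(diag(confusion)) / sum(confusion)
cv_error
```

Because the predictions are already cross-validated, this error estimate does not require a separate training/testing split. A fit without CV=TRUE is the one to pass to predict() for new data.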
Best regards,
Pedro Marques