[R] LDA Predict - Seems to be predicting on the Training Data
Tony Plate
tplate at acm.org
Tue Oct 20 17:23:13 CEST 2009
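Most likely you're getting strange results because you're not supplying a data object to lda() when you build your fit. A formula written as train$c1 ~ train$v1 + train$v2 + train$v3 refers directly to the train object, so predict() cannot match its terms to columns of the new data and evaluates them on the 10 training rows instead.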
When I do it the "standard" way, predict.lda() uses the new data and produces a result of length 6 as expected:
> library(MASS)  # lda() and predict.lda() live in the MASS package
> myDat <- read.csv("clipboard", sep="\t")
> fit <- lda(c1 ~ v1 + v2 + v3, data=myDat[1:10,])
> predict(fit, myDat[11:16,])
$class
[1] c c c b c a
Levels: a b c
...
>
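For comparison, here is a minimal sketch that runs both versions side by side on the sample data quoted below. It assumes only that the MASS package is installed; the names badFit, goodFit, badForecast and goodForecast are made up for illustration and are not from the original code. The data frame is typed in by hand from the posted table (v1, v2, v3 and c1 only), rather than read from the original CSV file.

```r
library(MASS)  # provides lda() and its predict() method

# The 16 rows of sample data from the original post (Date and Names omitted).
myDat <- data.frame(
  v1 = c(0.714472361, 0.512158919, 0.470693282, 0.24236898,  0.785619735,
         0.718718387, 0.327331186, 0.632011743, 0.302804404, 0.545284813,
         0.563720418, 0.357614427, 0.154971203, 0.935080173, 0.363069339,
         0.862889297),
  v2 = c(0.902552278, 0.770451596, 0.129200065, 0.472219638, 0.628511593,
         0.697257275, 0.01715109,  0.599040196, 0.475166304, 0.967196462,
         0.024862018, 0.417490445, 0.425227967, 0.488659307, 0.334206603,
         0.821752532),
  v3 = c(0.783353694, 0.111853346, 0.800973877, 0.486599763, 0.106868172,
         0.690326648, 0.861421706, 0.320741634, 0.907143632, 0.945163717,
         0.970685281, 0.415162276, 0.856866993, 0.194967973, 0.639795596,
         0.549552875),
  c1 = factor(c("a", "a", "a", "b", "b", "b", "c", "c", "c", "a",
                "a", "a", "b", "a", "b", "a"))
)

train <- myDat[1:10, ]
# Subsetting by column name keeps the names v1, v2, v3 that the formula uses.
outOfSample <- myDat[11:16, c("v1", "v2", "v3")]

# Problem version: the terms are literally `train$v1` etc., so predict()
# finds them in the train object and ignores the columns of outOfSample.
badFit <- lda(train$c1 ~ train$v1 + train$v2 + train$v3)
badForecast <- predict(badFit, outOfSample)$class
length(badForecast)   # 10, the training set size reported in the original post

# Standard version: bare column names plus data=, so newdata is actually used.
goodFit <- lda(c1 ~ v1 + v2 + v3, data = train)
goodForecast <- predict(goodFit, newdata = outOfSample)$class
length(goodForecast)  # 6, one prediction per out-of-sample row
```

Note that the original code also rebuilt outOfSample with cbind(), which renames the columns to X1, X2, X3. Even with a data= fit, predict() would then fail to find v1, v2 and v3, so the new data should keep the same column names as the training data.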
-- Tony Plate
BostonR wrote:
> When I import a simple dataset, run LDA, and then try to use the model to
> forecast out of sample data, I get a forecast for the training set not the
> out of sample set. Others have posted this question, but I do not see the
> answers to their posts.
>
> Here is some sample data:
>
> Date Names v1 v2 v3 c1
> 1/31/2009 Name1 0.714472361 0.902552278 0.783353694 a
> 1/31/2009 Name2 0.512158919 0.770451596 0.111853346 a
> 1/31/2009 Name3 0.470693282 0.129200065 0.800973877 a
> 1/31/2009 Name4 0.24236898 0.472219638 0.486599763 b
> 1/31/2009 Name5 0.785619735 0.628511593 0.106868172 b
> 1/31/2009 Name6 0.718718387 0.697257275 0.690326648 b
> 1/31/2009 Name7 0.327331186 0.01715109 0.861421706 c
> 1/31/2009 Name8 0.632011743 0.599040196 0.320741634 c
> 1/31/2009 Name9 0.302804404 0.475166304 0.907143632 c
> 1/31/2009 Name10 0.545284813 0.967196462 0.945163717 a
> 1/31/2009 Name11 0.563720418 0.024862018 0.970685281 a
> 1/31/2009 Name12 0.357614427 0.417490445 0.415162276 a
> 1/31/2009 Name13 0.154971203 0.425227967 0.856866993 b
> 1/31/2009 Name14 0.935080173 0.488659307 0.194967973 a
> 1/31/2009 Name15 0.363069339 0.334206603 0.639795596 b
> 1/31/2009 Name16 0.862889297 0.821752532 0.549552875 a
>
> Attached is the code:
>
> myDat <-read.csv(file="f:\\Systematiq\\data\\TestData.csv",
> header=TRUE,sep=",")
> myData <- data.frame(myDat)
>
> length(myDat[,1])
>
> train <- myDat[1:10,]
> outOfSample <- myDat[11:16,]
> outOfSample <- (cbind(outOfSample$v1,outOfSample$v2,outOfSample$v3))
> outOfSample <-data.frame(outOfSample)
>
> length(train[,1])
> length(outOfSample[,1])
>
> fit <- lda(train$c1~train$v1+train$v2+train$v3)
>
> forecast <- predict(fit,outOfSample)$class
>
> length(forecast)  ##### I am expecting this to be the same as
> ##### length(outOfSample[,1]), which is 6
>
> Output:
>
> length(forecast)  ##### I am expecting this to be the same as
> ##### length(outOfSample[,1]), which is 6
> [1] 10