[BioC] pamr Error: each class must have >1 sample
Dick Beyer
dbeyer at u.washington.edu
Wed Jul 28 23:00:06 CEST 2004
Hi Kasper,
Thanks for pointing out my problem with pamr.train. On closer examination, my problem is slightly different from what I asked about earlier: it occurs in pamr.cv, not in pamr.train.
Every class has 3 samples, so pamr.train is ok, but not pamr.cv:
>table(z)
z
1 2 3 4 5 6 7 8
3 3 3 3 3 3 3 3
>my.data <- list(x=dendmat,y=factor(z))
>my.train <- pamr.train(my.data)
123456789101112131415161718192021222324252627282930
> my.cv <- pamr.cv(my.train, my.data)
Fold 1 :Error in nsc(x[, -folds[[ii]]], y = argy[-folds[[ii]]], x[, folds[[ii]], :
Error: each class must have >1 sample
Has anyone seen this in pamr.cv before?
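One reading of the traceback, offered as a guess and not a confirmed diagnosis: nsc is being refit on x[, -folds[[ii]]], so if a fold happens to hold two of a class's three samples, the training part is left with only one sample of that class. The base-R sketch below checks this for a given set of folds and builds folds that hold out exactly one sample per class. The helper min_train_count and the variable names are hypothetical and are not part of pamr.

```r
# Labels with the same layout as table(z) above: 8 classes, 3 samples each.
z <- rep(1:8, each = 3)

# Hypothetical helper (not a pamr function): for each fold, the smallest
# per-class sample count that remains once the fold is held out.
min_train_count <- function(y, folds) {
  sapply(folds, function(test_idx) {
    min(table(factor(y[-test_idx], levels = unique(y))))
  })
}

# A fold holding two samples of class 1 leaves a single class-1 sample.
bad_fold <- list(c(1, 2, 4))
bad_min <- min_train_count(z, bad_fold)

# Stratified folds: within each class, give every sample a distinct fold id,
# so each fold holds out exactly one sample per class.
set.seed(1)
fold_id <- ave(seq_along(z), z, FUN = function(i) sample(length(i)))
strat_folds <- split(seq_along(z), fold_id)
strat_min <- min_train_count(z, strat_folds)

# Assumption: later pamr releases document a `folds` argument to pamr.cv;
# whether version 1.21 accepts one should be checked with args(pamr.cv).
# my.cv <- pamr.cv(my.train, my.data, folds = strat_folds)
```

With three folds built this way, every training subset keeps two samples of each class, which is the minimum the error message asks for.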
I am using Windows:
base 1.9.1
utils 1.9.1
graphics 1.9.1
stats 1.9.1
methods 1.9.1
pamr 1.21
cluster 1.9.4
e1071 1.4-1
xtable 1.2-3
Thanks very much,
Dick
*******************************************************************************
Richard P. Beyer, Ph.D. University of Washington
Tel.:(206) 616 7378 Env. & Occ. Health Sci. , Box 354695
Fax: (206) 685 4696 4225 Roosevelt Way NE, # 100
Seattle, WA 98105-6099
http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html
*******************************************************************************
On Wed, 28 Jul 2004, Kasper Daniel Hansen wrote:
> Dick Beyer <dbeyer at u.washington.edu> writes:
>
> > I am having trouble with pamr.train and subsequently pamr.cv.
> >
> > In the pamr documentation, the following works:
> >
> > set.seed(120)
> > x <- matrix(rnorm(1000*20),ncol=20)
> > y <- sample(c(1:4),size=20,replace=TRUE)
> > mydata <- list(x=x,y=y)
> > mytrain <- pamr.train(mydata)
> > mycv <- pamr.cv(mytrain,mydata)
> >
> > But if you change the seed, it doesn't:
> >
> > set.seed(1123)
> > x <- matrix(rnorm(1000*20),ncol=20)
> > y <- sample(c(1:4),size=20,replace=TRUE)
> > mydata <- list(x=x,y=y)
> > mytrain <- pamr.train(mydata)
> > Error in nsc(data$x[gene.subset, sample.subset], y = y, proby = proby, :
> > Error: each class must have >1 sample
> >
> > There is discussion in the documents (http://www-stat.stanford.edu/~tibs/PAM/Rdist/doc/readme.html) about "fragile" functions, but I have not been able to understand how to make this error go away. If anyone has had this problem or has some advice, I would be eternally grateful.
>
> If you look at the y vector you will notice it looks like this:
> > table(y)
> y
> 1 2 3 4
> 1 6 5 8
>
> Hence there is only 1 sample with a class of "1". Of course this
> happens when you sample 20 times from a set of 4 values. From the error
> message it seems that the method requires at least two samples from
> every class.
>
> Possible solutions (quick solutions, I am not too familiar with pamr):
> - increase the size, so that a class with only one sample is very
> unlikely.
> - fit the data, disregarding the single sample and using only 3
> classes
>
> /Kasper
>
> --
> Kasper Daniel Hansen, Research Assistant
> Department of Biostatistics, University of Copenhagen
>
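The second suggestion in the quoted reply (drop the single-sample class and fit with three classes) can be sketched in base R as follows. The labels are written out explicitly to match the table(y) counts quoted above (1, 6, 5, 8), because sample() may not reproduce the 2004 draw under the same seed in current R versions. The names counts, ok_classes, keep and mydata3 are illustrative only.

```r
# Simulated expression matrix with the same shape as in the quoted example.
set.seed(1123)
x <- matrix(rnorm(1000 * 20), ncol = 20)

# Labels written out to match the quoted table(y): 1 6 5 8.
y <- rep(1:4, times = c(1, 6, 5, 8))

# Find classes with more than one sample and keep only their columns.
counts <- table(y)
ok_classes <- as.integer(names(counts)[counts > 1])
keep <- y %in% ok_classes

mydata3 <- list(x = x[, keep, drop = FALSE], y = y[keep])
table(mydata3$y)

# The reduced data could then be passed on, assuming pamr is installed:
# mytrain <- pamr.train(mydata3)
```

This only removes the cause of the pamr.train error; cross-validation can still leave a class with a single training sample if a fold holds out most of a small class.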