[R] pvclust: a general and a specific question

asl at tmail.lanl.gov asl at tmail.lanl.gov
Wed Jun 25 16:57:52 CEST 2008


I realize questions about packages should go to the package maintainer,
but perhaps I have an old email address (suzuki3 at is.titech.ac.jp).
Also, I have both a general and a specific question.

1) General question: I've used pvclust before to assess the significance
of clusters and got reasonable results. However, on a new data set (see
below) the results seem odd. Is pvclust a generally used package for
assessing cluster significance, or is another package/approach
considered standard? The "approximately unbiased" feature of pvclust
compared to regular bootstrapping seems attractive.

2) Specific question: the odd result I am getting concerns a tree with a
very clear division into two very distinct top-level clusters. However,
on this data set the subclusters with high confidence appear low down in
the tree, and the topmost division gets zero significance. I'm
suspicious of this given the rather clear top-level clade structure in
this data set, which has lots of examples and not many NAs, i.e. pretty
vanilla data. Also, on a related data set there seems to be a crash:
pvclust bootstraps and scales happily for a while, then prints:
Bootstrap (r = 1.29)... Done.
Bootstrap (r = 1.29)... Done.
Error in solve.default(crossprod(X, X/vv)) :
  Lapack routine dgesv: system is exactly singular
In addition: Warning message:
In lsfit(X, zz, 1/vv, intercept = FALSE) : 'X' matrix was collinear
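
One possible reading of the warning, sketched in base R only: the
weighted fit in the traceback needs at least two distinct bootstrap
scales, and if the usable scales all share one r (the log shows
r = 1.29 twice) the design matrix becomes collinear. The form of X
below (one column in sqrt(r), one in 1/sqrt(r)) and all numbers are
assumptions for illustration, not taken from the pvclust source or
from my data.

```r
# Hypothetical values: two usable bootstrap scales that share the same r.
r  <- c(1.29, 1.29)
zz <- c(-0.52, -0.61)   # made-up z-values, one per scale
vv <- c(0.010, 0.012)   # made-up variances of those z-values

# Assumed regressors: one column in sqrt(r), one in 1/sqrt(r).
X <- cbind(sqrt(r), 1 / sqrt(r))

# Identical r gives identical rows, so X has rank 1 instead of 2.
x_rank <- qr(X)$rank
print(x_rank)

# The weighted cross-product matrix from the traceback. With a rank-1 X
# it cannot be inverted reliably, so solve() is wrapped in tryCatch().
A <- crossprod(X, X / vv)
inv <- tryCatch(solve(A), error = function(e) conditionMessage(e))
print(inv)

# With two distinct scales the design matrix has full column rank.
r_ok <- c(1.0, 1.29)
X_ok <- cbind(sqrt(r_ok), 1 / sqrt(r_ok))
ok_rank <- qr(X_ok)$rank
print(ok_rank)
```

If this is what happens, the failure would say more about which scales
survive for a given cluster than about the data as a whole, but that is
a guess on my part.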

Thank you
Alan


