[R] Graphical Manova: Fails When There Are Three Factors
Bryan Hanson
hanson at depauw.edu
Fri Nov 16 16:29:44 CET 2007
Hi R Gurus & Lurkers... Thanks in advance to anyone who is willing to
tackle this! Bryan
I have been implementing the graphical manova method described in "An
Introduction to Ggobi" (from the Ggobi web site). A stand alone working
code is appended below. The code is almost the same as described in the
"Introduction document", with one bug fix. A quick summary of the code is
that it takes one's data, and fits an ellipsoid to it at a requested
confidence level, then color codes everything for display. If you don't
have ggobi installed, remove the ggobi from the last line and just use
>graphic_manova(response,class)
You will probably have to comment out the last 4 lines of the graphic_manova
function as well to avoid trivial errors.
Here's the R question: If the variable "class" has more than two levels
(factors) in it, the code executes but runs into an error because
cis = lapply(sub.groups, combined, cl=cl)
creates "cis" with a bunch of NA's, which then cause havoc when one tries to
do any matrix operations on it (not surprisingly). The NA's follow an
interesting pattern: the ellipsoid points are generated for the first two
dimensions (pc1 and pc2), but NA's are generated for the third dimension
(pc3). So cis contains the 3 original data dimensions, 1000 added ellipsoid
points to go with pc1, and 1000 added ellipsoid points to go with pc2, and
1000 NA's to go with pc3.... I don't see why the third set of data is any
different than the first two, and the first two execute correctly.
********
# generate sample data
pc1=rnorm(20, sd=1)
pc2=rnorm(20, mean = 10, sd=2)
pc3=rnorm(20, sd=3)
class=factor(c("group 1","group 1","group 1","group 2","group 2","group
2","group 2","group 2","group 2","group 2","group 1","group 1","group
1","group 1","group 2","group 2","group 2","group 2","group 1","group 2"),
ordered=TRUE)
response=cbind(pc1, pc2, pc3)
# Now generate confidence ellipsoids using the method described
# in "An Introduction to RGGOBI" with minor modifications
# Define 3 functions to do the heavy lifting
# First: a function that generates a random set of points on a sphere
# centered on the mean of the passed data, skewed to match the variance
# of the passed data (which turns the sphere into an ellipsoid),
# and adjusted in size to match the desired confidence level.
ellipse = function(data, npoints=1000, cl=0.95, mean=colMeans(data),
cov=var(data),n=nrow(data))
{
norm.vec = function(x) x/sqrt(sum(x^2))
p = length(mean)
ev = eigen(cov)
sphere = matrix(rnorm(npoints*p), ncol=p)
cntr = t(apply(sphere, 1, norm.vec)) # normalized sphere
cntr = cntr %*% diag(sqrt(ev$values)) %*% t(ev$vectors) # ellipsoid of
correct shape
Fcrit = qf(cl, p, n-p)
scalefactor = sqrt((p*(n-1))/(n*(n-p)))*Fcrit
cntr = cntr*scalefactor # ellipsoid of correct size
if (!missing(data)) # only relevant when no data passed
colnames(cntr) = colnames(data)
cntr+rep(mean, each=npoints)
}
# Next a function that combines the original data with the generated
ellipsoid
combined = function(data, cl=0.95)
{
dm = data.matrix(data)
ellipse = as.data.frame(ellipse(dm, npoints=1000, cl=cl))
both = rbind(data, ellipse)
both$SIM = factor(rep(c(FALSE,TRUE),c(nrow(data),1000)))
both
}
# Now a function to separate the dataset into categories
graphic_manova = function(data, catvar, cl=0.68)
{
sub.groups = data.frame(cbind(data,catvar))
sub.groups = split(sub.groups,catvar)
cis = lapply(sub.groups, combined, cl=cl)
df = as.data.frame(do.call(rbind, cis))
df$var = factor(rep(names(cis), sapply(cis, nrow)))
g = ggobi(df)
glyph_type(g[1]) = c(6,1)[df$SIM] # makes dots of ellipsoids tiny
glyph_color(g[1]) = df$var # properly colors the two groups
invisible(g)
}
# Now actually do the computations & plot the data!
# ggobi(combined(response)) # This is a debugging check point
ggobi(graphic_manova(response,class))
More information about the R-help
mailing list