[R] Graphical Manova: Fails When There Are Three Factors

Fri Nov 16 16:29:44 CET 2007

Hi R Gurus &  Lurkers...  Thanks in advance to anyone who is willing to
tackle this!  Bryan

I have been implementing the graphical manova method described in "An
Introduction to Ggobi" (from the Ggobi web site).  A stand alone working
code is appended below.  The code is almost the same as described in the
"Introduction document", with one bug fix.  A quick summary of the code is
that it takes one's data, and fits an ellipsoid to it at a requested
confidence level, then color codes everything for display.  If you don't
have ggobi installed, remove the ggobi from the last line and just use
>graphic_manova(response,class)
You will probably have to comment out the last 4 lines of the graphic_manova
function as well to avoid trivial errors.

Here's the  R question: If the variable "class" has more than two levels
(factors) in it, the code executes but runs into an error because

cis = lapply(sub.groups, combined, cl=cl)

creates "cis" with a bunch of NA's, which then cause havoc when one tries to
do any matrix operations on it (not surprisingly).  The NA's follow an
interesting pattern: the ellipsoid points are generated for the first two
dimensions (pc1 and pc2), but NA's are generated for the third dimension
(pc3).  So cis contains the 3 original data dimensions, 1000 added ellipsoid
points to go with pc1, and 1000 added ellipsoid points to go with pc2, and
1000 NA's to go with pc3.... I don't see why the third set of data is any
different than the first two, and the first two execute correctly.

********
# generate sample data

pc1=rnorm(20, sd=1)
pc2=rnorm(20, mean = 10, sd=2)
pc3=rnorm(20, sd=3)
class=factor(c("group 1","group 1","group 1","group 2","group 2","group
2","group 2","group 2","group 2","group 2","group 1","group 1","group
1","group 1","group 2","group 2","group 2","group 2","group 1","group 2"),
ordered=TRUE)
response=cbind(pc1, pc2, pc3)

# Now generate confidence ellipsoids using the method described
# in "An Introduction to RGGOBI" with minor modifications

# Define 3 functions to do the heavy lifting

# First: a function that generates a random set of points on a sphere
# centered on the mean of the passed data, skewed to match the variance
# of the passed data (which turns the sphere into an ellipsoid),
# and adjusted in size to match the desired confidence level.

ellipse = function(data, npoints=1000, cl=0.95, mean=colMeans(data),
cov=var(data),n=nrow(data))
    {
    norm.vec = function(x) x/sqrt(sum(x^2))
    p = length(mean)
    ev = eigen(cov)

    sphere = matrix(rnorm(npoints*p), ncol=p)
    cntr = t(apply(sphere, 1, norm.vec))  # normalized sphere

    cntr = cntr %*% diag(sqrt(ev$values)) %*% t(ev$vectors) # ellipsoid of
correct shape
    Fcrit = qf(cl, p, n-p)
    scalefactor = sqrt((p*(n-1))/(n*(n-p)))*Fcrit
    cntr = cntr*scalefactor # ellipsoid of correct size
    if (!missing(data)) # only relevant when no data passed
    colnames(cntr) = colnames(data)
    cntr+rep(mean, each=npoints)
    }

# Next a function that combines the original data with the generated
ellipsoid

combined = function(data, cl=0.95)
    {
    dm = data.matrix(data)
    ellipse = as.data.frame(ellipse(dm, npoints=1000, cl=cl))
    both = rbind(data, ellipse)
    both$SIM = factor(rep(c(FALSE,TRUE),c(nrow(data),1000)))
    both
    }

# Now a function to separate the dataset into categories

graphic_manova = function(data, catvar, cl=0.68)
    {
    sub.groups = data.frame(cbind(data,catvar))
    sub.groups = split(sub.groups,catvar)
    cis = lapply(sub.groups, combined, cl=cl)
    df = as.data.frame(do.call(rbind, cis))
    df$var = factor(rep(names(cis), sapply(cis, nrow)))
    g = ggobi(df)
    glyph_type(g[1]) = c(6,1)[df$SIM] # makes dots of ellipsoids tiny
    glyph_color(g[1]) = df$var # properly colors the two groups
    invisible(g)
    }

# Now actually do the computations & plot the data!

# ggobi(combined(response))  # This is a debugging check point

ggobi(graphic_manova(response,class))