# [BioC] plotting a CA

aedin culhane aedin at jimmy.harvard.edu
Fri Mar 9 18:49:51 CET 2012

```Hi Tim, Aoife and Susan

Sorry Tim, I didn't know that I said not to use made4. When did I say
this? I may have said I need to update some of the functions as I wrote
the made4 package many years ago.

Susan, made4 calls ade4 but is designed to convert microarray and other
Bioconductor data classes into formats that can be input into ade4. It
calls ade4 (and other) plot functions but with more sensible defaults
for genomics data (i.e. it doesn't label all of the objects!). When I
implemented the package, I did it with Guy and Jean, who wrote the paper
you cited, and I wholeheartedly agree with all you say ;-)

However Aoife, your code plot(ca(table,suprow=c(4,5))) can't be used for
what you want. It will plot rows 4 and 5 as supplementary points on
the plot. These points won't be used in the computation of the analysis,
and thus it would not provide what you want. Have a look at these plots:

### --------------------------------------------
##  From here, you can copy/paste everything to R
##------------------------------------------------

## Your data... I renamed it, as table is a function in R

codonData <- matrix(c(4, 7, 0.2, 3, 0.1, 7, 222, 3, 10, 5, 11, 8, 8, 10, 7),
    ncol=3,
    dimnames=list(c("gene1", "gene2", "gene3", "gene4", "gene5"),
                  c("codon1", "codon2", "codon3")))

library(ca)
codonCA<-ca(codonData)

## Draw 2 plots, one with results of analysis of all the data,
# the other as you described

par(mfrow=c(1,2))
plot(ca(codonData,suprow=c(4,5)))
plot(codonCA)

## You will notice that the 2 plots are very different:
## one analysis is a CA of all 5 rows, the other is a CA of rows 1 to 3
## only, with rows 4 and 5 projected onto it afterwards.
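
## A quick check of this, as a sketch only: if rows 4 and 5 really are left
## out of the computation, the singular values should match those from a
## CA of rows 1 to 3 alone. Uses codonData and library(ca) from above;
## supCA and sub3CA are names made up for this illustration.

supCA  <- ca(codonData, suprow=c(4,5))
sub3CA <- ca(codonData[1:3, ])
all.equal(supCA$sv, sub3CA$sv)
supCA$rowsup    ## the rows that ca treated as supplementary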

## To run a CA on a dataset using made4 or ade4, use the following code

## source("http://bioconductor.org/biocLite.R")
library(made4)   ## made4 also loads ade4

## example dataset
data(khan)
df<-khan$train

## The function ord will run PCA, CA or NSC,
## by default it runs CA (by calling dudi.coa from ade4)

myCA<- ord(df)
plot(myCA)
plotgenes(myCA)
plotarrays(myCA)
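
## A sketch of how to look inside the result. This assumes, as I read the
## made4 documentation, that ord() keeps the ade4 dudi object in myCA$ord,
## with genes in $li and arrays in $co.

head(round(myCA$ord$eig / sum(myCA$ord$eig), 3))   ## share of inertia per axis
head(myCA$ord$li)    ## gene coordinates
head(myCA$ord$co)    ## array coordinates

## colour the arrays by the class labels that come with the khan data
plotarrays(myCA, classvec=khan$train.classes)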

## the same kind of analysis on your data, calling ade4 directly
codonCA<-dudi.coa(codonData, scannf=FALSE)
scatter(codonCA)

## However neither of these will do exactly as you wish
## made4 expects groups in the column not the rows (genes x samples)

codonCA<-ord(t(codonData))

## Create a factor which lists the groups of "nodes" of interest
fac<-factor(c(rep("Node1",3), rep("Node2", 2)))
fac
plot(codonCA, classvec=fac)
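
## A minimal sketch for your question 1: build the grouping from a file
## rather than typing it in. The file is hypothetical; I assume it holds
## one row name per line. A temporary file stands in for it here, and
## nodeFile, nodes, inFile and facFromFile are names made up for this example.

nodeFile <- tempfile(fileext=".txt")
writeLines(c("gene4", "gene5"), nodeFile)

nodes  <- readLines(nodeFile)
inFile <- rownames(codonData) %in% nodes
facFromFile <- factor(ifelse(inFile, "Node2", "Node1"))
facFromFile
which(inFile)    ## row numbers, which could be passed as suprow= to ca()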

## but the function below will do what you need.

plotCA<-function(dudi, rowFac, cols, plotgroups=FALSE, plotrowLabels=FALSE,
                 pch=seq_along(levels(rowFac))+10, xax=1, yax=2, ...) {

    ## Map each factor level to a label (a colour or a plotting symbol)
    fac2char<-function(fac, newLabels) {
        if (length(levels(fac)) != length(newLabels))
            stop("Number of labels does not equal the number of factor levels")
        vec<-as.character(factor(fac, labels=newLabels))
        if (is.numeric(newLabels)) vec<-as.numeric(vec)
        return(vec)
    }

    ## Rows: either grouped with ellipses, or drawn as points or labels
    if (plotgroups) s.groups(dudi$li, rowFac, col=cols, xax=xax, yax=yax)
    if (!plotgroups) {
        pchs<-fac2char(rowFac, pch)
        cols<-fac2char(rowFac, cols)

        if (!plotrowLabels) s.var(dudi$li, boxes=FALSE, pch=pchs, col=cols,
                                  cpoint=2, clabel=0, xax=xax, yax=yax, ...)
        if (plotrowLabels) s.var(dudi$li, boxes=FALSE, col=cols, xax=xax,
                                 yax=yax, ...)
    }

    ## Columns: always added to the same plot in black
    s.var(dudi$co, boxes=FALSE, pch=19, col="black", add.plot=TRUE,
          xax=xax, yax=yax, ...)
}

##--------------------------------------------
## Examples: Function has 3 different options
##-------------------------------------------

codonCA<-dudi.coa(codonData, scannf=FALSE)

## Option 1. Plot a biplot (rows and columns) with points
## coloured by rowFac

plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"))

## Option 2. Same plot as above, but with labels rather than points

plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"),
plotrowLabels=TRUE)

## Option 3, Same plot but put a circle around the groups
## If you look at the help page for s.groups (in made4)
## which calls s.class (in ade4) you will see you can also
## change the size and other details about the
## ellipse (or circle drawn around the groups)

plotCA(codonCA, rowFac=fac, plotgroups=TRUE, cols=c("red", "blue"))
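
## A sketch of that last point, calling s.class from ade4 directly (an
## illustration only, not part of plotCA). cellipse scales the ellipse,
## cstar=0 drops the lines joining each point to its group centre, and
## axesell=FALSE hides the ellipse axes. The values are arbitrary.

s.class(codonCA$li, fac, col=c("red", "blue"), cellipse=2.5, cstar=0,
        axesell=FALSE)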

On Thu, Mar 8, 2012 at 9:20 AM, aoife doherty
<aoife.m.doherty at gmail.com> wrote:

> Many thanks. I tried this:
>
> table <- structure(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11,
>    8, 8, 10, 7), .Dim = c(5L, 3L), .Dimnames = list(c("gene1",
>    "gene2", "gene3", "gene4", "gene5"), c("codon1", "codon2",
>    "codon3")))
>
> library(ca)
>
> plot(ca(table,suprow=c(4,5)))
>
> This will give me a ca plot, where the nodes of interest 4,5 are open
> circles.
>
> However i have two questions.
>
> 1. Is it possible instead of manually typing in 4 and 5 to somehow get R to
> read in a list of nodes of interest. Basically is it possible to change:
>
> c(4,5) to c(all the nodes that are in a file)
>
> and
>
> 2. Is it possible instead of the individual nodes of interest being open
> circles, if the area encompassing all the nodes of interest could be
> differently/highlighted.
> i THINK this is where your suggestion of:
>
> using res=dudi.coa(data)
> then
> s.class(res$li,group)
> where group is your grouping variable you want to highlight.
>
> comes in, but i am completely new at R, i have genuinely tried to
> understand the packages from the manual, I am confused however.
>
> Aoife
>
>
>
>
>

--
Aedin Culhane
Computational Biology and Functional Genomics Laboratory
Harvard School of Public Health,
Dana-Farber Cancer Institute

web: http://www.hsph.harvard.edu/research/aedin-culhane/
email: aedin at jimmy.harvard.edu
phone: +1 617 632 2468
Fax: +1 617 582 7760