[R] Different labels for subsets of points in a PCA or RDA biplot
David Hewitt
dhewitt37 at gmail.com
Fri Feb 13 00:26:51 CET 2009
Thanks Gavin. Your advice and a little digging got me what I needed.
Sorry for not specifying the correct function RE: 'labels' -- I was
referring to text() as described in the ordiplot help page.
Anyway, here's a code example of what I wanted to do. Probably not the
most elegant solution, but it works.
##################
library(vegan)
junk <-
structure(list(IDer=structure(1:26, .Label=c("A", "B", "C",
"D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N",
"O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y",
"Z"), class="factor"), AA=c(1.2, 0.8, 2.1, 1.1, 1.2,
1.1, 1.6, 1.8, 1.5, 1, 2.3, 1.2, 1.5, 2.7, 1.2, 1.9,
0.7, 1.8, 1.4, 0.5, 3.3, 1.6, 2.2, 2.5, 1.1, 1.6),
BB=c(0.0031, 0.0058, 0.0281, 0.006, 0.0023, 0.0081,
0.011, 0.0185, 0.0029, 0.0019, 0.0059, 0.0101, 0.0064,
0.0254, 0.0304, 0.0115, 0.0032, 0.0211, 0.0222, 0.0016,
0.0471, 0.0088, 0.0267, 0.0707, 0.0043, 0.0044),
CC=c(368693L, 80008L, 33980L, 22446L, 277207L, 291783L,
66376L, 137225L, 25908L, 547213L, 172857L, 219773L,
10779L, 173601L, 324056L, 78370L, 295640L, 236235L,
146339L, 77554L, 8019L, 143727L, 253631L, 112779L,
9976L, 442107L),
DD=c(3.1, 1.1, 0.6, 0.3, 3.1, 4, 0.9, 1.8, 0.4, 5, 3.3,
2.5, 0.2, 3.4, 3.8, 1.1, 2.5, 3.4, 2.1, 0.4, 0.2, 2.3,
4.1, 2.1, 0.1, 3.6)),
.Names=c("IDer", "AA", "BB", "CC", "DD"),
class="data.frame", row.names=c(NA, -26L))
ord <- rda(junk[, 2:5], scale=TRUE)
# Grab the scores on the first two axes
items.scr <- scores(ord, display="sites", scaling=3)
vars.scr <- scores(ord, display="species", scaling=3)
# Associate the item scores with their labels
row.names(items.scr) <- junk$IDer
# Make a subset variable for 2 special groups of items
junk$awesome <- rep(0, 26)
awesomelist1 <- c("A", "J", "K", "L", "S", "U", "Z")
awesomelist2 <- c("B", "C", "V", "W", "X", "Y")
junk$awesome[junk$IDer %in% awesomelist1] <- 1
junk$awesome[junk$IDer %in% awesomelist2] <- 2
# Set x,y limits
xlims <- c(-1.7, 1.4)
ylims <- c(-0.4, 1.3)
# Plot without points
oldpar <- par(mar=c(4, 4, 1, 1))
biplot.rda(ord, scaling=3, display="species", type="text",
col="lightblue", xlim=xlims, ylim=ylims)
# Add labels for the 3 subsets
text(items.scr[junk$awesome==0, ],
labels=row.names(items.scr[junk$awesome==0, ]),
col="lightgray")
text(items.scr[junk$awesome==1, ],
labels=row.names(items.scr[junk$awesome==1, ]),
col="black")
text(items.scr[junk$awesome==2, ],
labels=row.names(items.scr[junk$awesome==2, ]),
col="darkblue", cex=1.3)
par(oldpar)
>> On Thu, 2009-02-12 at 09:42 -0800, David Hewitt wrote:
>> I've tried a few things both with prcomp(), and rda() and
>> its friends in vegan (including biplot.rda and ordiplot),
>> but can't find a solution. I'd like to associate subsets of
>> the points in a resulting biplot ("sites" in the rda object)
>> with different plotting colors/text styles to emphasize
>> certain sets of points. I can't figure out how to
>> keep the arrows (for "species" in the rda object) and
>> then pass sets of points and their labels separately.
>> A subproblem is finding out how to associate a vector
>> of text labels with the coordinates in the rda object.
>> I had no problem doing this with 'xlabs' in biplot.prcomp(),
>> but can't get 'labels' to recognize such in ordiplot functions.
>>
> On Thu, Feb 12, 2009 at 10:28 AM, Gavin Simpson
> <gavin.simpson at ucl.ac.uk> wrote:
> hi David,
>
> Currently most of vegan's plotting functions don't
> recognise vector arguments for colours/symbols etc.
> This is something I have been meaning to set aside
> time for but other work has taken precedence lately.
>
> Vegan does however provide you with all the tools to do
> this yourself. I'm about to head out of the office, but the
> general idea would be:
>
> ## continue example from ?biplot.rda
> ## produces 'mod'
> example(biplot.rda)
> ## grab the scores on axes you require;
> mod == your fitted rda/pca
> site.scr <- scores(mod, display = "sites", scaling = 3)
> spp.scr <- scores(mod, display = "species", scaling = 3)
>
> ## generate x,y lims
> xlims <- range(site.scr[,1], spp.scr[,1])
> ylims <- range(site.scr[,2], spp.scr[,2])
>
> ## plot what you want, but leave out sites
> biplot(mod, display = "species", xlim = xlims, ylim = ylims,
> scaling = 3)
>
> ## now add in scores as you want, to simulate we plot:
> ## rows 1:10 of dune in red, and
> ## rows 11:20 in blue.
> points(site.scr[1:10,], col = "red", pch = 2)
> points(site.scr[11:20,], col = "blue", pch = 3)
>
> The coordinates you are looking for are the things
> returned by scores() (with the relevant scaling applied).
> There is no argument 'labels' to ordiplot however, so
> perhaps you could provide an example of what you
> are doing as per the posting guide and I will try to help you.
More information about the R-help
mailing list