Gavin Simpson gavin.simpson at ucl.ac.uk
Fri Apr 27 15:50:30 CEST 2007

On Fri, 2007-04-27 at 12:58 +0100, Simon Pickett wrote:
> Hi all,
> I have been using princomp() recently, it's very useful indeed, but I have
> a question about how to specify the rows of data you want it to choose.
> I have a set of variables relating to bird characteristics and I have been
> using princomp to produce PC scores from these.
> However since I have multiple duplicate entries per individual (each bird
> had a varying number of chicks), I only want princomp to treat each
> individual bird as the sample and not include all the duplicates. Then I
> want to replicate the pc scores for all the duplicated rows for that
> individual.
> Any idea how to do this?

## example data using the vegan package
library(vegan)
data(varespec)
## duplicate some rows
vare2 <- varespec
vare2 <- rbind(vare2, varespec[sample(nrow(varespec), 50, replace = TRUE), ])
## build the model using prcomp on the original data without duplicates;
## prcomp uses the SVD, which ?prcomp describes as the preferred method
## for numerical accuracy. Note the argument is spelled 'center'.
mod <- prcomp(varespec, center = TRUE, scale. = TRUE)
## predict for full matrix inc duplicated rows
pred <- predict(mod, vare2)

Takes 0.005 seconds on my machine. So get a subset of your data without
the duplicates, then use the predict method for prcomp.
See ?predict.prcomp.
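
For data laid out like yours, the same idea might look roughly like the sketch below. I don't have your data, so everything in it is made up for illustration: the data frame `birds`, the ID column `bird`, and the trait columns `wing`, `mass` and `tarsus` are all assumptions. It uses only base R: duplicated() picks one row per bird, prcomp() is fitted to those rows, and predict() returns scores for every row.

```r
## simulated stand-in for the bird data; all names here are hypothetical
set.seed(42)
n_birds <- 20
traits <- data.frame(bird   = paste0("b", seq_len(n_birds)),
                     wing   = rnorm(n_birds, 70, 5),
                     mass   = rnorm(n_birds, 20, 2),
                     tarsus = rnorm(n_birds, 18, 1))
## one row per chick, so each bird appears between 1 and 5 times
reps  <- sample(1:5, n_birds, replace = TRUE)
birds <- traits[rep(seq_len(n_birds), times = reps), ]

vars <- c("wing", "mass", "tarsus")
## keep only the first row for each bird
uniq <- birds[!duplicated(birds$bird), ]
## fit the PCA on one row per individual
mod <- prcomp(uniq[, vars], center = TRUE, scale. = TRUE)
## scores for every row, duplicates included
scores <- predict(mod, birds[, vars])
## attach the scores (columns PC1, PC2, PC3) to the full data
birds <- cbind(birds, scores)
```

Because predict() applies the centring, scaling and rotation stored in `mod`, every duplicated row of a bird gets the same scores as that bird's single row in the fit.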

Is that what you wanted?


> Up to now I have been using princomp to only select the entries which are
> not duplicated which is easy, but the difficult bit is the programming to
> duplicate the pc scores across the entries for each individual.
> (I developed something that worked but it takes about 5 minutes to run!)
> Thanks for all your help,
> very much appreciated,
> Simon.
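
If you already have one row of scores per bird and only need to copy them onto the duplicated entries, indexing with match() avoids any looping. This is a minimal sketch with invented values: `ids` and `per_bird` are hypothetical names, and the scores are random numbers rather than output from a real PCA.

```r
## hypothetical bird IDs, one entry per chick
ids <- c("a", "b", "b", "c", "a", "c", "c")
## pretend per-bird scores, one row per individual, rows named by bird ID
set.seed(1)
per_bird <- matrix(rnorm(6), nrow = 3,
                   dimnames = list(c("a", "b", "c"), c("PC1", "PC2")))
## position of each entry's bird in the per-bird score matrix
idx <- match(ids, rownames(per_bird))
## repeat the score rows so there is one row per entry
expanded <- per_bird[idx, , drop = FALSE]
```

The expansion is a single vectorised subsetting step, so it should stay fast however many duplicates each bird has.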
