[R] prcomp - surprising structure

Hermann Norpois hnorpois at gmail.com
Thu Oct 3 11:41:16 CEST 2013


Hello,

I did a pca with over 200000 snps for 340 observations (ids). If I plot the
eigenvectors (called rotation in prcomp) 2,3 and 4 (e.g. plot
(rotation[,2]) I see a strange "column" in my data (see attachment). I
suggest it is an artefact (but of what?).

Suggestion:
I used prcomp this way: prcomp (mat), where mat is a matrix with the column
means already substracted followed by a normalisation procedure (see below
for details). Is that okay? Or does prcomp repeat substraction steps?

Originally my approach was driven by the idea to compute a covariation
matrix followed by the use of eigen, but the covariation matrix was to huge
to handle. So I switched to prcomp.

As I guess that the "columns" in my plots reflect some artefact production
I hope to get some help. For the case that my use of prcomp was not okay,
could you please give me instructions how to use it - including with the
normalisation procedure that I need to include before doing a pca.

Thanks
Hermann

#
# mat: matrix with genotypes coded as 0,1 and 2 (columns); IDs
(observations) as rows.
#
prcomp.snp <- function (mat)
  {
    m <- ncol (mat)
    n <- nrow (mat)
    snp.namen <- colnames (mat)
    for (i in 1:m)
                   {
                     # snps in columns
                     ui <- mat[,i]
                     n <- length (which (!is.na(ui)))
                     # see methods Price et al. as correction
                     pi <- (1+ sum(ui, na.rm=TRUE))/(2+2*n)

                     # substract mean
                     ui <- ui - mean (ui, na.rm=TRUE)
                     # NAs set to zero
                     ui[is.na(ui)] <- 0
                     # normalisation of the genotype for each ID
important normalisation step
                     ui <- ui/ (sqrt (pi*(1-pi)))
                     # fill matrix with ui
                     mat[,i] <- ui
                   }
    mat <- prcomp (mat)
    return (mat)
   }
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rotplot.png
Type: image/png
Size: 17486 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20131003/04b3d9e5/attachment.png>


More information about the R-help mailing list