Ken
vicvoncastle at gmail.com
Mon Nov 28 15:05:19 CET 2011
R distance objects store only the lower triangle, so consider as.dist(). It requires the square matrix as input, which could be reconstructed (or perhaps you have it already). I do not know whether there is a biglm()-style alternative to princomp(), but consider using subsets of your data, because that plot, if created, is going to be very hectic.
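A rough sketch of the reconstruction, in case it helps. It assumes the file looks exactly like the four example rows quoted below (a name, then the distances up to and including the diagonal 0), and it writes those rows to a temporary file so that it runs on its own. The names path, rows, labs, m, d and xy are just placeholders. Note that cmdscale() (classical MDS on the distances) is my substitution for the princomp() call, not something from the original code, and the full square matrix is still built in memory before as.dist() shrinks it.

```r
# Sketch only: rebuild a "dist" object from a ragged lower-triangular CSV.
# The temporary file stands in for the real distance file (assumption).
path <- tempfile(fileext = ".csv")
writeLines(c("Name_1,0",
             "Name_2,0.8878,0",
             "Name_3,0.6777,0.7643,0",
             "Name_4,0.9844,0.1234,0.1414,0"), path)

# Rows have different lengths, so split each line by hand.
rows <- strsplit(readLines(path), ",", fixed = TRUE)
labs <- vapply(rows, `[`, character(1), 1)   # first field is the element name
n    <- length(rows)

# Fill the lower triangle of a square matrix, one row at a time.
m <- matrix(0, n, n, dimnames = list(labs, labs))
for (i in seq_len(n)) {
  vals <- as.numeric(rows[[i]][-1])          # d(i,1) .. d(i,i)
  m[i, seq_along(vals)] <- vals
}

d <- as.dist(m)   # as.dist() reads only the lower triangle of m
rm(m)             # the square matrix is no longer needed

# Swapped-in technique: classical MDS instead of princomp().
xy <- cmdscale(d, k = 2)
plot(xy, xlab = "Coordinate 1", ylab = "Coordinate 2", pch = 20)
text(xy, labels = rownames(xy), cex = 0.5, pos = 1)
```

With 10000 elements the square matrix alone takes roughly 800 MB as doubles, so trying this on a subset of rows first is probably wise.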
HTH
Ken Hutchison
On Nov 28, 2554 BE, at 5:55 AM, cm <mbnchakravarthy at gmail.com> wrote:
> Hi,
>
> I have a comma-separated file with element names in the first column, as
> shown below:
>
> Name_1,0
> Name_2,0.8878,0
> Name_3,0.6777,0.7643,0
> Name_4,0.9844,0.1234,0.1414,0
>
> The original data is a 10000x10000 symmetric matrix (600 MB). To reduce the
> file size, I have reduced the matrix to only its lower triangle. Is there a
> (memory) efficient way to 1) read the file, 2) compute the first and second
> principal components, and 3) plot the first vs. the second PC?
>
> In the past, I could do this by :
> # distance.csv is the complete data matrix, so this command worked
> b <- read.csv("distance.csv", header = FALSE)
> my_matrix <- data.matrix(b)
> pca2 <- princomp(my_matrix)
> plot(pca2$scores[, 1], pca2$scores[, 2])
> text(pca2$scores[, 1], pca2$scores[, 2], rownames(my_matrix), cex = 0.5, pos = 1)
>
> This time, I don't have the complete file, so I was wondering how to do
> this.
>
> Any help is much appreciated
>
> TIA
> M
>
