[R] extracting rows and columns from a big matrix

arun smartpink111 at yahoo.com
Mon Jul 16 18:08:15 CEST 2012


Hi,

If you think that R may be unable to extract a submatrix larger than 300 x 300, try this:
m <- matrix(numeric(350*2000), ncol=2000)
colnames(m) <- paste("X", 1:2000, sep="")
rownames(m) <- paste("X", 1:350, sep="")
m[c("X6","X20","X151","X180"), c("X25","X150","X1500","X1750")]
     X25 X150 X1500 X1750
X6     0    0     0     0
X20    0    0     0     0
X151   0    0     0     0
X180   0    0     0     0



So I guess there may be a problem with your dataset rather than a limit in R.
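
A quick way to test that guess is to compare the requested labels with the names R actually holds. The sketch below is only an illustration on a toy matrix: `wanted`, `missing_cols` and `found` are made-up names, and "X9999" is a deliberately absent label.

```r
# Toy matrix standing in for the real data (all names are illustrative)
m <- matrix(0, nrow = 350, ncol = 2000,
            dimnames = list(paste0("X", 1:350), paste0("X", 1:2000)))

# Labels we would like to extract; "X9999" does not exist in m
wanted <- c("X25", "X150", "X1500", "X9999")

# Labels that were requested but are not among the column names
missing_cols <- setdiff(wanted, colnames(m))
print(missing_cols)

# Subset using only the labels that are really present
found <- intersect(wanted, colnames(m))
sub <- m[, found, drop = FALSE]
print(dim(sub))
```

If `missing_cols` is not empty for the real data, the labels probably differ from what R stored. By default, read.table with header=TRUE passes column names through make.names, which is why numeric labels show up with a leading "X".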

A.K.





________________________________
From: A J <anxusgo at hotmail.com>
To: smartpink111 at yahoo.com 
Sent: Monday, July 16, 2012 10:37 AM
Subject: RE: [R] extracting rows and columns from a big matrix



Hello again!


Sorry for the inconvenience, and thanks to everybody trying to help me. These are the steps I followed:


1) Open the .csv file containing the large matrix (15000 rows x 15000 columns) using read.table.
2) If I try to subset all the columns and/or rows that I need, just 1788 of them (giving a new square submatrix), R does not allow it, and the console ends with the "+" sign.
3) To check that there are no mistakes, I copied the labels from the .csv file and compared them with the original database containing all the data. There are no mistakes.
4) After viewing all the data in Excel, I decided to split the columns I need to subset into several parts. After several tests, I found that R works if the number of columns I request is no higher than about 300 (maybe a little higher, but I don't want to spend time running so many tests).
5) I think the best solution may be to divide the data into submatrices of around 300 rows x 300 columns and then join them using, for instance, the "merge" function to get the final 1788 x 1788 square submatrix.
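
One possible explanation for the "+" prompt in step 2, offered as an assumption rather than a diagnosis: the R console limits the length of an input line (An Introduction to R gives about 4095 bytes), so a pasted call containing 1788 quoted labels may be cut off and left unclosed. A minimal sketch that avoids typing the labels at all by reading them from a file follows; the toy matrix, `label_file` and `wanted` are hypothetical stand-ins for the real data.

```r
# Toy symmetric matrix standing in for the 15000 x 15000 data
labs <- paste0("X", 1:500)
m <- matrix(0, nrow = 500, ncol = 500, dimnames = list(labs, labs))

# Hypothetical label file with one label per line; in practice this
# would be the list of labels exported from the database
label_file <- tempfile(fileext = ".txt")
writeLines(paste0("X", seq(2, 400, by = 2)), label_file)

# Read the labels from the file instead of pasting them into the console
wanted <- readLines(label_file)
wanted <- gsub('"', "", trimws(wanted))  # drop stray quotes and whitespace

# Stop early if any label is not present in the matrix
stopifnot(all(wanted %in% rownames(m)), all(wanted %in% colnames(m)))

# Same labels for rows and columns gives the square submatrix
sub <- m[wanted, wanted]
print(dim(sub))
```

With this approach there should be no need to split the request into 300 x 300 blocks. If blocks were needed anyway, cbind and rbind, rather than merge, would be the usual way to reassemble a matrix.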


I find all of this really strange, but after several tests with different methods I can't find a good, consistent explanation. Perhaps label length has something to do with it, but I am not sure. I will report the results.


Greetings and thanks again


AJ


> Date: Mon, 16 Jul 2012 07:05:44 -0700
> From: smartpink111 at yahoo.com
> Subject: Re: [R] extracting rows and columns from a big matrix
> To: anxusgo at hotmail.com
> 
> Hello AJ,
> 
> If I understand your email, there is no problem subsetting n (say 300 or 400) columns from the first and second split parts (447 columns each).  Try saving the third and fourth parts using write.csv and opening them in Excel to check for anomalies.  How did you split the files?  Was it after reading them into R?
> 
> 
> A.K.
> 
> 
> 
> 
> ________________________________
> From: A J <anxusgo at hotmail.com>
> To: smartpink111 at yahoo.com 
> Cc: r-help at r-project.org 
> Sent: Monday, July 16, 2012 9:10 AM
> Subject: RE: [R] extracting rows and columns from a big matrix
> 
> 
> 
> Yes, I have tried it and this works.
> 
> 
> Indeed, if I use a small number of columns, all the methods proposed here work. Following the previous mail, I split the columns into 4 parts of 447 columns each. The first and second parts work well, but the third and fourth do not. I am convinced it's not a problem with quotes, because I tried removing them, and again the code for the first and second parts worked well. Now I have copied all labels directly from the original matrix in the txt file to avoid any other mistakes. I will let you know about the enigmatic problem when I find it (I hope so...).
> 
> 
> Thanks for your comments and help.
> 
> 
> AJ
> 
> 
> 
> 
> > Date: Mon, 16 Jul 2012 05:46:46 -0700
> > From: smartpink111 at yahoo.com
> > Subject: Re: [R] extracting rows and columns from a big matrix
> > To: anxusgo at hotmail.com
> > CC: r-help at r-project.org
> > 
> > Hello,
> > 
> > Have you tried subsetting a smaller number of columns (say 5 or 6) from the 2000-column dataset?  If that does not work, then there might be a problem in reading the dataset.
> > 
> > A.K.
> > 
> > 
> > 
> > 
> > ________________________________
> > From: A J <anxusgo at hotmail.com>
> > To: smartpink111 at yahoo.com 
> > Cc: r-help at r-project.org 
> > Sent: Monday, July 16, 2012 6:49 AM
> > Subject: RE: [R] extracting rows and columns from a big matrix
> > 
> > 
> > 
> > Thank you very much to everybody for your fast responses.
> > 
> > 
> > All your solutions work well, but I still have the same problem. When I use any of your proposals with a small set of columns (and/or rows), it works, but when I use the whole set of around 2000 columns (and/or rows), the system does not return the specified submatrix, and the prompt sign ">" is replaced by "+" at the end of the console. Could this be due to a limitation in subsetting matrices?
> > 
> > 
> > This is an example of working code that uses only columns:
> > 
> > 
> > m<-read.table("C:/backup/Rfiles/sym_matrix_cos.csv", header=TRUE)
> > 
> > 
> > o<-as.matrix(m[(select=c("X12002", "X12027", "X12054", "X12084", "X12085", "X12115", "X12129", "X12139", "X12195", "X12223", "X12295", "X12327", "X12356", "X12474", "X12487", "X12491", "X12520", "X12570", "X12600", "X12616", "X12626", "X12629", "X12634", "X12669", "X12685", "X12748", "X12759", "X12766", "X12789", "X12793", "X12814", "X12824", "X12892", "X12897", "X12909", "X12932", "X12959", "X12995", "X13018", "X13039", "X13134", "X13138", "X13162", "X13173", "X13236", "X13243", "X13351", "X13410", "X13452", "X13474", "X13475", "X13486", "X13518", "X13574", "X13586", "X13588"))])
> > 
> > >
> > 
> > 
> > However, when I use the same code with the full set of columns (around 2000), it does not work.
> > 
> > 
> > I have checked all the labels several times to avoid mistakes. To that end, I copied and pasted all labels from a database into a spreadsheet, where I added the quotes by dragging them from the first cell to the last one (so as not to miss any). I really have no idea why R allows this code with 56 columns (as in the example above) but not with around 2000 columns. If you have any suggestions, please let me know.
> > 
> > 
> > Thanks to everybody again.
> > 
> > 
> > Best,
> > 
> > 
> > AJ
> > 
> > 
> > 
> > > Date: Sun, 15 Jul 2012 19:09:05 -0700
> > > From: smartpink111 at yahoo.com
> > > Subject: Re: [R] extracting rows and columns from a big matrix
> > > To: anxusgo at hotmail.com
> > > CC: r-help at r-project.org
> > > 
> > > Hello,
> > > 
> > > In my previous email, I used indexing to subset the data.  Then I looked at your code.  I guess you wanted to use the "subset" function to get the same output.
> > > 
> > > Try this:
> > > dat1<-read.table(text="
> > >   X1 X7 X12 X15 X22 X26 X31 X34 X39 X44 X51
> > > X1  1  2   3   4  5  6  7  8  9 10  11
> > > X7  11  9  7  5   3  1 10 8 6  4  2
> > > X12 3  4  7  8  5   7  2  9  1  3  2
> > > X15 9  9  8  4  7  1   1  3  2  5  3
> > > X22 6  7  7  4  4  2  9  8  8  1  1
> > > X26 3  9  4  8  5  7  6  1  2  3  8
> > > X31 1  2  1  3  1  4  1  5  1  6  1
> > > X34 6  7  8  5  2  9  5  1  6  8  9
> > > X39 4  8  7  4  6  5  1  9  2  7  5
> > > X44 2  2  2  8  6  7  9  5  3  7  7
> > > X51 9  9  9  6  6  4  8  7  2  1  3
> > > ",sep="", header=TRUE)
> > > 
> > > subset(dat1,subset=row.names(dat1)%in% c("X1","X12","X22","X31"),select=c("X1","X12","X22","X31"))
> > >     X1 X12 X22 X31
> > > X1   1   3   5   7
> > > X12  3   7   5   2
> > > X22  6   7   4   9
> > > X31  1   1   1   1
> > > 
> > > A.K.
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > ----- Original Message -----
> > > From: A J <anxusgo at hotmail.com>
> > > To: jholtman at gmail.com
> > > Cc: r-help at r-project.org
> > > Sent: Sunday, July 15, 2012 3:43 PM
> > > Subject: Re: [R] extracting rows and columns from a big matrix
> > > 
> > > 
> > > So sorry for the mistakes.
> > > 
> > > It was example code, and I made some mistakes typing it. But assuming the original code is right (I have checked it several times), I am not sure how to solve the problem of extracting columns and rows by label from a square matrix. I have attached a text file illustrating the idea so that it can be understood better.
> > > 
> > > Thanks again, and sorry for the inconvenience.
> > > 
> > > Best,
> > > 
> > > AJ
> > > 
> > > 
> > > 
> > > > Date: Sun, 15 Jul 2012 14:53:47 -0400
> > > > Subject: Re: [R] extracting rows and columns from a big matrix
> > > > From: jholtman at gmail.com
> > > > To: anxusgo at hotmail.com
> > > > CC: r-help at r-project.org
> > > > 
> > > > For a start, you are missing a quote and a parenthesis in the
> > > > statement; it should probably be (another quote was also missing):
> > > > 
> > > > n<-subset(m, select=c("X1", "X7", "X12","X15", "X22", "X26", "X31",
> > > > "X34", "X39", "X44", "X51", "X58"))
> > > > 
> > > > Not sure what you want with the rownames; an example would help and
> > > > post with 'dput'.
> > > > 
> > > > On Sun, Jul 15, 2012 at 2:47 PM, A J <anxusgo at hotmail.com> wrote:
> > > > >
> > > > > Hi there and thanks in advance.
> > > > >
> > > > > > I have a large symmetrical matrix stored in a text file. After loading it into R, I would like to extract the same set of columns and rows (a symmetrical submatrix) using their labels.
> > > > >
> > > > > > I have tried this code to extract columns, but the R console shows the "+" sign at the end of the code, indicating an incomplete command, so it is not working:
> > > > >
> > > > > m<-read.table("C:/backup/symmetrical.csv")
> > > > >
> > > > > n<-subset(m, select=c("X1", "X7", "X12", X15", "X22", "X26", "X31", "X34", "X39", "X44", "x51", "X58)
> > > > >
> > > > > > Therefore, I have not tried with row names yet.
> > > > >
> > > > > > Any suggestions? Sorry for the inconvenience. I have read some information about this, but I always have the same problem with "+" and I have no idea how to proceed.
> > > > >
> > > > > Best,
> > > > >
> > > > > AJ
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >         [[alternative HTML version deleted]]
> > > > >
> > > > > ______________________________________________
> > > > > R-help at r-project.org mailing list
> > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > > > and provide commented, minimal, self-contained, reproducible code.
> > > > 
> > > > 
> > > > 
> > > > -- 
> > > > Jim Holtman
> > > > Data Munger Guru
> > > > 
> > > > What is the problem that you are trying to solve?
> > > > Tell me what you want to do, not how you want to do it.
> > >                           
> > >


