[R] Split column of concatenated data

Street N.R. N.R.STREET at soton.ac.uk
Wed Apr 25 16:06:54 CEST 2007


I have a column of concatenated information stored in an RG object from the limma package. I need to split this information and then paste the first two pieces of data in each case back into two columns of the RG object.

This is how I am currently doing this:


for (h in 1: length(gene.info.split)){

However, this is very slow and presumably 'messy'. The problem is that there is an inconsistent number of comma-separated entries in the original Name column, so I cannot do


because I get the error message:

Error in data.frame(c("OligoCy3", "SP Control poplar 48pin", "A24", "no length information" : 
        arguments imply differing number of rows: 4, 6, 5, 1

I also can't figure out how to usefully put the [list] data into a matrix (my ignorance, I am sure).

Ideally, I would be able to put each comma-separated item into its own column and then simply paste the first and second columns over the RG$genes$Name and RG$genes$ID columns respectively (and do away with the for loop).

Some entries in the original RG$genes$Name have only one piece of information (i.e. no commas), so I would need a way to fill any blanks with an NA value.
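
The behaviour described above could be sketched roughly as follows. This is only a minimal illustration in base R, not a tested solution for a real RG object: the vector `name` is a made-up stand-in for RG$genes$Name (the actual values are not shown here), and the variable names `parts`, `n.max`, `padded`, `first` and `second` are hypothetical. It relies on the fact that indexing past the end of a vector returns NA.

```r
# Hypothetical stand-in for RG$genes$Name; the real values are assumed,
# not taken from the post.
name <- c("id1,desc1,A24,extra", "id2,desc2", "control")

# Split each entry on commas. The result is a list of character vectors
# of differing lengths, which is why data.frame() cannot combine them.
parts <- strsplit(name, ",", fixed = TRUE)

# Pad every vector to the longest length. Indexing beyond the end of a
# vector yields NA, which supplies the fill value.
n.max <- max(sapply(parts, length))
padded <- do.call(rbind, lapply(parts, function(x) x[seq_len(n.max)]))

# If only the first two pieces are needed, they can be extracted directly
# without building the full matrix; missing pieces come back as NA.
first  <- sapply(parts, "[", 1)
second <- sapply(parts, "[", 2)
```

Here `padded` has one row per original entry and one column per comma-separated item. Under the assumptions above, `first` and `second` would be the vectors to assign to RG$genes$Name and RG$genes$ID, with no explicit for loop.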

If anyone can help me, it would be much appreciated.

Nat Street

Nathaniel Street
         University of Southampton
         Plants and Environment Lab
      School of Biological Sciences
   Basset Crescent East
 SO16 7PX
       tel: +44 (0) 2380 594268
  fax: +44 (0) 2380 594269
 n.r.street at soton.ac.uk

