[BioC] Limma toptable output using write.table and column names

Wed Feb 9 20:23:42 CET 2005

Excellent. Thanks for your help.

What I did was:

>tt <- data.frame(tt,row.names=tt$ID) #make row names probeset IDs
>tt$ID <- NULL #to get rid of the ID column (since it is now redundant)
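
For anyone who wants to try this step without a limma fit at hand, here is a minimal sketch on a made-up table. The name tt_demo is hypothetical, and the three rows are simply copied from the output in this thread as a stand-in for a real toptable; only base R is assumed.

```r
# Hypothetical stand-in for a limma toptable (values copied from this thread)
tt_demo <- data.frame(
  ID = c("1007_s_at", "1053_at", "117_at"),
  M  = c(-0.002879009, -0.053423214, -0.038235209),
  A  = c(8.776694, 3.417325, 3.100678),
  stringsAsFactors = FALSE
)

# Rebuild the data frame with the probeset IDs as row names
tt_demo <- data.frame(tt_demo, row.names = tt_demo$ID)

# Drop the ID column, which is now redundant
tt_demo$ID <- NULL

print(tt_demo)
```

After this, the probeset IDs are the row labels and only the numeric columns remain.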

Writing this to a text file and reading it back into R then worked as intended:

>write.table(tt, file="tt", row.names = TRUE, col.names = TRUE, sep ="\t")
>tp <- read.table(file="tt",header=TRUE)
>tp[1:3,]
                     M        A           t   P.Value         B
1007_s_at -0.002879009 8.776694 -0.09459093 0.9417878 -6.721547
1053_at   -0.053423214 3.417325 -1.60706334 0.3285045 -5.499340
117_at    -0.038235209 3.100678 -1.42248721 0.3308744 -5.724391
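
Julia's suggestion (quoted below) should also work without changing the data frame first. The following is only a sketch under assumptions: tt_demo, tp_demo and out_file are hypothetical names, the table is a made-up stand-in for a real toptable, and the file is written with row.names = FALSE so that the ID column is the first field in the file.

```r
# Hypothetical stand-in for a toptable that still has its ID column
tt_demo <- data.frame(
  ID = c("1007_s_at", "1053_at", "117_at"),
  M  = c(-0.002879009, -0.053423214, -0.038235209),
  A  = c(8.776694, 3.417325, 3.100678),
  stringsAsFactors = FALSE
)

out_file <- tempfile(fileext = ".txt")  # temporary file, the name is arbitrary

# Write without R's own row numbers, so ID is the first column in the file
write.table(tt_demo, file = out_file, sep = "\t",
            row.names = FALSE, col.names = TRUE, quote = FALSE)

# Read back, using the first column of the file as the row names
tp_demo <- read.table(out_file, header = TRUE, sep = "\t", row.names = 1)

print(tp_demo)
```

The difference from what I did above is only where the row names get set: here it happens when the file is read, so the file itself keeps a named ID column and opens cleanly in a spreadsheet.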

>From: Julia Engelmann <julia.engelmann at biozentrum.uni-wuerzburg.de>
>To: Ken Termiso <jerk_alert at hotmail.com>
>CC: bioconductor at stat.math.ethz.ch
>Subject: Re: [BioC] Limma toptable output using write.table and column 
>names
>Date: Wed, 09 Feb 2005 10:22:36 +0100
>
>Hi Ken,
>
>
>Ken Termiso wrote:
>>I apologize in advance if this is confusing...
>>
>>When I use write.exprs (which, as I understand it, makes a call to 
>>write.table) to write expression data to a text file, the output text file 
>>has one less column name (the probe ID column does not get a name), and 
>>the other column names are shifted all the way to the left margin in the 
>>text file. When this text file is read into R using the command 
>>read.table(file="exprs.txt",header=TRUE), R converts the file into a data 
>>frame, and correctly displays the row labels as probeset IDs.
>>
>>(the spacing may be a little off here, depending on the display font, but 
>>here you can see that the probeset name is the row label)
>>          6187.CEL 6188.CEL 6189.CEL 6190.CEL 6191.CEL 6192.CEL
>>1007_s_at 8.779289 8.732751 8.822360 8.743272 8.768605 8.813886
>>1053_at   3.508310 3.389342 3.434458 3.410836 3.373940 3.387063
>>117_at    3.139897 3.105285 3.114203 3.131865 3.073855 3.038960
>>
>>
>>However, with the limma toptables, each column has a name, including the 
>>probeset column ("ID"). When I write a toptable to a textfile, and then 
>>read it back into R, R thinks that the probeset IDs are a column of data 
>>(since it is labelled with "ID"), and then adds row numbers to this data 
>>frame. This makes it difficult to do other operations (at least in my 
>>novice hands!!)
>>
>
>When you read the toptable-textfile back into R, try setting the 
>row.names-option of read.table:
>read.table("file.txt", header=TRUE, row.names=1, ...)
>will use the first column of your textfile as rownames.
>
>Hope that helps,
>Julia
>
>>>tt[1:3,]
>>
>>         ID            M        A           t   P.Value         B
>>1 1007_s_at -0.002879009 8.776694 -0.09459093 0.9999627 -6.721547
>>2   1053_at -0.053423214 3.417325 -1.60706334 0.9999627 -5.499340
>>3    117_at -0.038235209 3.100678 -1.42248721 0.9999627 -5.724391
>>
>>If I open up the toptable text file in excel, and delete the "ID" column 
>>name and do not shift over the other ones, this is what happens:
>>
>>>tt_spc[1:3,]
>>
>>          X            M        A           t   P.Value         B
>>1 1007_s_at -0.002879009 8.776694 -0.09459093 0.9999627 -6.721547
>>2   1053_at -0.053423210 3.417325 -1.60706300 0.9999627 -5.499340
>>3    117_at -0.038235210 3.100678 -1.42248700 0.9999627 -5.724391
>>
>>R silently filled in "X" as the name of the now-unnamed first column.
>>
>>
>>If I open the toptable file in excel, delete the "ID" column name, and 
>>then shift the other column names over one all the way to the left, and 
>>then open the text file in R it looks perfect:
>>
>>>tt_shft[1:3,]
>>
>>                 M        A       t   P.Value         B
>>1007_s_at -0.00288 8.776694 -0.0946 0.9999627 -6.721547
>>1053_at   -0.05340 3.417325 -1.6100 0.9999627 -5.499340
>>117_at    -0.03820 3.100678 -1.4200 0.9999627 -5.724391
>>
>>
>>BUT, I don't want to have to edit each toptable file in excel before 
>>re-opening it in R.
>>
>>I also tried setting the column name to "", and also giving the toptable 
>>data frame a string of names without the ID, but neither one worked...in 
>>both cases R filled in an "NA" for the column name...
>>
>>Is there any way for me to avoid having to edit the file in excel so that 
>>I can write it to a text file, read it back into R, and have it display 
>>the probeset names as the row labels???
>>
>>I guess what I'm asking is this -- is there a way for me to modify the 
>>toptable data frame so that the "ID" is removed and R uses the "ID" column 
>>as the row labels??
>>
>>Thanks in advance,
>>-Ken
>>
>
>--
>--------------------------------------------------------------------
>
>Julia Engelmann
>Bioinformatics           Tel   ++49 (931) 888 - 4558
>Am Hubland               mail  julia.engelmann at biozentrum.uni-wuerzburg.de
>University of Wuerzburg
>97074 Wuerzburg, Germany