[BioC] Limma toptable output using write.table and column names

Wed Feb 9 10:22:36 CET 2005

Hi Ken,

Ken Termiso wrote:
> I apologize in advance if this is confusing...
> 
> When I use write.exprs (which, as I understand makes a call to 
> write.table) to write expression data to a text file, the output text 
> file has one less column name (the probe ID column does not get a name), 
> and the other column names are shifted all the way to the left margin in 
> the text file. When this text file is read into R using the command 
> read.table(file="exprs.txt",header=TRUE), R converts the file into a 
> data frame, and correctly displays the row labels as probeset IDs.
> 
> (the spacing may be a little off here, depending on the display font, 
> but here you can see that the probeset name is the row label)
>          6187.CEL 6188.CEL 6189.CEL 6190.CEL 6191.CEL 6192.CEL
> 1007_s_at 8.779289 8.732751 8.822360 8.743272 8.768605 8.813886
> 1053_at   3.508310 3.389342 3.434458 3.410836 3.373940 3.387063
> 117_at    3.139897 3.105285 3.114203 3.131865 3.073855 3.038960
> 
> 
> However, with the limma toptables, each column has a name, including the 
> probeset column ("ID"). When I write a toptable to a textfile, and then 
> read it back into R, R thinks that the probeset IDs are a column of data 
> (since it is labelled with "ID"), and then adds row numbers to this data 
> frame. This makes it difficult to do other operations (at least in my 
> novice hands!!)
> 

When you read the toptable text file back into R, try setting the 
row.names argument of read.table:
read.table("file.txt", header=TRUE, row.names=1, ...)
This uses the first column of your text file as the row names.
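
Here is a minimal sketch of the round trip. It assumes the toptable is 
written with row.names=FALSE, so that "ID" really is the first column of 
the file. The data frame tt is only a small stand-in built from the values 
quoted in this thread, and the files are temporary ones. The second half 
shows the other route you asked about: moving the IDs into the row names 
before writing.

```r
# Hypothetical stand-in for the toptable quoted in this thread.
tt <- data.frame(
  ID = c("1007_s_at", "1053_at", "117_at"),
  M  = c(-0.002879009, -0.053423214, -0.038235209),
  A  = c(8.776694, 3.417325, 3.100678),
  stringsAsFactors = FALSE
)

# Route 1: keep "ID" as a column and fix things up when reading.
f <- tempfile(fileext = ".txt")
# Assumption: R's own row numbers are not written, so ID is column 1.
write.table(tt, file = f, sep = "\t", quote = FALSE, row.names = FALSE)
# row.names = 1 turns the first column (ID) into the row names.
tt_back <- read.table(f, header = TRUE, sep = "\t", row.names = 1)

# Route 2: move the IDs into the row names before writing.
tt2 <- tt
rownames(tt2) <- tt2$ID
tt2$ID <- NULL
f2 <- tempfile(fileext = ".txt")
# The header now has one field fewer than the data rows, so read.table
# takes the first column as row names without being told to.
write.table(tt2, file = f2, sep = "\t", quote = FALSE)
tt2_back <- read.table(f2, header = TRUE, sep = "\t")
```

Both tt_back and tt2_back should then have the probeset IDs as row labels 
and only the numeric columns as data.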

Hope that helps,
Julia

>> tt[1:3,]
> 
>         ID            M        A           t   P.Value         B
> 1 1007_s_at -0.002879009 8.776694 -0.09459093 0.9999627 -6.721547
> 2   1053_at -0.053423214 3.417325 -1.60706334 0.9999627 -5.499340
> 3    117_at -0.038235209 3.100678 -1.42248721 0.9999627 -5.724391
> 
> If I open up the toptable text file in excel, and delete the "ID" column 
> name and do not shift over the other ones, this is what happens:
> 
>> tt_spc[1:3,]
> 
>          X            M        A           t   P.Value         B
> 1 1007_s_at -0.002879009 8.776694 -0.09459093 0.9999627 -6.721547
> 2   1053_at -0.053423210 3.417325 -1.60706300 0.9999627 -5.499340
> 3    117_at -0.038235210 3.100678 -1.42248700 0.9999627 -5.724391
> 
> R silently filled in "X" for the deleted "ID" column name.
> 
> 
> If I open the toptable file in excel, delete the "ID" column name, and 
> then shift the other column names over one all the way to the left, and 
> then open the text file in R it looks perfect:
> 
>> tt_shft[1:3,]
> 
>                 M        A       t   P.Value         B
> 1007_s_at -0.00288 8.776694 -0.0946 0.9999627 -6.721547
> 1053_at   -0.05340 3.417325 -1.6100 0.9999627 -5.499340
> 117_at    -0.03820 3.100678 -1.4200 0.9999627 -5.724391
> 
> 
> BUT, I don't want to have to edit each toptable file in excel before 
> re-opening it in R.
> 
> I also tried setting the column name to "", and also giving the toptable 
> data frame a string of names without the ID, but neither one worked...in 
> both cases R filled in an "NA" for the column name...
> 
> Is there any way for me to avoid having to edit the file in excel so 
> that I can write it to a text file, read it back into R, and have it 
> display the probeset names as the row labels???
> 
> I guess what I'm asking is this -- is there a way for me to modify the 
> toptable data frame so that the "ID" is removed and R uses the "ID" 
> column as the row labels??
> 
> Thanks in advance,
> -Ken
> 

-- 
--------------------------------------------------------------------

Julia Engelmann
Bioinformatics           Tel   ++49 (931) 888 - 4558
Am Hubland               mail  julia.engelmann at biozentrum.uni-wuerzburg.de
University of Wuerzburg
97074 Wuerzburg, Germany