[BioC] Limma toptable output using write.table and column names
Julia Engelmann
julia.engelmann at biozentrum.uni-wuerzburg.de
Wed Feb 9 10:22:36 CET 2005
Hi Ken,
Ken Termiso wrote:
> I apologize in advance if this is confusing...
>
> When I use write.exprs (which, as I understand makes a call to
> write.table) to write expression data to a text file, the output text
> file has one less column name (the probe ID column does not get a name),
> and the other column names are shifted all the way to the left margin in
> the text file. When this text file is read into R using the command
> read.table(file="exprs.txt",header=TRUE), R converts the file into a
> data frame, and correctly displays the row labels as probeset IDs.
>
> (the spacing may be a little off here, depending on the display font,
> but here you can see that the probeset name is the row label)
>           6187.CEL 6188.CEL 6189.CEL 6190.CEL 6191.CEL 6192.CEL
> 1007_s_at 8.779289 8.732751 8.822360 8.743272 8.768605 8.813886
> 1053_at   3.508310 3.389342 3.434458 3.410836 3.373940 3.387063
> 117_at    3.139897 3.105285 3.114203 3.131865 3.073855 3.038960
>
>
> However, with the limma toptables, each column has a name, including the
> probeset column ("ID"). When I write a toptable to a textfile, and then
> read it back into R, R thinks that the probeset IDs are a column of data
> (since it is labelled with "ID"), and then adds row numbers to this data
> frame. This makes it difficult to do other operations (at least in my
> novice hands!!)
>
When you read the toptable text file back into R, try setting the
row.names argument of read.table:
read.table("file.txt", header=TRUE, row.names=1, ...)
will use the first column of your text file as row names.
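
A minimal sketch of that round trip, in case it is useful. The data frame
below is only a hypothetical stand-in for a toptable (three of the rows
quoted in this thread, two numeric columns), and it assumes the file was
written with row.names=FALSE so that "ID" really is the first column of
the text file:

```r
# Hypothetical stand-in for a toptable; values copied from the rows quoted in this thread.
tt <- data.frame(
  ID = c("1007_s_at", "1053_at", "117_at"),
  M  = c(-0.002879009, -0.053423214, -0.038235209),
  A  = c(8.776694, 3.417325, 3.100678),
  stringsAsFactors = FALSE
)

# Assumption: the file is written without R's automatic row numbers,
# so the "ID" column is the first column in the text file.
path <- tempfile(fileext = ".txt")
write.table(tt, file = path, sep = "\t", quote = FALSE, row.names = FALSE)

# row.names = 1 tells read.table to use the first column as row names
# instead of treating it as a data column.
tt_back <- read.table(path, header = TRUE, sep = "\t", row.names = 1)

rownames(tt_back)  # probeset IDs
colnames(tt_back)  # the remaining data columns
```

If the file was instead written with the default row.names=TRUE, the row
numbers end up in the first column, so the column index would have to be
adjusted.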
Hope that helps,
Julia
>> tt[1:3,]
>
>          ID            M        A           t   P.Value         B
> 1 1007_s_at -0.002879009 8.776694 -0.09459093 0.9999627 -6.721547
> 2   1053_at -0.053423214 3.417325 -1.60706334 0.9999627 -5.499340
> 3    117_at -0.038235209 3.100678 -1.42248721 0.9999627 -5.724391
>
> If I open up the toptable text file in excel, and delete the "ID" column
> name and do not shift over the other ones, this is what happens:
>
>> tt_spc[1:3,]
>
>           X            M        A           t   P.Value         B
> 1 1007_s_at -0.002879009 8.776694 -0.09459093 0.9999627 -6.721547
> 2   1053_at -0.053423210 3.417325 -1.60706300 0.9999627 -5.499340
> 3    117_at -0.038235210 3.100678 -1.42248700 0.9999627 -5.724391
>
> R silently filled in "X" for the now-empty "ID" column name.
>
>
> If I open the toptable file in excel, delete the "ID" column name, and
> then shift the other column names over one all the way to the left, and
> then open the text file in R it looks perfect:
>
>> tt_shft[1:3,]
>
>                  M        A       t   P.Value         B
> 1007_s_at -0.00288 8.776694 -0.0946 0.9999627 -6.721547
> 1053_at   -0.05340 3.417325 -1.6100 0.9999627 -5.499340
> 117_at    -0.03820 3.100678 -1.4200 0.9999627 -5.724391
>
>
> BUT, I don't want to have to edit each toptable file in excel before
> re-opening it in R.
>
> I also tried setting the column name to "", and also giving the toptable
> data frame a string of names without the ID, but neither one worked...in
> both cases R filled in an "NA" for the column name...
>
> Is there any way for me to avoid having to edit the file in excel so
> that I can write it to a text file, read it back into R, and have it
> display the probeset names as the row labels???
>
> I guess what I'm asking is this -- is there a way for me to modify the
> toptable data frame so that the "ID" is removed and R uses the "ID"
> column as the row labels??
>
> Thanks in advance,
> -Ken
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>
>
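
On the last question above (dropping "ID" from the data frame and using it
as the row labels), one possible sketch is shown here. It is not taken
from limma itself: the data frame is a hypothetical stand-in for a
toptable, and it assumes the probeset IDs are unique, since R requires
unique row names for a data frame.

```r
# Hypothetical stand-in for a toptable; values copied from the rows quoted in this thread.
tt <- data.frame(
  ID = c("1007_s_at", "1053_at", "117_at"),
  M  = c(-0.002879009, -0.053423214, -0.038235209),
  A  = c(8.776694, 3.417325, 3.100678),
  stringsAsFactors = FALSE
)

# Move the probeset IDs into the row names (assumes they are unique),
# then drop the now-redundant "ID" column.
rownames(tt) <- tt$ID
tt$ID <- NULL

# With row names written (the default), write.table emits a header line
# that has one fewer field than the data lines.
path <- tempfile(fileext = ".txt")
write.table(tt, file = path, sep = "\t", quote = FALSE)

# read.table sees the shorter header and uses the first column
# as row names without any extra arguments.
tt_back <- read.table(path, header = TRUE, sep = "\t")

rownames(tt_back)  # probeset IDs
```

This produces the same layout as the write.exprs output described at the
top of the message, so no editing in Excel should be needed.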
--
--------------------------------------------------------------------
Julia Engelmann
Bioinformatics Tel ++49 (931) 888 - 4558
Am Hubland mail julia.engelmann at biozentrum.uni-wuerzburg.de
University of Wuerzburg
97074 Wuerzburg, Germany