[BioC] Limma toptable output using write.table and column names
Ken Termiso
jerk_alert at hotmail.com
Wed Feb 9 01:17:10 CET 2005
I apologize in advance if this is confusing...
When I use write.exprs (which, as I understand it, calls write.table)
to write expression data to a text file, the output file has one fewer
column name than it has columns (the probe ID column gets no name), and the
column names are shifted all the way to the left margin. When this
text file is read into R with
read.table(file="exprs.txt",header=TRUE), R converts the file into a data
frame and correctly uses the probeset IDs as row labels.
(the spacing may be a little off here, depending on the display font, but
here you can see that the probeset name is the row label)
          6187.CEL 6188.CEL 6189.CEL 6190.CEL 6191.CEL 6192.CEL
1007_s_at 8.779289 8.732751 8.822360 8.743272 8.768605 8.813886
1053_at   3.508310 3.389342 3.434458 3.410836 3.373940 3.387063
117_at    3.139897 3.105285 3.114203 3.131865 3.073855 3.038960
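
For reference, the round trip described above can be reproduced with a small sketch. The matrix values and the column names a.CEL and b.CEL are made up for illustration, and write.table is called directly here rather than through write.exprs, so this only approximates what write.exprs does.

```r
# Toy expression matrix (hypothetical values and sample names).
ex <- matrix(c(8.78, 3.51, 3.14, 8.73, 3.39, 3.11), nrow = 3,
             dimnames = list(c("1007_s_at", "1053_at", "117_at"),
                             c("a.CEL", "b.CEL")))

f <- tempfile(fileext = ".txt")

# By default write.table writes the row names as the first field of each
# data line but gives that field no entry in the header line.
write.table(ex, file = f, quote = FALSE)

# The header therefore holds two names while each data line has three fields.
hdr <- readLines(f, n = 1)

# read.table sees a header with one fewer field than the data lines and
# uses the first column as row names.
back <- read.table(f, header = TRUE)
print(hdr)
print(back)
```

This relies on the documented read.table behaviour that a header with one fewer field than the data lines causes the first column to be taken as row names.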
However, with the limma toptables, each column has a name, including the
probeset column ("ID"). When I write a toptable to a text file and then read
it back into R, R treats the probeset IDs as a column of data (since
it is labelled "ID") and adds row numbers to the data frame.
This makes it difficult to do other operations (at least in my novice
hands!!)
> tt[1:3,]
         ID            M        A           t   P.Value         B
1 1007_s_at -0.002879009 8.776694 -0.09459093 0.9999627 -6.721547
2   1053_at -0.053423214 3.417325 -1.60706334 0.9999627 -5.499340
3    117_at -0.038235209 3.100678 -1.42248721 0.9999627 -5.724391
If I open up the toptable text file in excel, and delete the "ID" column
name and do not shift over the other ones, this is what happens:
> tt_spc[1:3,]
          X            M        A           t   P.Value         B
1 1007_s_at -0.002879009 8.776694 -0.09459093 0.9999627 -6.721547
2   1053_at -0.053423210 3.417325 -1.60706300 0.9999627 -5.499340
3    117_at -0.038235210 3.100678 -1.42248700 0.9999627 -5.724391
R silently filled in "X" as the name of the now-blank "ID" column.
If I open the toptable file in excel, delete the "ID" column name, and then
shift the other column names over one all the way to the left, and then open
the text file in R it looks perfect:
> tt_shft[1:3,]
                 M        A       t   P.Value         B
1007_s_at -0.00288 8.776694 -0.0946 0.9999627 -6.721547
1053_at   -0.05340 3.417325 -1.6100 0.9999627 -5.499340
117_at    -0.03820 3.100678 -1.4200 0.9999627 -5.724391
BUT, I don't want to have to edit each toptable file in excel before
re-opening it in R.
I also tried setting the column name to "", and also giving the toptable
data frame a string of names without the ID, but neither one worked...in
both cases R filled in an "NA" for the column name...
Is there any way for me to avoid having to edit the file in excel so that I
can write it to a text file, read it back into R, and have it display the
probeset names as the row labels???
I guess what I'm asking is this -- is there a way for me to modify the
toptable data frame so that the "ID" column name is removed and R uses the
"ID" column as the row labels??
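
A minimal sketch of the kind of thing being asked for, using a hand-built data frame in place of a real limma toptable (only the ID, M and A columns, with the first three rows shown above). Two possibilities are sketched: moving "ID" into the row names before writing, or leaving the table alone and naming the row-name column when reading. Both assume the probeset IDs are unique, since R requires unique row names; the file names are temporary files chosen for illustration.

```r
# Stand-in for a toptable (hypothetical: a real one has more columns).
tt <- data.frame(ID = c("1007_s_at", "1053_at", "117_at"),
                 M  = c(-0.002879009, -0.053423214, -0.038235209),
                 A  = c(8.776694, 3.417325, 3.100678),
                 stringsAsFactors = FALSE)

# Option 1: turn the ID column into row names, then drop the column.
tt2 <- tt
rownames(tt2) <- tt2$ID
tt2$ID <- NULL
f <- tempfile(fileext = ".txt")
write.table(tt2, file = f, quote = FALSE, sep = "\t")
# Header now has one fewer field, so the IDs come back as row names.
back1 <- read.table(f, header = TRUE, sep = "\t")

# Option 2: write the table unchanged (without row numbers) and tell
# read.table which column holds the row names.
g <- tempfile(fileext = ".txt")
write.table(tt, file = g, quote = FALSE, sep = "\t", row.names = FALSE)
back2 <- read.table(g, header = TRUE, sep = "\t", row.names = "ID")

print(back1)
print(back2)
```

Option 2 keeps the "ID" header in the file, which may be more convenient if the file is also opened in a spreadsheet.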
Thanks in advance,
-Ken