[R] .gct file

Marc Schwartz (via MN) mschwartz at mn.rr.com
Tue Jul 19 20:08:15 CEST 2005


For the TAB delimited columns, adjust the 'sep' argument to:

read.table("data.gct", skip = 2, header = TRUE, sep = "\t")

The 'quote' argument is by default:

quote = "\"'"

which should take care of the quoted strings and bring them in as a
single value.
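
As a minimal sketch of the call above, the following writes a small sample to a temporary file and reads it back. The sample contents and the names 'sample_lines', 'tmp' and 'gct' are made up for illustration and assume that every row, including the header, is TAB delimited:

```r
# Hypothetical sample mimicking the layout described in this thread;
# all rows, including the header, are assumed to be TAB delimited.
sample_lines <- c("#1.2",
                  "1\t1",
                  "NAME\tDESCRIPTION\tVALUE",
                  "gene1\t\"a protein inducer\"\t1123")

tmp <- tempfile(fileext = ".gct")
writeLines(sample_lines, tmp)

# Skip the version and matrix-size lines, then read header and data.
# The default 'quote' setting keeps the quoted description together.
gct <- read.table(tmp, skip = 2, header = TRUE, sep = "\t")

str(gct)
```

With a real file, replace 'tmp' with the path to the .gct file.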

The above presumes that the header row is also TAB delimited. If not,
you may have to set 'skip = 3' and 'header = FALSE' to skip over the
header row, and then set the column names manually.
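
A rough sketch of that fallback, assuming (hypothetically) that the header row is separated by spaces while the data rows are TAB delimited. The sample contents and the names 'sample_lines', 'tmp', 'hdr' and 'gct' are illustrative only:

```r
# Hypothetical sample: header row separated by spaces,
# data rows separated by TABs.
sample_lines <- c("#1.2",
                  "1 1",
                  "NAME DESCRIPTION VALUE",
                  "gene1\t\"a protein inducer\"\t1123")

tmp <- tempfile(fileext = ".gct")
writeLines(sample_lines, tmp)

# Read the column names from the third line, splitting on whitespace.
hdr <- scan(tmp, what = "", skip = 2, nlines = 1, quiet = TRUE)

# Skip all three leading lines and supply the column names explicitly.
gct <- read.table(tmp, skip = 3, header = FALSE, sep = "\t",
                  col.names = hdr)

str(gct)
```

If the column names are already known, they can be passed directly, e.g. col.names = c("NAME", "DESCRIPTION", "VALUE"), instead of using scan().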

HTH,

Marc Schwartz


On Tue, 2005-07-19 at 13:52 -0400, mark salsburg wrote:
> This is all extremely helpful.
> 
> The data, it turns out, is a little atypical: the columns are tab-delimited
> except for the description columns
> 
> 
> DATA1.gct looks like this
> 
> #1.2
> 23 3423
> NAME DESCRIPTION VALUE
> gene1 "a protein inducer" 1123
> .....          .................     ......
> 
> How do I get R to read the data as tab-delimited, but read in the 2nd
> column as one value based on the quotation marks?
> 
> thanks..
> 
> On 7/19/05, Marc Schwartz (via MN) <mschwartz at mn.rr.com> wrote:
> > On Tue, 2005-07-19 at 13:16 -0400, mark salsburg wrote:
> > > ok so the gct file looks like this:
> > >
> > > #1.2  (version number)
> > > 7283 19   (matrix size)
> > > Name Description Values
> > > ....      .......          ......
> > >
> > > How can I tell R to disregard the first two lines and start reading
> > > at the 3rd line in this gct file? I would just delete them, but I do not
> > > know how to open a .gct file.
> > >
> > > thank you
> > >
> > > On 7/19/05, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> > > > On 7/19/2005 12:10 PM, mark salsburg wrote:
> > > > > I have two files to compare, one is a regular txt file that I can read
> > > > > in no prob.
> > > > >
> > > > > The other is a .gct file (How do I read in this one?)
> > > > >
> > > > > I tried a simple
> > > > >
> > > > > read.table("data.gct", header = T)
> > > > >
> > > > > How do you suggest reading in this file??
> > > > >
> > > >
> > > > .gct is not a standard filename extension.  You need to know what is in
> > > > that file.  Where did you get it?  What program created it?
> > > >
> > > > Chances are the easiest thing to do is to get the program that created
> > > > it to export in a well known format, e.g. .csv.
> > > >
> > > > Duncan Murdoch
> > 
> > 
> > The above would be consistent with the info in my reply.
> > 
> > I guess if the format is consistent, as per Mark's example above, you
> > can use:
> > 
> > read.table("data.gct", skip = 2, header = TRUE)
> > 
> > which will start by skipping the first two lines and then reading in the
> > header row and then the data.
> > 
> > See ?read.table
> > 
> > HTH,
> > 
> > Marc Schwartz
> > 
> > 
> >



