[R] Problem passing data into read.table()

Huntsinger, Reid reid_huntsinger at merck.com
Mon Apr 22 21:02:31 CEST 2002

I think you should avoid read.table here. It tries to set various parameters
(number of columns, etc) by looking at a part of your file. Even if you
specify col.names, it can get confused by this looking-ahead. It is better to
look at how read.table is defined in terms of scan and readLines and adapt
that to your needs.
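
A rough sketch of that approach, not a tested solution for your actual file. It assumes line 8 holds the layout, fields 4 to 7 of that line multiply to the row count, line 21 holds the column names, and every data line has the same number of tab-separated fields. The function name and its arguments are made up for illustration.

```r
# Hypothetical helper: read a file with a fixed-size header and a footer
# using readLines and strsplit instead of read.table's look-ahead.
read_with_header_footer <- function(fname, layout_line = 8, header_line = 21,
                                    layout_fields = 4:7) {
  lines <- readLines(fname)

  # The product of the chosen layout fields is the number of data rows.
  layout <- strsplit(lines[layout_line], "\t", fixed = TRUE)[[1]]
  n_rows <- prod(as.numeric(layout[layout_fields]))

  # Column names sit on `header_line`; the data rows follow directly.
  col_names <- strsplit(lines[header_line], "\t", fixed = TRUE)[[1]]
  data_lines <- lines[header_line + seq_len(n_rows)]

  # Split each data line on tabs and bind the pieces into a matrix.
  # Note: strsplit drops a trailing empty field, so this assumes none occur.
  fields <- strsplit(data_lines, "\t", fixed = TRUE)
  df <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
  names(df) <- col_names

  # Convert character columns to numeric (or other types) where possible.
  df[] <- lapply(df, type.convert, as.is = TRUE)
  df
}
```

Because only the computed number of lines is ever split, the footer is never parsed at all.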

Perhaps your problem isn't that nrows doesn't get the right value, but that
read.table is using the wrong number of columns. (Actually, I can't construct
an example like yours without read.table complaining that line 1 does not
have n elements, where n is the number of elements in the longest line.)
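
One way to check the column-count theory is count.fields, which reports how many fields each line has. The sketch below runs on a small made-up file, not on your data; the file contents and variable names are assumptions for illustration.

```r
# Toy tab-delimited file: a column header, two data rows, one footer line.
fname <- tempfile()
writeLines(c("x\ty\tz", "1\t2\t3", "4\t5\t6", "end of data"), fname)

# Number of tab-separated fields on each line of the file.
n_fields <- count.fields(fname, sep = "\t", comment.char = "")

# Lines whose field count differs from the widest line are suspects.
odd_lines <- which(n_fields != max(n_fields))
print(table(n_fields))
print(odd_lines)
```

On a real file, adding skip = 20 to the count.fields call would restrict the check to the part that the second read.table call sees.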

If you really want to make read.table work, you can probably use the
fill=TRUE option and lots of caution.
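
A minimal sketch of the fill=TRUE route on a made-up file, to show where the caution is needed: the short footer line is padded with NA instead of causing an error, so it ends up in the data frame and has to be removed afterwards. The file contents and the choice of column y as the footer detector are assumptions.

```r
# Toy tab-delimited file: a column header, two data rows, one footer line.
fname <- tempfile()
writeLines(c("x\ty\tz", "1\t2\t3", "4\t5\t6", "end of data"), fname)

# fill = TRUE pads short lines with blank fields rather than stopping.
df <- read.table(fname, header = TRUE, sep = "\t", fill = TRUE,
                 as.is = TRUE, comment.char = "")

# The footer row has no value in y, so it shows up as NA there.
keep <- !is.na(df$y)
clean <- df[keep, ]

# The footer text forced column x to character; convert it back.
clean$x <- as.numeric(clean$x)
```

This only works if genuine data rows never have a missing value in the column used to detect the footer.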

Reid Huntsinger

-----Original Message-----
From: David R. McWilliams [mailto:dmcwilli at utk.edu]
Sent: Monday, April 22, 2002 1:13 PM
To: r-help at stat.math.ethz.ch
Subject: [R] Problem passing data into read.table()

I am trying to read in a tab-delimited data file with a 21 row header and
2 row footer using two calls to read.table().  Numbers of rows and columns
are variable.  The header contains information for calculating the number
of rows of data.  I can successfully pick this out and calculate the
number of rows to read, but cannot get the second read.table() to assign
this number to "nrows"  (the number is correct; if I enter it manually,
everything works fine).  Currently the function reads all the way to
the end and crashes on the footer, since the number of fields is different
from that of the data.

I know this could easily be done with some Perl pre-processing of the
file, but it is going to run on a Windows machine and I am trying to
minimize the number of packages to download.  Nevertheless, there is the
general problem of why I cannot pass a calculated value into the 

Code follows.

# function to read data with header and footer

# pick line 8 with the data layout information and calculate the number of
# rows ...
grid.layout <- read.table(fname, as.is=T, header=F, sep="\t",
comment.char="", skip=7, nrows=1)
row.ctr <- grid.layout[4]*grid.layout[5]*grid.layout[6]*grid.layout[7]

# tells me I have the right dimensions ...

# but they do not get passed into this read.table() ...
tmp.df <- read.table(fname, as.is=T, header=T, sep="\t", comment.char="",
skip=20, nrows=row.ctr )

# end function
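
The thread does not establish this as the cause, but one thing worth checking in the code above is the type of row.ctr. Indexing a data frame with single brackets returns a data frame, so the product is a one-cell data frame and not a plain number. The sketch below uses a made-up stand-in for grid.layout to show the difference and one way to get a scalar.

```r
# Hypothetical stand-in for the one-row layout read from line 8.
grid.layout <- data.frame(V1 = "a", V2 = "b", V3 = "c",
                          V4 = 2, V5 = 3, V6 = 1, V7 = 1)

# Single-bracket indexing keeps the data frame class, so does the product.
row.ctr.df <- grid.layout[4] * grid.layout[5] * grid.layout[6] * grid.layout[7]
print(class(row.ctr.df))

# Flatten the four cells to a numeric vector and multiply them instead.
row.ctr <- prod(as.numeric(unlist(grid.layout[4:7])))
print(class(row.ctr))
```

Passing the plain numeric row.ctr as nrows removes any doubt about how read.table coerces its argument.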


David R. McWilliams
dmcwilli at utk.edu

r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
