Summary: [R] read.table on Mac OS X, CARBON vs. DARWIN
Meinhard Ploner
meinhardploner at gmx.net
Fri Feb 22 15:00:04 CET 2002
Thanks a lot, James!!
The problem is fixed. In version 1.4.0 for Mac/Darwin (the latest
version available for this system), the function read.table (which is
also called from read.delim etc.) has the bug you explained.
Inserting the line
nlines <- nlines+1
after
lines <- c(lines, line)
removes this bug.
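
For anyone who wants to see the effect in isolation, here is a minimal
sketch of the patched loop. It is not the actual read.table source: the
function name peek_lines, the argument n, and the sample data are made up
for illustration, and a textConnection stands in for the file.

```r
# Sketch only: a stripped-down version of the look-ahead loop.
# peek_lines and its arguments are hypothetical names.
peek_lines <- function(con, n = 5) {
    nlines <- 0
    lines <- NULL
    while (nlines < n) {
        line <- readLines(con, 1, ok = TRUE)
        if (length(line) == 0)
            break                      # end of input
        if (length(grep("^[ \t]*$", line)))
            next                       # skip blank lines
        lines <- c(lines, line)
        nlines <- nlines + 1           # without this the loop runs to end of file
    }
    lines
}

# Eight input lines, one of them blank (made-up data).
con <- textConnection(c("a b", "", "1 2", "3 4", "5 6", "7 8", "9 10", "11 12"))
first <- peek_lines(con)   # the first five non-blank lines
rest <- readLines(con)     # whatever the loop left unread
close(con)
```

With the increment in place the loop stops after five non-blank lines and
leaves the remainder on the connection. Without it, the loop only ends at
end of input, so the whole file is read before anything is pushed back.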
M.
On Friday, February 22, 2002, at 02:33 PM, james.holtman at convergys.com
wrote:
>
> If you cannot get the latest 1.4.1, here is a patch (adds one line to
> read.table) that will fix it on your current system.
>
>> The 'read.table' function appears to be up to 10X slower in R 1.4.0 than
>> R 1.3.1 for some of the data sets I read in. I was comparing the source
>> code for the 2 versions and saw that it was rewritten in R 1.4.0.
>>
>> I think I found out what part of the problem might be. I was comparing
>> R1.3.1 and R1.4.0 code and it appears that a statement is missing in some
>> of the code for R 1.4. This is the section of code at the beginning of
>> read.table. The loop starting with 'while (nlines < 5)' will read in the
>> entire file, because there is no increment of 'nlines' in the loop. I
>> traced through the code and this is what was happening. It then does a
>> 'pushBack' of the entire file. In tracing through the code, this is where
>> it appears to be taking the time. With the change noted below, the speed
>> was similar to R 1.3.1 and the results were the same.
>>
>> Here is the current code with what I think is the additional statement
>> needed:
>>
>> =================part of read.table========
>>
>> nlines <- 0
>> lines <- NULL
>> while (nlines < 5) {
>>     line <- readLines(file, 1, ok = TRUE)
>>     if (length(line) == 0)
>>         break
>>     if (blank.lines.skip && length(grep("^[ \\t]*$", line)))
>>         next
>>     if (length(comment.char) && nchar(comment.char)) {
>>         pattern <- paste("^[ \\t]*", substring(comment.char,
>>             1, 1), sep = "")
>>         if (length(grep(pattern, line)))
>>             next
>>     }
>>     lines <- c(lines, line)
>>     #
>>     # additional line required
>>     #
>>     nlines <- nlines+1
>> }
>
>> --
>
>
>
>
> Meinhard Ploner <meinhardploner at gmx.net> on 02/22/2002 03:17:34
>
> To: james.holtman at convergys.com
> cc:
> Subject: Re: [R] read.table on Mac OS X, CARBON vs. DARWIN
>
>
> Yes. Thanks a lot.
> I had 1.4.0 because the latest version (1.4.1) is not available on
> Fink. However, I will download it from CRAN.
> Meinhard
>
>
> On Thursday, February 21, 2002, at 10:29 PM,
> james.holtman at convergys.com wrote:
>
>> read.table did have a bug in it in 1.4.0. It was fixed in 1.4.1. Is that
>> what you are running with?
>
>
>
>
>