Summary: [R] read.table on Mac OS X, CARBON vs. DARWIN

Meinhard Ploner meinhardploner at gmx.net
Fri Feb 22 15:00:04 CET 2002


Thanks a lot, James!!
The problem is fixed. In version 1.4.0 for Mac/Darwin (the latest
version available for this system), the function read.table (which is
also called by read.delim etc.) has the bug you explained.

Inserting the line
	nlines <- nlines+1
after
	 lines <- c(lines, line)
removes this bug.
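
A minimal sketch of the effect, for anyone who wants to see it without patching read.table. The helper name sniff_lines and its arguments are hypothetical (they are not part of R); it only mimics the loop at the beginning of read.table quoted below, reading from a textConnection instead of a file, and it leaves out the comment.char handling.

```r
# Hypothetical helper: mimics the loop at the beginning of read.table.
# increment = FALSE reproduces the 1.4.0 behaviour,
# increment = TRUE applies the one-line fix.
sniff_lines <- function(con, increment = TRUE, max_lines = 5) {
    nlines <- 0
    lines <- NULL
    while (nlines < max_lines) {
        line <- readLines(con, 1, ok = TRUE)
        if (length(line) == 0)
            break                       # end of input reached
        if (length(grep("^[ \t]*$", line)))
            next                        # skip blank lines
        lines <- c(lines, line)
        if (increment)
            nlines <- nlines + 1        # the statement missing in 1.4.0
    }
    lines
}

input <- paste("row", 1:1000)           # made-up data, 1000 lines

con <- textConnection(input)
buggy <- sniff_lines(con, increment = FALSE)
close(con)

con <- textConnection(input)
fixed <- sniff_lines(con, increment = TRUE)
close(con)

length(buggy)   # 1000: the whole input was consumed
length(fixed)   # 5: only the first five lines were read
```

Without the increment the loop only stops at the end of the input, so every line ends up in 'lines' and has to be pushed back afterwards.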
M.


On Friday, February 22, 2002, at 02:33  PM, james.holtman at convergys.com 
wrote:

>
> If you cannot get the latest 1.4.1, here is a patch (adds one line to
> read.table) that will fix it on your current system.
>
>> The 'read.table' function appears to be up to 10X slower in R 1.4.0
>> than R 1.3.1 for some of the data sets I read in.  I was comparing the
>> source code for the 2 versions and see that it was rewritten in R 1.4.0.
>>
>> I think I found out what part of the problem might be.  I was comparing
>> R 1.3.1 and R 1.4.0 code and it appears that a statement is missing in
>> some of the code for R 1.4.  This is the section of code at the
>> beginning of read.table.  The loop starting with 'while (nlines < 5)'
>> will read in the entire file, because there is no increment of 'nlines'
>> in the loop.  I traced through the code and this is what was happening.
>> It then does a 'pushBack' of the entire file.  In tracing through the
>> code, this is where it appears to be taking the time.  With the change
>> noted below, the speed was similar to R 1.3.1 and the results were the
>> same.
>>
>> Here is the current code with what I think is the additional statement
>> needed:
>>
>> =================part of read.table========
>>
>>     nlines <- 0
>>     lines <- NULL
>>     while (nlines < 5) {
>>         line <- readLines(file, 1, ok = TRUE)
>>         if (length(line) == 0)
>>             break
>>         if (blank.lines.skip && length(grep("^[ \\t]*$", line)))
>>             next
>>         if (length(comment.char) && nchar(comment.char)) {
>>             pattern <- paste("^[ \\t]*", substring(comment.char,
>>                 1, 1), sep = "")
>>             if (length(grep(pattern, line)))
>>                 next
>>         }
>>         lines <- c(lines, line)
>>         #
>>         #  additional line required
>>         #
>>         nlines <- nlines + 1
>>     }
>
>> --
>
>
>
>
> Meinhard Ploner <meinhardploner at gmx.net> on 02/22/2002 03:17:34
>
> To:   james.holtman at convergys.com
> cc:
> Subject:  Re: [R] read.table on Mac OS X, CARBON vs. DARWIN
>
>
> Yes. Thanks a lot.
> I had the 1.4.0 because on Fink the latest version (1.4.1) is not
> available. However, I will download it from the CRAN.
> Meinhard
>
>
> On Thursday, February 21, 2002, at 10:29  PM,
> james.holtman at convergys.com wrote:
>
>> read.table did have a bug in it in 1.4.0.  It was fixed in 1.4.1.  Is
>> that what you are running with?
>