Summary: [R] read.table on Mac OS X, CARBON vs. DARWIN
David R. Bickel
dbickel at mail.mcg.edu
Sat Feb 23 01:02:36 CET 2002
Adding that line didn't work for me. I get the same problem as before
(version 1.4.0):
'temp' is a two-line text file with three tab-delimited columns.
UNDER DARWIN:
> read.table('temp')
              V1   V2   V3
1 AFFX-BioB-5_at -214 -139
2 AFFX-BioB-M_at  -49  -11
> read.table('temp',as.is=TRUE)
stack imbalance in internal type.convert, 26 then 25stack imbalance in
Internal, 25 then 24
stack imbalance in if, 19 then 18
stack imbalance in <-, 17 then 16
stack imbalance in {, 15 then 14
stack imbalance in for, 8 then 7
stack imbalance in {, 6 then 5
              V1   V2   V3
1 AFFX-BioB-5_at -214 -139
2 AFFX-BioB-M_at  -49  -11
Error: unprotect(): stack imbalance
UNDER CARBON:
> read.table('temp')
              V1   V2   V3
1 AFFX-BioB-5_at -214 -139
2 AFFX-BioB-M_at  -49  -11
> read.table('temp',as.is=TRUE)
              V1   V2   V3
1 AFFX-BioB-5_at -214 -139
2 AFFX-BioB-M_at  -49  -11
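
For anyone who wants to repeat the comparison, here is a minimal sketch. It assumes the file holds exactly the two rows printed above; the tempfile() path and the names path, d_default and d_asis are illustrative and not part of the original session.

```r
# Recreate a two-line, three-column, tab-delimited file like 'temp'.
# The temporary path is an assumption made for illustration.
path <- tempfile()
writeLines(c("AFFX-BioB-5_at\t-214\t-139",
             "AFFX-BioB-M_at\t-49\t-11"), path)

# Default call: V1 comes back as a factor or as character,
# depending on the defaults of the R version in use.
d_default <- read.table(path)

# as.is = TRUE keeps V1 as character; this is the call that produced
# the stack imbalance messages under Darwin with 1.4.0.
d_asis <- read.table(path, as.is = TRUE)

print(d_default)
print(d_asis)

# Remove the temporary file.
unlink(path)
```

If both calls print the same two rows with no messages in between, the build does not show the problem reported here.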
On Friday, February 22, 2002, at 09:00 X, Meinhard Ploner wrote:
> Thanks a lot, James!!
> The problem is fixed. On the version 1.4.0 Mac/darwin (the latest
> available version for this system) the function read.table (which is
> called from read.delim etc., too) has the bug you explained.
>
> Inserting the row
> nlines <- nlines+1
> after
> lines <- c(lines, line)
> removes this bug.
> M.
>
>
> On Friday, February 22, 2002, at 02:33 PM, james.holtman at convergys.com
> wrote:
>
>>
>> If you cannot get the latest 1.4.1, here is a patch (adds one line to
>> read.table) that will fix it on your current system.
>>
>>> The 'read.table' function appears to be up to 10X slower in R 1.4.0 than
>>> R 1.3.1 for some of the data sets I read in. I was comparing the source
>>> code for the 2 versions and see that it was rewritten in R 1.4.0.
>>>
>>> I think I found out what part of the problem might be. I was comparing
>>> R1.3.1 and R1.4.0 code and it appears that a statement is missing in
>>> some of the code for R 1.4. This is the section of code at the beginning
>>> of read.table. The loop starting with 'while (nlines < 5)' will read in
>>> the entire file, because there is no increment of 'nlines' in the loop. I
>>> traced through the code and this is what was happening. It then does a
>>> 'pushBack' of the entire file. In tracing through the code, this is where
>>> it appears to be taking the time. With the change noted below, the speed
>>> was similar to R 1.3.1 and the results were the same.
>>>
>>> Here is the current code with what I think is the additional statement
>>> needed:
>>>
>>> =================part of read.table========
>>>
>>> nlines <- 0
>>> lines <- NULL
>>> while (nlines < 5) {
>>>     line <- readLines(file, 1, ok = TRUE)
>>>     if (length(line) == 0)
>>>         break
>>>     if (blank.lines.skip && length(grep("^[ \\t]*$", line)))
>>>         next
>>>     if (length(comment.char) && nchar(comment.char)) {
>>>         pattern <- paste("^[ \\t]*", substring(comment.char, 1, 1),
>>>                          sep = "")
>>>         if (length(grep(pattern, line)))
>>>             next
>>>     }
>>>     lines <- c(lines, line)
>>>     #
>>>     # additional line required
>>>     #
>>>     nlines <- nlines + 1
>>> }
>>
>>> --
>>
>>
>>
>>
>> Meinhard Ploner <meinhardploner at gmx.net> on 02/22/2002 03:17:34
>>
>> To: james.holtman at convergys.com
>> cc:
>> Subject: Re: [R] read.table on Mac OS X, CARBON vs. DARWIN
>>
>>
>> Yes. Thanks a lot.
>> I had the 1.4.0 because on Fink the latest version (1.4.1) is not
>> available. However, I will download it from the CRAN.
>> Meinhard
>>
>>
>> On Thursday, February 21, 2002, at 10:29 PM,
>> james.holtman at convergys.com wrote:
>>
>>> read.table did have a bug in it in 1.4.0. It was fixed in 1.4.1. Is that
>>> what you are running with?
>>
>>
>>
>>
>>
http://www.mcg.edu/research/biostat/bickel.html
David R. Bickel, PhD
Assistant Professor
Medical College of Georgia
Office of Biostatistics and Bioinformatics
1120 Fifteenth St., AE-3037
Augusta, GA 30912-4900
Tel.: 706-721-4697; Fax: 706-721-6294
E-mail: dbickel at mail.mcg.edu or bickel at prueba.info