[R] textConnection taking a long time to open a big string

james.holtman@convergys.com james.holtman at convergys.com
Wed Apr 30 20:45:08 CEST 2003


I read a file of about 11,000 lines into a character vector and then
opened it with 'textConnection', so that I could detect lines with
incomplete data, delete them, and read the remaining lines in with 'scan'.
I am using R 1.7.0 on Windows.  The output from the script is below; the
call to 'textConnection' alone took over 51 seconds.

Is there a limit on how large a text object can be to be used with
'textConnection'?

########   script output    ################
> x.1 <- scan("/mpstat.ssgdbsv4.030430.txt",what='',sep='\n')
Read 11299 items
> str(x.1)
 chr [1:11299] "8.3155  32   71   4 1907   122    0 1130  105  167  216
0  3686   32  13  37  18" ...
> unix.time(x.in <- textConnection(x.1))  # this takes a long time
[1] 51.96  0.01 53.20    NA    NA
> sum(nchar(x.1))  # total number of characters in the vector
[1] 944525
> unix.time(x.c <- count.fields(x.in))    # this goes pretty fast
[1] 0.14 0.00 0.14   NA   NA
> table(x.c)      # detect incomplete lines
x.c
    3     6    17
    1     1 11297
>
> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    7.0
year     2003
month    04
day      16
language R
>
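
For comparison, here is a minimal sketch of one way to do the same filtering without 'textConnection', by pointing 'count.fields' at the file itself. It is an illustration under stated assumptions, not a diagnosis of the slowdown: the temporary file and its five lines are made-up stand-ins for the mpstat log, 17 is taken as the complete field count from the table above, and the names tmp, tmp.ok, n.fields, x.ok and m are hypothetical.

```r
# Made-up stand-in for the mpstat log: three complete lines (17 fields)
# and two incomplete ones (3 and 6 fields).
tmp  <- tempfile(fileext = ".txt")
good <- paste(1:17, collapse = " ")
writeLines(c(good, "1 2 3", good, "1 2 3 4 5 6", good), tmp)

# Count fields straight from the file, so no textConnection is opened.
# Quote and comment handling are switched off and blank lines are kept,
# so the counts line up one-to-one with the lines read below.
n.fields <- count.fields(tmp, sep = "", quote = "", comment.char = "",
                         blank.lines.skip = FALSE)
table(n.fields)            # detect incomplete lines

# Keep only the lines with the expected number of fields.
x.1  <- readLines(tmp)
x.ok <- x.1[n.fields == 17]

# Write the surviving lines back out and parse them with scan().
tmp.ok <- tempfile(fileext = ".txt")
writeLines(x.ok, tmp.ok)
m <- matrix(scan(tmp.ok, quiet = TRUE), ncol = 17, byrow = TRUE)
dim(m)
```

This reads the file twice and writes a second temporary file, which may or may not be acceptable for larger inputs; the point is only that the field count can be obtained without first wrapping the character vector in a connection.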

