[Rd] Bug in read.table?

Charles C. Berry cberry at tajo.ucsd.edu
Sat Nov 6 01:17:57 CET 2010


On Fri, 5 Nov 2010, jgarcia at ija.csic.es wrote:

> Hi,
>
> I'm writing to this list as I'm puzzled about the behaviour of
> read.table(). It is hard to believe that there is a bug in this utils'
> function, but for my:
>
> R version 2.12.0 alpha (2010-09-28 r53056)
>
> I'm using scan and read.table to read a number of files, which are as:
>

There are line wraps here, so we can't just cut-and-paste.


> ---
>
> Project:     Murta Sonda
> Program:     GrafNav Version 8.30.1007
> Profile:     javier
> Source:      GPS Epochs(Combined)
> ProcessInfo: Run (1) by Unknown on 11/04/2010 at 19:05:17
>
> Datum:       WGS84, (processing datum)
> Master 1:    Name LaMurta, Status ENABLED
>             Antenna height 2.066 m, to L1-PC (NOV702GG, MeasDist 1.980 m
> to mark/ARP)
>             Position 37 49 38.15069, -1 12 27.55445, 368.197 m (WGS84,
> Ellipsoidal hgt)
> Remote:      Antenna height 1.781 m, to L1-PC (NOV702GG, MeasDist 1.695 m
> to mark/ARP)
> UTC Offset:  15 s
> Local time:  +2.0 h, CEST [Central European Savings Time]
> Geoid:       EGM2008-World.wpg (Absolute correction)
>
>      Latitude      Longitude LonTextLoTextLongitudTextL
> LatTextLaTextLatitudeTextL        H-Ell        H-MSL LocalUTCDa
> LocalUTC
>         (Deg)          (Deg) (DeMi   (Sec)  (DeMi   (Sec)           (m)
>       (m)      (DMY)       (HMS)
> 37.8275120694  -1.2077972583 001º12'28.07013"W 037º49'39.04345"N
> 368.998      318.059 25/10/2010    16:59:00
> 37.8275121083  -1.2077974806 001º12'28.07093"W 037º49'39.04359"N
> 368.994      318.055 25/10/2010    16:59:15
> 37.8275118539  -1.2077974338 001º12'28.07076"W 037º49'39.04267"N
> 368.997      318.058 25/10/2010    16:59:30
> 37.8275119923  -1.2077974626 001º12'28.07087"W 037º49'39.04317"N
> 368.998      318.060 25/10/2010    16:59:45
> 37.8275323099  -1.2078075891 001º12'28.10732"W 037º49'39.11632"N
> 368.869      317.930 25/10/2010    17:00:00
> 37.8275323374  -1.2078077002 001º12'28.10772"W 037º49'39.11641"N
> 368.866      317.927 25/10/2010    17:00:15
> 37.8275325076  -1.2078075314 001º12'28.10711"W 037º49'39.11703"N
> 368.859      317.920 25/10/2010    17:00:30
> 37.8275325306  -1.2078075056 001º12'28.10702"W 037º49'39.11711"N
> 368.861      317.922 25/10/2010    17:00:45
> 37.8275323639  -1.2078075917 001º12'28.10733"W 037º49'39.11651"N
> 368.853      317.914 25/10/2010    17:01:00
> 37.8275326222  -1.2078076861 001º12'28.10767"W 037º49'39.11744"N
> 368.857      317.918 25/10/2010    17:01:15
> ---
>

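What about the quote characters in the text columns? Each of those fields
contains both ' and ", and read.table() treats both as quoting characters
by default.

Using quote = '' yields 'dat' sans duplicates.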
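
For reference, a minimal sketch of that call. The two records are modelled
on the listing above, with the degree sign replaced by "d" so the example
does not depend on the file encoding. The temporary file and the default
V1..V8 column names are assumptions of the sketch, not part of the original
script.

```r
# Two records modelled on the quoted file; "d" stands in for the degree
# sign (an assumption, to keep the sketch encoding-independent).
lines <- c(
  "37.8275120694  -1.2077972583 001d12'28.07013\"W 037d49'39.04345\"N 368.998      318.059 25/10/2010    16:59:00",
  "37.8275121083  -1.2077974806 001d12'28.07093\"W 037d49'39.04359\"N 368.994      318.055 25/10/2010    16:59:15"
)

# Hypothetical input file standing in for "path_and_filename".
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# quote = "" switches off quote processing, so ' and " are read as
# ordinary characters inside the whitespace-separated fields.
dat <- read.table(tmp, header = FALSE, quote = "", as.is = TRUE)

dim(dat)    # one row per input line, eight columns
dat$V3      # the longitude text, with its ' and " kept intact
```

In the original script this would amount to adding quote = "" to the
existing read.table() call and leaving the other arguments alone.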

I'll leave it to others to decide if this is a bug.


> with a number of different records for each file.
>
> To read the data I'm using:
>
> ---
> dat.names <- scan(file.path("path_and_filename"),
>                   what="character",
>                   skip = 16, nlines=1)
> if(length(dat.names) != 8){
>    stop("Input file seems to be wrong!")}
>
> dat <- read.table(file.path("path_and_filename"),
>                   header=FALSE, col.names=dat.names,
>                   skip = 18, as.is=TRUE, blank.lines.skip=FALSE)
> ---
> and I'm systematically obtaining a number of repeated records at the
> start of the input table (6 in this example). This is easily seen by
> looking at the field "LocalUTC":

Or by looking at duplicated(dat), which flags every row that repeats an
earlier one.
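
For instance, on a small made-up data frame (the values are illustrative
only and are not read from the file):

```r
# Hypothetical data frame in which row 3 repeats row 1.
dat <- data.frame(
  LocalUTC = c("17:00:00", "17:00:15", "17:00:00"),
  H.Ell    = c(368.869, 368.866, 368.869),
  stringsAsFactors = FALSE
)

dup <- duplicated(dat)   # TRUE for each row that repeats an earlier row
which(dup)               # positions of the repeated rows
sum(dup)                 # number of repeated rows
dat[!dup, ]              # the table with the repeats dropped
```

Note that duplicated() marks the later copy of each repeated row, not the
first occurrence.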

HTH,

Chuck


>
>> dat
>   Latitude Longitude LonTextLoTextLongitudTextL
> LatTextLaTextLatitudeTextL   H.Ell   H.MSL LocalUTCDa LocalUTC
> 1  37.82753 -1.207808          001º12'28.10732"W
> 037º49'39.11632"N 368.869 317.930 25/10/2010 17:00:00
> 2  37.82753 -1.207808          001º12'28.10772"W
> 037º49'39.11641"N 368.866 317.927 25/10/2010 17:00:15
> 3  37.82753 -1.207808          001º12'28.10711"W
> 037º49'39.11703"N 368.859 317.920 25/10/2010 17:00:30
> 4  37.82753 -1.207808          001º12'28.10702"W
> 037º49'39.11711"N 368.861 317.922 25/10/2010 17:00:45
> 5  37.82753 -1.207808          001º12'28.10733"W
> 037º49'39.11651"N 368.853 317.914 25/10/2010 17:01:00
> 6  37.82753 -1.207808          001º12'28.10767"W
> 037º49'39.11744"N 368.857 317.918 25/10/2010 17:01:15
> 7  37.82751 -1.207797          001º12'28.07013"W
> 037º49'39.04345"N 368.998 318.059 25/10/2010 16:59:00
> 8  37.82751 -1.207797          001º12'28.07093"W
> 037º49'39.04359"N 368.994 318.055 25/10/2010 16:59:15
> 9  37.82751 -1.207797          001º12'28.07076"W
> 037º49'39.04267"N 368.997 318.058 25/10/2010 16:59:30
> 10 37.82751 -1.207797          001º12'28.07087"W
> 037º49'39.04317"N 368.998 318.060 25/10/2010 16:59:45
> 11 37.82753 -1.207808          001º12'28.10732"W
> 037º49'39.11632"N 368.869 317.930 25/10/2010 17:00:00
> 12 37.82753 -1.207808          001º12'28.10772"W
> 037º49'39.11641"N 368.866 317.927 25/10/2010 17:00:15
> 13 37.82753 -1.207808          001º12'28.10711"W
> 037º49'39.11703"N 368.859 317.920 25/10/2010 17:00:30
> 14 37.82753 -1.207808          001º12'28.10702"W
> 037º49'39.11711"N 368.861 317.922 25/10/2010 17:00:45
> 15 37.82753 -1.207808          001º12'28.10733"W
> 037º49'39.11651"N 368.853 317.914 25/10/2010 17:01:00
> 16 37.82753 -1.207808          001º12'28.10767"W
> 037º49'39.11744"N 368.857 317.918 25/10/2010 17:01:15
>
> Thanks,
>
> Javier
> ---
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at tajo.ucsd.edu			    UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
