[Rd] read.table on long lines buggy (PR#13626)

manikandan_narayanan at merck.com manikandan_narayanan at merck.com
Fri Mar 27 03:18:34 CET 2009


Full_Name: Manikandan Narayanan
Version: 2.8.1
OS: linux-gnu
Submission from: (NULL) (155.91.28.231)


Hi R-folks, 
  I have two three-line text files, tst1 and tst2. They are the same except
that the second line is longer in tst1 (see the cat() commands below).

  read.table reads only the 3rd line of tst1, but reads tst2 correctly, as
shown below. This happens in both R 2.5.1 (Windows) and R 2.8.1 (linux-gnu).

  This seems to be an issue with read.table operating on long lines. It caused
me quite some trouble before I tracked it down while reading a bigger file!
Please fix this, or suggest safer ways of working with long lines!

Thanks,  
Mani

> cat(file="tst1", "a:15S_RRNA, 21S_RRNA, AAC1, AAC3\nb:AAP1, ACN9, ALG1, ALG11,
ALG12, ALG13, ALG14, ALG2, ALG3, ALG5, ALG6, ALG7, ALG8, ALG9, AMS1, ANP1, ARA1,
ATH1, BCH1, BCH2, BMH1, BMH2, BNI4, BUD7, CAX4, CDC19, CHS3, CHS5, CHS6, CHS7,
CIT2, CTS1, CWH41, DDP1, DIE2, DIP5, DLD1, DOG1, DOG2, DPM1, ELM1, ENO1, ENO2,
EOS1, ERD1, EXG1, FBA1, FBP1, FBP26, FDH1, FKS1, GAC1, GAL1, GAL10, GAL2, GAL3,
GAL4, GAL7, GAL80, GCY1, GDA1, GDB1, GFA1, GIP2, GLC3, GLC7, GLC8, GLG1, GLG2,
GLK1, GLO2, GLO4, GNA1, GND1, GND2, GNT1, GPH1, GPM1, GRE3, GSC2, GSY1, GSY2,
GTB1, GUT2, HAP4, HKR1, HOC1, HOR2, HPF1, HXK1, HXK2, HXT4, ICL1, IMP2', INM1,
INM2, ITR1, KAR2, KEG1, KNH1, KRE2, KRE5\nc:ABC1")
> read.table("tst1", sep=":", stringsAsFactors=F)[,1]
[1] "c"
Warning message:
In read.table("tmp1", sep = ":", stringsAsFactors = F) :
  incomplete final line found by readTableHeader on 'tmp1'

> cat(file="tst2", "a:15S_RRNA, 21S_RRNA, AAC1, AAC3\nb:AAP1, ACN9, ALG1, ALG11,
ALG12, ALG13, ALG14, ALG2, ALG3, ALG5, ALG6, ALG7, ALG8, ALG9, AMS1, ANP1, ARA1,
ATH1, BCH1, BCH2, BMH1, BMH2, BNI4, BUD7, CAX4, CDC19, CHS3, CHS5, CHS6, CHS7,
CIT2, CTS1, CWH41, DDP1, DIE2, DIP5, DLD1, DOG1, DOG2, DPM1, ELM1, ENO1, ENO2,
EOS1, ERD1, EXG1, FBA1, FBP1, FBP26, FDH1, FKS1, GAC1, GAL1, GAL10, GAL2, GAL3,
GAL4, GAL7, GAL80, GCY1, GDA1, GDB1, GFA1, GIP2, GLC3, GLC7, GLC8, GLG1, GLG2,
GLK1, GLO2, GLO4, GNA1, GND1, GND2, GNT1, GPH1\nc:ABC1\n")
> read.table("tst2", sep=":", stringsAsFactors=F)[,1]
[1] "a" "b" "c"
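
A note on possible workarounds: apart from its length, the second line of tst1 differs from tst2 in one other respect, namely the apostrophe in IMP2'. read.table's default is quote = "\"'", so an unmatched apostrophe can be taken as the start of a quoted string. The following is only a minimal sketch, assuming the apostrophe rather than the line length is what matters here. The temporary file and the shortened gene list are hypothetical stand-ins for tst1, not the original data.

```r
# Hypothetical, shortened stand-in for tst1: the second line keeps the
# apostrophe from "IMP2'", which does not occur in tst2.
tmp <- tempfile()
writeLines(c("a:15S_RRNA, 21S_RRNA, AAC1, AAC3",
             "b:AAP1, ACN9, IMP2', INM1, KRE5",
             "c:ABC1"), tmp)

# Option 1: switch off quote (and comment) handling so that the
# apostrophe is treated as ordinary text.
keys_rt <- read.table(tmp, sep = ":", quote = "", comment.char = "",
                      stringsAsFactors = FALSE)[, 1]

# Option 2: bypass read.table and split each line at its first colon.
lines   <- readLines(tmp)
keys_rl <- sub(":.*$", "", lines)      # text before the first colon
vals_rl <- sub("^[^:]*:", "", lines)   # text after the first colon

print(keys_rt)
print(keys_rl)
```

Both options should return all three keys for this stand-in file. Option 2 avoids read.table's quote and comment processing altogether, which may be the safer choice for free-text lines such as gene lists.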


