[Rd] read.table: wrong error message? (PR#10592)
Peter Dalgaard
P.Dalgaard at biostat.ku.dk
Mon Jan 21 12:34:04 CET 2008
david.reitter at gmail.com wrote:
>
> I believe read.table may report misleading errors. In this example,
> where a header line in a file has an incorrect number of row names (28
> instead of 29), I get the error message "duplicate row.names are not
> allowed".
>
> However, I cannot find any duplicate row names. Fixing the header
> line by adding an extra row name, however, avoids the error.
>
> The behavior is confusing - I would expect a different error message,
> even if I should have quoted some row names.
>
Arguably, read.table() is too smart for its own good at times, but this
_is_ documented behaviour:

     If 'row.names' is not specified and the header line has one less
     entry than the number of columns, the first column is taken to be
     the row names.  This allows data frames to be read in from the
     format in which they are printed.  If 'row.names' is specified and
     does not refer to the first column, that column is discarded from
     such files.

So, if your COLUMN (sic!) name count is off by one, read.table() takes
the first variable as row.names, and when this is a 0/1 variable, you
will have duplicate names.
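
A minimal sketch of the effect, using a made-up three-column file rather than the attached m2 (the file contents and the column name "primed" supplied at the end are assumptions for illustration only). The header names two columns while each data row has three fields, so the first column is taken as row names:

```r
# Hypothetical file: the header names 2 columns, the data rows have 3 fields.
tmp <- tempfile()
writeLines(c("a b",
             "0 x 1",
             "0 y 2",
             "1 z 3"), tmp)

# The header is one entry short, so the first column (0, 0, 1) is used
# as row names, and the repeated 0 triggers the error.
res <- tryCatch(read.table(tmp, header = TRUE),
                error = function(e) conditionMessage(e))
print(res)

# count.fields() exposes the mismatch between header and data rows.
n <- count.fields(tmp)
print(n)

# One way around it: skip the header line and supply a full set of names.
ok <- read.table(tmp, skip = 1, col.names = c("primed", "a", "b"))
print(ok)
```

Running count.fields() on a file first is a cheap way to see whether the header is short by exactly one field, which is the case that triggers the row-name rule quoted above.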
> I am attaching a sample file that reproduces the problem with R --
> vanilla, and then: read.table("m2"). I get it in 2.5.1 and 2.6.0.
>
> PS.:
>
> R version 2.6.0 (2007-10-03)
> ...
> > x <- read.table("m2", header=T)
> Error in read.table("m2", header = T) :
> duplicate 'row.names' are not allowed
> > traceback()
> 2: stop("duplicate 'row.names' are not allowed")
> 1: read.table("m2", header = T)
> >
>
>
> [attachment: m2]
>
> primed rule role dist starttime target.utt rule.freq primeperiod.length dialogue.length dialogue.id words.repeated words.repeated.prop head.repeated head.freq head.pos prime.gaze target.gaze eyecontact familiar convseq length doc.score friend task.familiar same.specrule derivation pathlen distituent
> 0 vp---vbg-vp i 2 5.67 4 False 172 70 261.5322 1 0 0/7 True na None None None 1 0 1 7 135 - - 1 None - -
> 0 vp---to-vp r 1 6.03 4 False 758 13 261.5322 1 0 0/6 True na None None None 1 0 1 6 135 - - 1 None - -
> 0 vp---to-vp i 2 6.03 4 False 758 70 261.5322 1 0 0/6 True na None None None 1 0 1 6 135 - - 1 None - -
> 0 s---cc-s i 1 11.3 5 False 813 83 261.5322 1 0 2/9 True na None None None 1 0 1 9 135 - - 1 None - -
> 0 s---cc-s i 3 11.3 5 False 813 70 261.5322 1 0 0/9 True na None None None 1 0 1 9 135 - - 1 None - -
> 0 s---advp-s i 3 12.71 5 False 440 70 261.5322 1 0 0/8 True na None None None 1 0 1 8 135 - - 1 None - -
> 1 vp---ber-vp i 1 12.98 5 False 406 83 261.5322 1 0 2/7 True na None None None 1 0 1 7 135 - - 1 None - -
> 1 vp---vbg-vp i 1 13.11 5 False 172 83 261.5322 1 0 3/7 True na None None None 1 0 1 7 135 - - 1 None - -
> 1 vp---to-vp i 1 13.52 5 False 758 83 261.5322 1 0 2/6 True na None None None 1 0 1 6 135 - - 1 None - -
> 1 pp---ql-rp-pp i 1 14.12 5 False 282 83 261.5322 1 0 0/3 True na None None None 1 0 1 3 135 - - 1 None - -
> 0 pp---ql-rp-pp r 2 14.12 5 False 282 13 261.5322 1 0 0/3 True na None None None 1 0 1 3 135 - - 1 None - -
> 0 ap---ap r 2 18.75 5 False 120 13 261.5322 1 0 0/1 True na None None None 1 0 1 1 135 - - 1 None - -
> 0 pp---pp-pp i 3 22.7 6 False 429 13 261.5322 1 0 0/4 True na None None None 1 0 1 4 135 - - 1 None - -
> 1 pp---ql-rp r 2 22.7 6 False 505 83 261.5322 1 0 1/2 True na None None None 1 0 1 2 135 - - 1 None - -
> 0 pp---rp-pp r 2 23.54 6 False 1124 83 261.5322 1 0 0/2 True na None None None 1 0 1 2 135 - - 1 None - -
> 0 pp---pp-cc-rb-pp i 2 25.27 7 False 64 200 261.5322 1 0 1/6 True na None None None 1 0 1 6 135 - - 1 None - -
> 0 pp---pp-cc-rb-pp r 4 25.27 7 False 64 13 261.5322 1 0 0/6 True na None None None 1 0 1 6 135 - - 1 None - -
> 1 pp---ql-rp-pp i 2 25.99 7 False 282 200 261.5322 1 0 2/3 True na None None None 1 0 1 3 135 - - 1 None - -
> 1 pp---ql-rp-pp i 3 25.99 7 False 282 83 261.5322 1 0 0/3 True na None None None 1 0 1 3 135 - - 1 None - -
>
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907