[Rd] read.table() errors with tab as separator (PR#9061)
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Jul 5 12:32:52 CEST 2006
On Wed, 5 Jul 2006, Peter Dalgaard wrote:
> John.Maindonald at anu.edu.au writes:
>
>> (1) read.table(), with sep="\t", identifies 13 out of 1400 records,
>> in a file with 1400 records of 3 fields each, as having only 2 fields.
>> This happens under version 2.3.1 for Windows as well as with
>> R 2.3.1 for Mac OS X, and with R-devel under Mac OS X.
>> [R version 2.4.0 Under development (unstable) (2006-07-03 r38478)]
>>
>> (2) Using read.table() with sep="\t", only the first 1569 records
>> of an 1821-record file are input. The file has exactly two fields
>> in each record, and the minimum length of the second field is
>> 1 character. If, however, I extract lines 1561 to 1650 from the
>> file (the file "short.txt" below), all 90 lines are input.
>
> Notice that the single quote is a quote character in read.table (as
> opposed to read.delim, which uses only the double quote, to cater for
> TAB-separated files from Excel & friends).
>
>> [1] "865\tlinear model (lm)! Cook's distance\t152"
>                                   ^
>                                  !!!!
>
> (This reminds me that we probably should shift the default for
> comment.char too since it leads to similar issues, but it seems not to
> be the problem in this case.)
This seems to imply that we should change the default for 'quote', but
doing so could break a lot of scripts. (Given how long the default has
been comment.char="#", I doubt we should change that either.)
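
For anyone hitting the same problem, the effect of the two defaults can be checked with count.fields() before calling read.table(). What follows is only a minimal sketch: the first record is the one quoted above, while the other records, the helper count_tab_fields() and all variable names are invented for illustration and are not taken from the reported files.

```r
# Hypothetical helper (not part of R): count tab-separated fields per line,
# passing any further arguments (quote, comment.char) on to count.fields().
count_tab_fields <- function(x, ...) {
  con <- textConnection(x)
  on.exit(close(con))
  count.fields(con, sep = "\t", ...)
}

# First record is from the thread; the other two are made up.
recs <- c("865\tlinear model (lm)! Cook's distance\t152",
          "866\tlinear model (lm)! leverage\t153",
          "867\tlinear model (lm)! Cook's distance plot\t154")

# Default quote = "\"'": the apostrophe in "Cook's" is treated as an
# opening quote, so the counts are no longer 3 for every line.
n_default <- count_tab_fields(recs)

# Quoting disabled: every line has its 3 fields.
n_noquote <- count_tab_fields(recs, quote = "")

# Either of these reads all three records.
ok_table <- read.table(text = recs, sep = "\t", quote = "")
ok_delim <- read.delim(text = recs, header = FALSE)

# The comment.char default causes a similar truncation (made-up record).
hash_rec <- "868\tissue #12 in tracker\t155"
n_hash_default <- count_tab_fields(hash_rec, quote = "")
n_hash_off     <- count_tab_fields(hash_rec, quote = "", comment.char = "")
```

Comparing n_default with n_noquote (and n_hash_default with n_hash_off) shows which lines are affected. Setting quote = "" and comment.char = "" explicitly in the call leaves the defaults, and hence existing scripts, untouched.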
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595