[Rd] read.fwf doesn't work with header = TRUE (PR#8226)

Emmanuel.Paradis at mpl.ird.fr
Fri Oct 21 18:03:59 CEST 2005


Prof Brian Ripley wrote:
> On Thu, 20 Oct 2005 Emmanuel.Paradis at mpl.ird.fr wrote:
> 
>> Full_Name: Emmanuel Paradis
>> Version: 2.1.1
>> OS: Linux
>> Submission from: (NULL) (193.49.41.105)
>>
>>
>> read.fwf(..., header = TRUE) does not work properly since:
>>
>> 1/ the original header is printed on the console and not in FILE;
>> 2/ the different 'parts' of the header should be separated with tabs
>>   to work with the call to read.table.
>>
>> Here is a suggested fix for src/library/utils/R/read.fwf.R:
>>
>> 38c38,40
>> <         cat(FILE, headerline, "\n")
>> ---
>> >         headerline <- unlist(strsplit(headerline, " {1,}"))
>> >         headerline <- paste(headerline, collapse = "\t")
>> >         cat(file = FILE, headerline, "\n")
> 
> 
> Thanks, but I don't think that is right.  It assumes the header line is 
> space-delimited (or at least that spaces get converted to tabs).  We 
> have not specified the format of the header line, and it cannot usefully 
> be fixed format.  So I think we need to specify it is delimited by 'sep'
> (not tab).

I see, but suppose we selectively read some columns of a file, e.g. with 
widths = c(1, -4, 2): how can we know how many variables have been skipped, 
and hence which names to select from the header line?
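
To make the difficulty concrete, here is a small illustration. The header 
line and its names are made up for the example, and the header is assumed 
to be delimited by a tab:

```r
# Hypothetical example: a 'sep'-delimited header combined with
# widths that skip part of each data line.
widths <- c(1, -4, 2)            # keep 1 char, skip 4 chars, keep 2 chars
header <- "id\tx\ty\tage"        # made-up header, tab-delimited

hdr.names <- strsplit(header, "\t", fixed = TRUE)[[1]]
n.kept <- sum(widths > 0)        # number of columns actually read

length(hdr.names)                # 4 names in the header
n.kept                           # 2 columns kept
```

The four skipped characters could hold one variable of width 4, two of 
width 2, and so on, so 'widths' alone does not say which two of the four 
names belong to the columns that were kept.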

Here is another proposed fix, but this assumes the header line is in 
fixed-width format (as specified by 'widths'):

38c38,41
<         cat(FILE, headerline, "\n")
---
 >         head.last <- cumsum(widths)
 >         head.first <- head.last - widths + 1
 >         headerline <- substring(headerline, head.first, head.last)[drop]
 >         cat(file = FILE, headerline, "\n", sep = sep)

?read.fwf says clearly that sep is used internally.
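
For illustration, the same idea as a stand-alone sketch outside read.fwf. 
The function name splitHeader is hypothetical, not part of utils. The 
sketch computes 'drop' itself and assumes it is TRUE for the skipped 
(negative-width) columns, so the kept fields are selected with [!drop]. 
It also joins the fields with paste() rather than cat(), so that no 
separator is written before the newline:

```r
# Hypothetical helper: cut a fixed-width header line into fields
# according to 'widths' and re-join the kept fields with 'sep'.
splitHeader <- function(headerline, widths, sep = "\t") {
    drop <- widths < 0               # assumption: negative widths mark skipped columns
    widths <- abs(widths)
    last <- cumsum(widths)           # last character of each field
    first <- last - widths + 1       # first character of each field
    fields <- substring(headerline, first, last)[!drop]
    paste(fields, collapse = sep)
}

# Header "AxxxxBC": keep "A", skip "xxxx", keep "BC"
splitHeader("AxxxxBC", c(1, -4, 2))
```

With these inputs the result is the two kept names joined by a tab, which 
read.table can then parse with the same 'sep' as the data lines.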


