[R] Reading a tab delimited file of varying length using read.table

Uwe Ligges ligges at statistik.tu-dortmund.de
Mon Jan 18 00:43:05 CET 2016


Dear Rolf,

I'll take a look at how to fix it tomorrow; your proposal is very welcome,
of course.

Best,
Uwe


On 18.01.2016 00:01, Rolf Turner wrote:
> On 18/01/16 10:48, Uwe Ligges wrote:
>> This is not a tab delimited file (as you apparently assume given the
>> code), but a fixed width format, hence I'd try:
>>
>> url <- "http://data.princeton.edu/wws509/datasets/divorce.dat"
>> widths <- c(9, 13, 10, 8, 10, 6)
>> f5 <- read.fwf(url, widths = widths, skip = 1, strip.white = TRUE)
>>
>> names(f5) <- as.character(unlist(read.fwf(url, widths = widths,
>> strip.white=TRUE, n=1)))
>>
>> Not sure why reading it simply with header=TRUE does not work, but no
>> time to investigate this now.
>
> Dear Uwe,
>
> I have fiddled around a bit and the situation seems to me to be of the
> nature of a bug in read.fwf.  It would seem that in order for
> header=TRUE to work, the entries of the header need to be separated by
> the sep delimiter which defaults to "\t".  In the case in question the
> entries are separated by blanks, so presumably the header gets read in
> as a single entity, rather than 6 such, leading to a mismatch between
> the length of the header and the number of columns.
>
> It seems that the specified widths get ignored when the header line is
> dealt with.
>
> It also seems that if one specifies sep="" then the header gets read
> correctly but then strings of blanks get interpreted as field separators
> throughout and then blanks within the fields result in the
> wrong number of columns.
>
> I think that the code of read.fwf is easy enough to fix; a slight
> adjustment will make the header get treated the same way as the body of
> the file.
>
> I don't see any problems/drawbacks with doing so, and experimenting with
> my modified function resulted in the divorce data being read in with
> header=TRUE with no problems.
>
> If this mod is made, I see no reason to keep the "sep" argument in
> read.fwf --- except maybe for backward compatibility issues, and I don't
> think there would be any since it never worked properly anyhow.
>
> cheers,
>
> Rolf
>
> P. S. I can send you my modified version of read.fwf off-list if this
> would be of any use to you.
>
> R.
>

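Until read.fwf itself is changed, one way to get the effect described in
the quoted message (the header cut at the same widths as the body) is to
split the header line by hand and pass the result as col.names. The
following is only a minimal sketch, not the modified read.fwf mentioned
above: the sample lines, the temporary file and the widths c(6, 5) are
made up for illustration and stand in for the real divorce.dat layout.

```r
# Hypothetical sample data: two fixed-width columns (widths 6 and 5)
# with a blank-padded header line, written to a temporary file.
lines <- c("name  score",
           "ab cd    12",
           "ef        7")
tmp <- tempfile(fileext = ".dat")
writeLines(lines, tmp)

widths <- c(6, 5)

# Cut the header line at the same column boundaries as the body.
last   <- cumsum(widths)
first  <- last - widths + 1
header <- trimws(substring(readLines(tmp, n = 1), first, last))

# Read the body, skipping the header line, and attach the names.
dat <- read.fwf(tmp, widths = widths, skip = 1, strip.white = TRUE,
                col.names = header)
print(dat)
```

Because the body is still split by widths, the blank inside "ab cd" stays
within its field, which is where the sep="" approach described in the
quoted message goes wrong.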

