[R] read.fwf and header
Daniel Nordlund
res90sx5 at verizon.net
Mon Oct 30 21:33:12 CET 2006
Gregor,
According to the help for read.fwf, when header=TRUE the names in the header record must be delimited by sep, and sep should be a character that does not occur anywhere else in the file. I changed the spaces in the header record of your example to commas and was then able to read the file just fine with the following call:
new.data<-read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 19),
header=TRUE, sep=',')
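To make the idea easier to try without your file, here is a minimal, self-contained sketch of the same approach. The three-column layout, the column names (id, score, grp) and the widths are made up for illustration and are not taken from test.txt; the only point is that the header names are comma-delimited while the data records stay fixed width.

```r
# Hypothetical example file: names, widths and values are assumptions,
# chosen only to illustrate header = TRUE together with sep.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id,score,grp",   # header record: names delimited by sep (a comma)
             " 1 1.5 a",       # data records: fixed width, fields of 3, 4 and 1 chars
             " 2     b",       # blank 'score' field
             "   3.0 c"),      # blank 'id' field
           tmp)

# The comma occurs only in the header record, so it is safe to use as sep.
demo <- read.fwf(tmp, widths = c(3, 4, 1), header = TRUE, sep = ",")

str(demo)
unlink(tmp)
```

Blank numeric fields should come back as NA, which is the behaviour you were after with the missing values.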
Hope this is helpful,
Dan
Daniel Nordlund
Bothell, WA USA
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch]
> On Behalf Of Gregor Gorjanc
> Sent: Monday, October 30, 2006 10:52 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] read.fwf and header
>
> Hi!
>
> I have data (also in attached file) in the following form:
>
> num1 num2 num3 int1 fac1 fac2 cha1 cha2 Date POSIXt
>  1                1   f q   1900-01-01 1900-01-01 01:01:01
>  2 1.0 1316666.5  2 a g r z            1900-01-01 01:01:01
>  3 1.5 1188830.5  3 b h s y 1900-01-01 1900-01-01 01:01:01
>  4 2.0 1271846.3  4 c i t x 1900-01-01 1900-01-01 01:01:01
>  5 2.5  829737.4    d j u w 1900-01-01
>  6 3.0 1240967.3  5 e k v v 1900-01-01 1900-01-01 01:01:01
>  7 3.5  919684.4  6 f l w u 1900-01-01 1900-01-01 01:01:01
>  8 4.0  968214.6  7 g m x t 1900-01-01 1900-01-01 01:01:01
>  9 4.5 1232076.4  8 h n y s 1900-01-01 1900-01-01 01:01:01
> 10 5.0 1141273.4  9 i o z r 1900-01-01 1900-01-01 01:01:01
>    5.5  988481.4 10 j     q 1900-01-01 1900-01-01 01:01:01
>
> This is a FWF (fixed width format) file. I can not use read.table here,
> because of missing values. I have tried with the following
>
> > read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20),
> header=TRUE)
>
> Error in read.table(file = FILE, header = header, sep = sep, as.is =
> as.is, :
> more columns than column names
>
> I could use:
>
> > read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20),
> header=FALSE, skip=1)
>    V1  V2        V3 V4 V5 V6 V7 V8         V9                 V10
>  1  1  NA        NA  1     f  q    1900-01-01 1900-01-01 01:01:01
>  2  2 1.0 1316666.5  2  a  g  r  z            1900-01-01 01:01:01
>  3  3 1.5 1188830.5  3  b  h  s  y 1900-01-01 1900-01-01 01:01:01
>  4  4 2.0 1271846.3  4  c  i  t  x 1900-01-01 1900-01-01 01:01:01
>  5  5 2.5  829737.4 NA  d  j  u  w 1900-01-01
>  6  6 3.0 1240967.3  5  e  k  v  v 1900-01-01 1900-01-01 01:01:01
>  7  7 3.5  919684.4  6  f  l  w  u 1900-01-01 1900-01-01 01:01:01
>  8  8 4.0  968214.6  7  g  m  x  t 1900-01-01 1900-01-01 01:01:01
>  9  9 4.5 1232076.4  8  h  n  y  s 1900-01-01 1900-01-01 01:01:01
> 10 10 5.0 1141273.4  9  i  o  z  r 1900-01-01 1900-01-01 01:01:01
> 11 NA 5.5  988481.4 10  j        q 1900-01-01 1900-01-01 01:01:01
>
> Does anyone have a clue, how to get above result with header?
>
> Thanks!
>
> --
> Lep pozdrav / With regards,
> Gregor Gorjanc
> ----------------------------------------------------------------------
> University of Ljubljana PhD student
> Biotechnical Faculty
> Zootechnical Department URI: http://www.bfro.uni-lj.si/MR/ggorjan
> Groblje 3 mail: gregor.gorjanc <at> bfro.uni-lj.si
>
> SI-1230 Domzale tel: +386 (0)1 72 17 861
> Slovenia, Europe fax: +386 (0)1 72 17 888
>
> ----------------------------------------------------------------------
> "One must learn by doing the thing; for though you think you know it,
> you have no certainty until you try." Sophocles ~ 450 B.C.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.