[R] Data separated by spaces, getting data into R using field lengths

Petr PIKAL petr.pikal at precheza.cz
Tue Sep 8 14:43:10 CEST 2009


Hi

what about reading each line with readLines and then splitting it into the
desired portions?

x <- paste(letters, collapse="")    # "abcdefghijklmnopqrstuvwxyz"
substring(x, c(1,3,5), c(2,4,15))   # "ab" "cd" "efghijklmno"
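
A minimal sketch of the same idea with the field widths from the question, assuming every line really is padded to the full widths. The two sample lines are made up for illustration and the file name in the comment is only a placeholder:

```r
# Field widths from the original question
varlength <- c(2, 2, 18, 5, 18)

# Start and end position of each field within a line
ends   <- cumsum(varlength)
starts <- ends - varlength + 1

# Two hypothetical lines, padded with spaces to the full field widths
fmt   <- "%-2s%-2s%-18s%-5s%-18s"
lines <- c(sprintf(fmt, "DF", "12", "This is an example", "1", "This"),
           sprintf(fmt, "DF", "14", "This is", "12334", "This is an"))
# With a real file: lines <- readLines("yourfile.txt")  # placeholder name

# Cut every line at the same positions; one row per line
fields   <- t(sapply(lines, substring, starts, ends, USE.NAMES = FALSE))
fields[] <- trimws(fields)   # drop the padding, keep the matrix shape
dat <- as.data.frame(fields, stringsAsFactors = FALSE)
```

All columns come back as character here, so numeric fields would still need as.integer() or similar.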

Regards
Petr


r-help-bounces at r-project.org wrote on 08.09.2009 14:21:53:

> This data is from a database, and the maximum length of each field is
> defined. I mean that every column has a maximum length, and I want to
> use this maximum length as a separator. So if one "cell" in a
> column is shorter than the maximum, the "cell" should be padded with white
> spaces or something like that. This seems to be hard to explain.
> 
> Regards,
> L
> 
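
If the cells were padded as described, the lines would be true fixed-width data and read.fwf should cope. A rough sketch under that assumption; the unpadded values below are invented for illustration, and formatC() is just one way to do the padding:

```r
varlength <- c(2, 2, 18, 5, 18)

# Hypothetical unpadded values, one vector per column
cols <- list(c("DF", "DF"),
             c("12", "14"),
             c("This is an example", "This is"),
             c("1", "12334"),
             c("This", "This is an"))

# Left-justify each column and pad it with spaces to its maximum width
padded <- mapply(function(x, w) formatC(x, width = w, flag = "-"),
                 cols, varlength)
lines <- apply(padded, 1, paste, collapse = "")

# Every line now has the same length, so read.fwf() can split it
con <- textConnection(lines)
dat <- read.fwf(con, widths = varlength, strip.white = TRUE,
                stringsAsFactors = FALSE)
close(con)
```
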
> 2009/9/8 Duncan Murdoch <murdoch at stats.uwo.ca>:
> > On 9/8/2009 8:07 AM, Lauri Nikkinen wrote:
> >>
> >> Thanks, I tried it but I got
> >>
> >>> varlength <- c(2, 2, 18, 5, 18)
> >>> read.fwf("c:temppi.txt", widths=varlength)
> >>
> >>  V1 V2                 V3    V4   V5
> >> 1 DF 12  This is an exampl e 1 T  his
> >> 2 DF 12  This is an 1232 T his i    s
> >> 3 DF 14  This is 12334 Thi s is   an
> >> 4 DF 15  This 23 This is a n exa mple
> >>
> >> Which is not the way I want it.
> >
> > It looks as though that's because you don't have fixed width data.
> > " This is an example" is 19 chars, including the leading space.  You
> > told R it was 18.  " This is an " is only 12 characters.
> >
> > I would say you have two fixed width fields, and three varying fields,
> > with no delimiters.  If the middle one of the three always contains
> > digits and the others don't, you can probably extract them using sub(),
> > but you can't use any of the read.* functions to do this:  your format
> > is too strange.
> >
> > Duncan Murdoch
> >
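
The sub() route might look roughly like this. It is only a sketch and assumes that the third field never contains digits, the fourth is digits only, and the fields are separated by single spaces; the pattern is my guess at the layout, not something given in the thread:

```r
x <- c("DF12 This is an example 1 This",
       "DF12 This is an 1232 This is",
       "DF14 This is 12334 This is an",
       "DF15 This 23 This is an example")

# 2 chars, 2 chars, non-digit text, digits, remaining text
pattern <- "^(..)(..) ([^0-9]*) ([0-9]+) (.*)$"

# Pull out each capture group in turn
dat <- data.frame(
  V1 = sub(pattern, "\\1", x),
  V2 = as.integer(sub(pattern, "\\2", x)),
  V3 = sub(pattern, "\\3", x),
  V4 = as.integer(sub(pattern, "\\4", x)),
  V5 = sub(pattern, "\\5", x),
  stringsAsFactors = FALSE
)
```

A line that does not match the pattern is returned unchanged by sub(), so it is worth checking grepl(pattern, x) first.
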
> >>
> >> structure(list(V1 = structure(c(1L, 1L, 1L, 1L), .Label = "DF", class
> >> = "factor"),
> >>    V2 = c(12L, 12L, 14L, 15L), V3 = structure(c(4L, 3L, 2L,
> >>    1L), .Label = c(" This 23 This is a", " This is 12334 Thi",
> >>    " This is an 1232 T", " This is an exampl"), class = "factor"),
> >>    V4 = structure(c(1L, 2L, 4L, 3L), .Label = c("e 1 T", "his i",
> >>    "n exa", "s is "), class = "factor"), V5 = structure(c(2L,
> >>    4L, 1L, 3L), .Label = c("an ", "his", "mple", "s"), class =
> >> "factor")), .Names = c("V1",
> >> "V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA,
> >> -4L))
> >>
> >> Any ideas?
> >> -L
> >>
> >> 2009/9/8 Duncan Murdoch <murdoch at stats.uwo.ca>:
> >>>
> >>> On 9/8/2009 7:53 AM, Lauri Nikkinen wrote:
> >>>>
> >>>> I have a text file similar to this (separated by spaces):
> >>>>
> >>>> x <- "DF12 This is an example 1 This
> >>>> DF12 This is an 1232 This is
> >>>> DF14 This is 12334 This is an
> >>>> DF15 This 23 This is an example
> >>>> "
> >>>>
> >>>> and I know the field lengths of each variable (there are 5 variables
> >>>> in this data set), which are:
> >>>>
> >>>> varlength <- c(2, 2, 18, 5, 18)
> >>>>
> >>>> How can I import this kind of data into R, using the varlength
> >>>> variable as a field separator indicator?
> >>>
> >>> See ?read.fwf.
> >>>
> >>> Duncan Murdoch
> >>>
> >
> >
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.