[R] Data separated by spaces, getting data into R using field lengths

Lauri Nikkinen lauri.nikkinen at iki.fi
Tue Sep 8 14:21:53 CEST 2009


This data comes from a database where the maximum length of each field
is defined. I mean that every column has a maximum length, and I want
to use these maximum lengths to split each line into fields. So if one
"cell" in a column is shorter than the maximum, the "cell" should be
padded with white space or something like that. This seems to be hard
to explain.
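
For illustration only, here is a minimal sketch of what the padded case
would look like. It assumes (hypothetically) that the field values are
already known separately, pads each one to the widths in varlength, and
then reads the result with read.fwf. The names rows, padded and dat are
made up for this example; the real file is not padded this way.

```r
# Field widths from the thread
varlength <- c(2, 2, 18, 5, 18)

# Hypothetical field values, already split; used only to build padded lines
rows <- list(
  c("DF", "12", "This is an example", "1",     "This"),
  c("DF", "12", "This is an",         "1232",  "This is"),
  c("DF", "14", "This is",            "12334", "This is an"),
  c("DF", "15", "This",               "23",    "This is an example")
)

# Pad every field with trailing spaces to its full width, then glue the
# fields together without any delimiter
padded <- vapply(
  rows,
  function(f) paste(mapply(format, f, width = varlength), collapse = ""),
  character(1)
)

# Every padded line now has the same length, so read.fwf can split it
con <- textConnection(padded)
dat <- read.fwf(con, widths = varlength,
                strip.white = TRUE, colClasses = "character")
close(con)

print(dat)
```

With lines padded like this, each one is exactly sum(varlength)
characters long, which is what read.fwf expects.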

Regards,
L
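
A rough sketch of the sub() approach suggested in the quoted reply
below, for the unpadded data as it actually is. It assumes that the
first two fields are always exactly 2 characters each, that the three
remaining fields are separated by single spaces, and that the middle
one of those three is the only one made of digits. The regular
expression and the names lines, pattern, fields and dat are
illustrative assumptions, not something from the thread.

```r
# The sample lines from the original question
lines <- c("DF12 This is an example 1 This",
           "DF12 This is an 1232 This is",
           "DF14 This is 12334 This is an",
           "DF15 This 23 This is an example")

# Two fixed 2-character fields, then text without digits, then a run of
# digits, then the rest of the line
pattern <- "^(.{2})(.{2}) ([^0-9]*) ([0-9]+) (.*)$"

# Pull out each captured group in turn with sub() and a backreference
fields <- lapply(1:5, function(i) sub(pattern, paste0("\\", i), lines))
names(fields) <- paste0("V", 1:5)

dat <- as.data.frame(fields, stringsAsFactors = FALSE)

# The second and fourth fields hold numbers in the sample data
dat$V2 <- as.integer(dat$V2)
dat$V4 <- as.integer(dat$V4)

print(dat)
```

This only works while the third field never contains a digit; a line
that breaks that assumption would be split in the wrong place or not
matched at all.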

2009/9/8 Duncan Murdoch <murdoch at stats.uwo.ca>:
> On 9/8/2009 8:07 AM, Lauri Nikkinen wrote:
>>
>> Thanks, I tried it but I got
>>
>>> varlength <- c(2, 2, 18, 5, 18)
>>> read.fwf("c:temppi.txt", widths=varlength)
>>
>>  V1 V2                 V3    V4   V5
>> 1 DF 12  This is an exampl e 1 T  his
>> 2 DF 12  This is an 1232 T his i    s
>> 3 DF 14  This is 12334 Thi s is   an
>> 4 DF 15  This 23 This is a n exa mple
>>
>> Which is not the way I want it.
>
> It looks as though that's because you don't have fixed width data.  " This
> is an example" is 19 chars, including the leading space.  You told R it was
> 18.  " This is an " is only 12 characters.
>
> I would say you have two fixed width fields, and three varying fields, with
> no delimiters.  If the middle one of the three always contains digits and
> the others don't, you can probably extract them using sub(), but you can't
> use any of the read.* functions to do this:  your format is too strange.
>
> Duncan Murdoch
>
>>
>> structure(list(V1 = structure(c(1L, 1L, 1L, 1L), .Label = "DF", class
>> = "factor"),
>>    V2 = c(12L, 12L, 14L, 15L), V3 = structure(c(4L, 3L, 2L,
>>    1L), .Label = c(" This 23 This is a", " This is 12334 Thi",
>>    " This is an 1232 T", " This is an exampl"), class = "factor"),
>>    V4 = structure(c(1L, 2L, 4L, 3L), .Label = c("e 1 T", "his i",
>>    "n exa", "s is "), class = "factor"), V5 = structure(c(2L,
>>    4L, 1L, 3L), .Label = c("an ", "his", "mple", "s"), class =
>> "factor")), .Names = c("V1",
>> "V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA,
>> -4L))
>>
>> Any ideas?
>> -L
>>
>> 2009/9/8 Duncan Murdoch <murdoch at stats.uwo.ca>:
>>>
>>> On 9/8/2009 7:53 AM, Lauri Nikkinen wrote:
>>>>
>>>> I have a text file similar to this (separated by spaces):
>>>>
>>>> x <- "DF12 This is an example 1 This
>>>> DF12 This is an 1232 This is
>>>> DF14 This is 12334 This is an
>>>> DF15 This 23 This is an example
>>>> "
>>>>
>>>> and I know the field lengths of each variable (there are 5 variables in
>>>> this data set), which are:
>>>>
>>>> varlength <- c(2, 2, 18, 5, 18)
>>>>
>>>> How can I import this kind of data into R, using the varlength
>>>> variable as a field separator indicator?
>>>
>>> See ?read.fwf.
>>>
>>> Duncan Murdoch
>>>
>
>