[R] manipulating strings
Marc Schwartz
MSchwartz at MedAnalytics.com
Sun Aug 8 21:19:40 CEST 2004
On Sun, 2004-08-08 at 13:58, Stephen Nyangoma wrote:
> Hi
> I have a vector called fil consisting of the following strings.
>
>
> > fil
> [1] " 102.2 639" " 104.2 224" " 105.1 1159" " 107.1 1148" " 108.1 1376"
> [6] " 109.2 1092" " 111.2 1238" " 112.2 349" " 113.1 1204" " 114.1 537"
> [11] " 115.0 303" " 116.1 490" " 117.2 202" " 118.1 1864" " 119.0 357"
>
>
> I want to get a data frame like
>
> Time Obs
> 102.2 639
> 104.2 224
> 105.1 1159
> 107.1 1148
> 108.1 1376
> 109.2 1092
> 111.2 1238
> 112.2 349
> 113.1 1204
> 114.1 537
> etc
>
> Can anyone see an efficient way of doing this?
>
> Thanks. Stephen
Try this:
# Create strings
MyStrings <- c(" 102.2 639", " 104.2 224", " 105.1 1159",
" 107.1 1148", " 108.1 1376", " 109.2 1092",
" 111.2 1238", " 112.2 349", " 113.1 1204",
" 114.1 537", " 115.0 303", " 116.1 490",
" 117.2 202", " 118.1 1864", " 119.0 357")
> MyStrings
[1] " 102.2 639" " 104.2 224" " 105.1 1159" " 107.1 1148"
[5] " 108.1 1376" " 109.2 1092" " 111.2 1238" " 112.2 349"
[9] " 113.1 1204" " 114.1 537" " 115.0 303" " 116.1 490"
[13] " 117.2 202" " 118.1 1864" " 119.0 357"
# Now convert to a data frame. First use strsplit() to break up each
# of the vector elements into three components, using " " as the
# split character; because of the leading " ", the first component is
# an empty string. This returns a list, which we then convert to a
# vector using unlist(). Then use matrix() to convert the vector into
# a two dimensional object with 3 cols. Use 'byrow = TRUE' so that we
# fill the matrix row by row. Then take only the second and third
# columns from the matrix and convert them into a data frame.
df <- as.data.frame(matrix(unlist(strsplit(MyStrings, split = " ")),
ncol = 3, byrow = TRUE)[, 2:3])
# Finally, set the colnames
colnames(df) <- c("Time", "Obs")
> df
Time Obs
1 102.2 639
2 104.2 224
3 105.1 1159
4 107.1 1148
5 108.1 1376
6 109.2 1092
7 111.2 1238
8 112.2 349
9 113.1 1204
10 114.1 537
11 115.0 303
12 116.1 490
13 117.2 202
14 118.1 1864
15 119.0 357
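For anyone following along, the one-liner above can be unpacked step by step. The sketch below uses only the first three strings, and the intermediate names (parts, m) are mine. Step 4 is an addition that is not in the code above: since strsplit() returns character strings, the columns come out as character (or factor, in R versions before 4.0.0 where stringsAsFactors defaulted to TRUE) rather than numeric, so a conversion is probably wanted before doing any arithmetic.

```r
# A small subset of the strings, for illustration
MyStrings <- c(" 102.2 639", " 104.2 224", " 105.1 1159")

# Step 1: split on " "; the leading space yields an empty first piece
parts <- strsplit(MyStrings, split = " ")
parts[[1]]   # "" "102.2" "639"

# Step 2: flatten the list and refill row by row into a 3-column
# character matrix
m <- matrix(unlist(parts), ncol = 3, byrow = TRUE)

# Step 3: drop the empty first column and build the data frame
df <- as.data.frame(m[, 2:3], stringsAsFactors = FALSE)
colnames(df) <- c("Time", "Obs")

# Step 4 (an addition, not part of the code above): convert both
# columns to numeric; as.character() guards against factor columns
df[] <- lapply(df, function(x) as.numeric(as.character(x)))
str(df)
```

After step 4, str(df) should report both Time and Obs as num.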
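Note that the above presumes that each of your strings (the elements of
the character vector) has a single leading " " and that the Time and Obs
values are separated by a single " ". If the spacing varies, strsplit()
will return a different number of components for some elements and the
matrix columns will no longer line up.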
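If the spacing is not that regular, one possible variation is to trim each string and split on runs of whitespace instead of on a single " ". This is only a sketch: the input vector x below is made up for illustration and is not from the original post, and trimws() and lengths() assume R 3.2.0 or later.

```r
# Hypothetical input with irregular spacing (not from the original post)
x <- c("  102.2   639", "104.2 224 ", " 105.1\t1159")

# Trim both ends, then split on one or more whitespace characters
parts <- strsplit(trimws(x), split = "[[:space:]]+")

# Every element should now have exactly two pieces
stopifnot(all(lengths(parts) == 2))

# Fill a 2-column matrix row by row and convert each column to numeric
m <- matrix(unlist(parts), ncol = 2, byrow = TRUE)
res <- data.frame(Time = as.numeric(m[, 1]), Obs = as.numeric(m[, 2]))
res
```

Because the strings are trimmed first, there is no empty leading component to discard, so the matrix needs only two columns here.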
See ?strsplit for more information.
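A different route, which swaps strsplit() for read.table(), may also be worth a look. It assumes an R version in which read.table() accepts a 'text' argument; in older versions the same thing can be done by passing textConnection(MyStrings) as the file. The name df2 is mine, and only the first three strings are used.

```r
# A small subset of the strings, for illustration
MyStrings <- c(" 102.2 639", " 104.2 224", " 105.1 1159")

# read.table() treats each vector element as one line, splits on runs
# of whitespace by default, and works out the column types itself
df2 <- read.table(text = MyStrings, col.names = c("Time", "Obs"))
str(df2)
```

With this approach Time comes back as numeric and Obs as integer, and leading or repeated spaces do not matter.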
HTH,
Marc Schwartz