[R] reading tokens
Dieter Menne
dieter.menne at menne-biomed.de
Tue Nov 3 09:02:25 CET 2009
j daniel wrote:
>
> I am not familiar with processing text in R. Can someone tell me how to
> read each line of words as separate elements in a list?
>
> For example, I would like to turn:
>
> word1 word2 word3
> word2 word4
>
> into a list of length two with three character elements in the first list
> and two elements in the second. I know that this should be easy, but I am
> a little confused by the text functions.
>
You could use scan. Also have a look at the gsubfn package, which includes a
demo showing additional features you are likely to need:
library(gsubfn)
demo("gsubfn-gries")  # the demo name contains a hyphen, so it must be quoted
....
The example code in the demo is rather deeply nested. To better understand
what is going on, unwrap it step by step.
So
tail(sort(table(unlist(strapply(Lines1, "\\w+", perl = TRUE)))))
is:
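x1 = strapply(Lines1, "\\w+", perl = TRUE)  # list: one vector of words per line
x1
x2 = unlist(x1)  # flatten the list into a single character vector
x2
x3 = table(x2)   # count how often each word occurs
x3
x4 = sort(x3)    # order by increasing count
x4
tail(x4)         # the last (most frequent) entries, six by default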
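For the two-line example in the question, the scan suggestion might look roughly like the sketch below. It uses base R only and keeps the sample text in a character vector; the names Lines, tokens, tokens2 and freq are made up for illustration, and with a real file you would first read the lines with readLines().

```r
# Sample input from the question, one string per line of text
Lines <- c("word1 word2 word3", "word2 word4")

# One list element per line, each a character vector of tokens
tokens <- lapply(Lines, function(l) scan(text = l, what = "", quiet = TRUE))
str(tokens)

# The same result without scan: split each line on runs of whitespace
tokens2 <- strsplit(Lines, "[[:space:]]+")

# The unwrapped steps from above, applied to the token list
freq <- sort(table(unlist(tokens)))
tail(freq)
```

The list in tokens is already what the question asks for; the frequency table is only needed if you want word counts as in the demo.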
Dieter