[R] reading tokens

Dieter Menne dieter.menne at menne-biomed.de
Tue Nov 3 09:02:25 CET 2009




j daniel wrote:
> 
> I am not familiar with processing text in R.  Can someone tell me how to
> read each line of words as separate elements in a list?
> 
> FE, I would like to turn:
> 
> word1 word2 word3
> word2 word4
> 
> into a list of length two with three character elements in the first list
> and two elements in the second.  I know that this should be easy, but I am
> a little confused by the text functions.
> 

You could use scan. Also have a look at the package gsubfn, which has a demo
that shows additional features you are going to need:

library(gsubfn)
demo("gsubfn-gries")
....

The example code in the demo is a bit overnested; to better understand what
is going on, unwrap it one call at a time:

So 
 tail(sort(table(unlist(strapply(Lines1, "\\w+", perl = TRUE)))))

is:

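x1 = strapply(Lines1, "\\w+", perl = TRUE)  # list: one vector of words per line
x1
x2 = unlist(x1)                             # flatten into one character vector
x2
x3 = table(x2)                              # count each distinct word
x3
x4 = sort(x3)                               # order by increasing frequency
x4
tail(x4)                                    # the most frequent words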
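
For the two lines in the original question, the scan route could look roughly
like the sketch below. It uses base R only; the variable names (lines, tokens,
counts) are made up for illustration, and the input is typed in directly where
you would normally call readLines() on your own file.

```r
# The two sample lines from the question. With a real file this would be
# something like lines <- readLines("myfile.txt") (file name hypothetical).
lines <- c("word1 word2 word3", "word2 word4")

# scan() splits on whitespace by default; what = "" asks for character data.
# lapply() runs it once per line, so the result is a list with one
# character vector per input line.
tokens <- lapply(lines, function(l) scan(text = l, what = "", quiet = TRUE))
str(tokens)

# The same unwrapped steps as above, applied to the token list.
counts <- sort(table(unlist(tokens)))
tail(counts)
```

Here tokens is already the list asked for (three elements, then two). The
counting steps are only needed if you want word frequencies as in the demo.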



Dieter







