[R] string handling
Gabor Grothendieck
ggrothendieck at gmail.com
Fri Jun 4 21:03:00 CEST 2010
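Here is a slightly simpler variant of the strapply solution. lapply
passes its extra arguments straight through to strapply, so no
anonymous function is needed: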
> lapply(DF, strapply, "(.)/(.)", c, simplify = rbind)
$var1
     [,1] [,2]
[1,] "G"  "G"
[2,] "A"  "T"
[3,] "G"  "G"

$var2
     [,1] [,2]
[1,] "C"  "T"
[2,] "C"  "C"
[3,] "A"  "A"
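
For anyone who cannot install gsubfn, here is a rough base R sketch of the same extraction. It swaps strapply for regexec() plus regmatches(), so it is an alternative technique rather than the gsubfn approach itself. The helper name extract_pair is made up for illustration, and the data frame is rebuilt by hand from the sample data in the thread.

```r
# Same sample data as in the thread, built directly as character columns.
DF <- data.frame(
  var1 = c("9G/G09", "10A/T9", "90G/G"),
  var2 = c("abd89C/T90", "32C/C", "A/A"),
  stringsAsFactors = FALSE
)

# extract_pair() is a hypothetical helper name, not part of any package.
extract_pair <- function(x) {
  # regexec() locates the first match and its two capture groups in each
  # string; regmatches() returns c(full match, group 1, group 2) per string.
  m <- regmatches(x, regexec("(.)/(.)", x))
  # Drop the full match and stack the two captured characters row by row.
  do.call(rbind, lapply(m, function(v) v[-1]))
}

res <- lapply(DF, extract_pair)
res
```

For this sample data the result should be the same pair of 3 x 2 character matrices shown above. One caveat: a string with no "/" match contributes an empty result, so the rows of the matrix may no longer line up with the input; check for that case before relying on row positions.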
On Fri, Jun 4, 2010 at 8:08 AM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> This solution, using strapply from the gsubfn package, is along the
> same lines as the stringr solution. First we read in the data with
> as.is = TRUE so that we get character rather than factor columns. If
> your data are already in factor columns, just replace
> strapply(x, ...) with strapply(as.character(x), ...) below. Then we
> lapply over the columns of DF, calling strapply on each one. See the
> home page at http://gsubfn.googlecode.com for more.
>
>> Lines <- "var1 var2
> + 9G/G09 abd89C/T90
> + 10A/T9 32C/C
> + 90G/G A/A"
>>
>> library(gsubfn)
>> DF <- read.table(textConnection(Lines), header = TRUE, as.is = TRUE)
>> lapply(DF, function(x) strapply(x, "(.)/(.)", c, simplify = rbind))
> $var1
>      [,1] [,2]
> [1,] "G"  "G"
> [2,] "A"  "T"
> [3,] "G"  "G"
>
> $var2
>      [,1] [,2]
> [1,] "C"  "T"
> [2,] "C"  "C"
> [3,] "A"  "A"
>
>
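
On the factor case mentioned in the quoted text: a minimal base R sketch of converting every factor column to character up front, so the extraction code can be used unchanged. The data frame here is rebuilt by hand from the sample data with stringsAsFactors = TRUE purely to simulate that situation, and DF_chr is just an illustrative name.

```r
# Simulate data that arrived as factor columns (an assumption made for
# this illustration; the thread itself reads the data with as.is = TRUE).
DF <- data.frame(
  var1 = c("9G/G09", "10A/T9", "90G/G"),
  var2 = c("abd89C/T90", "32C/C", "A/A"),
  stringsAsFactors = TRUE
)
sapply(DF, class)      # both columns are factors

# Convert every column to character in one step. Assigning into DF_chr[]
# keeps the data.frame structure instead of producing a plain list.
DF_chr <- DF
DF_chr[] <- lapply(DF, as.character)
sapply(DF_chr, class)  # both columns are now character
```

Converting once like this is an alternative to wrapping each call in as.character(x); which one reads better depends on whether the factor columns are needed again later.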
> Also, a slight simplification is possible using gsubfn's ability to
> represent a one-line function as a formula. We just preface lapply
> with fn$, and formulas appearing in the arguments (subject to
> certain rules) are then interpreted as functions. Here, the formula
> in the second argument to lapply is interpreted as the anonymous
> function we used above:
>
>> fn$lapply(DF, x ~ strapply(x, "(.)/(.)", c, simplify = rbind))
> $var1
>      [,1] [,2]
> [1,] "G"  "G"
> [2,] "A"  "T"
> [3,] "G"  "G"
>
> $var2
>      [,1] [,2]
> [1,] "C"  "T"
> [2,] "C"  "C"
> [3,] "A"  "A"
>
> On Thu, Jun 3, 2010 at 2:18 PM, karena <dr.jzhou at gmail.com> wrote:
>>
>> I have a data.frame as the following:
>> var1 var2
>> 9G/G09 abd89C/T90
>> 10A/T9 32C/C
>> 90G/G A/A
>> . .
>> . .
>> . .
>> 10T/C 00G/G90
>>
>> What I want is to get the letters immediately to the left and right of '/'.
>> For example, for "9G/G09" I only want "G" and "G", and for "abd89C/T90" I
>> only want "C" and "T". How can I get these?
>>
>> thank you,
>>
>> karena
>> --
>> View this message in context: http://r.789695.n4.nabble.com/string-handling-tp2242119p2242119.html
>> Sent from the R help mailing list archive at Nabble.com.