[Rd] proposed changes to RSiteSearch

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Fri May 8 19:25:29 CEST 2009


hadley wickham wrote:
> On Fri, May 8, 2009 at 10:11 AM, Romain Francois
> <romain.francois at dbmail.com> wrote:
>   
>> strapply in package gsubfn brings elegance here:
>>
>>     
>>> txt <- '<foo>bar</foo>'
>>> rx <- "<(.*?)>(.*?)</(.*?)>"
>>> strapply( txt, rx, c , perl = T )
>>>       
>> [[1]]
>> [1] "foo" "bar" "foo"
>>
>> Too bad you have to pay this on performance:
>>
>>     
>>> txt <- rep( '<foo>bar</foo>', 1000 )
>>> rx <- "<(.*?)>(.*?)</(.*?)>"
>>> system.time( out <- strapply( txt, rx, c , perl = T ) )
>>>       
>>  user  system elapsed
>>  2.923   0.005   3.063
>>     
>>> system.time( out2 <- sapply( paste('\\', 1:3, sep=''), function(x){
>>>       
>> + gsub(rx, x, txt, perl=TRUE)
>> + } ) )
>>  user  system elapsed
>>  0.011   0.000   0.011
>>
>> Not sure what the right play i
>>     
>
> For me:
>
>   
>> system.time( out <- strapply( txt, rx, c , perl = T ) )
>>     
>    user  system elapsed
>   0.004   0.000   0.004
>
>   
>> system.time( out2 <- sapply( paste('\\', 1:3, sep=''), function(x){
>>     
> + gsub(rx, x, txt, perl=TRUE)
> + } ) )
>    user  system elapsed
>       0       0       0
>   

for me:

    txt <- '<foo>bar</foo>'
    rx <- '<(.*?)>(.*?)</(.*?)>'

    library(gsubfn)      # provides strapply
    library(rbenchmark)  # provides benchmark

    # elapsed time in seconds for 1000 replications, fastest first
    benchmark(replications=1000, columns=c('test', 'elapsed'), order='elapsed',
       sapply=sapply(paste('\\', 1:3, sep=''),
          function(x) gsub(rx, x, txt, perl=TRUE)),
       mapply=mapply(gsub, rx, paste('\\', 1:3, sep=''), txt, perl=TRUE),
       strapply=strapply(txt, rx, c, perl=TRUE))
    # 2   mapply   0.151
    # 1   sapply   0.166
    # 3 strapply   1.917
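
a minimal sketch of what the sapply/gsub variant computes, in case the trick is not obvious: each call replaces the whole match with a single backreference, so call i returns capture group i. the second input string and the names refs and groups are made up for illustration; they are not from the thread.

```r
# base R only: extract each capture group by replacing the whole match
# with one backreference (\1, \2, \3)
txt <- c('<foo>bar</foo>', '<a>b</a>')   # second element is illustrative
rx <- '<(.*?)>(.*?)</(.*?)>'
refs <- paste('\\', 1:3, sep='')         # the strings \1, \2, \3

# one gsub call per group; with length(txt) > 1 sapply simplifies the
# result to a character matrix, one row per input, one column per group
groups <- sapply(refs, function(ref) gsub(rx, ref, txt, perl=TRUE))

print(unname(groups))
```

this only yields the bare groups when the pattern matches the entire string: gsub leaves any text outside the match in place, and returns non-matching strings unchanged.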

vQ


