[R] What is the fastest way to search for a pattern in a data frame with a few million entries?

Bert Gunter bgunter.4567 at gmail.com
Mon Apr 11 00:41:10 CEST 2016


Fabien:

I was unable to make any sense of your latest response (maybe I'm just
dense). If others have similar difficulties, and you fail to get a
satisfactory response, I suggest that you read and follow the posting
guide's request for a *small, reproducible example* (perhaps the
first few dozen rows of your data frame) in which you show the code
you tried and your desired result.
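
As an illustration only, such an example might look something like the
sketch below. The column names `ngram` and `freq`, the toy rows, and the
prefix are all assumptions made up for this sketch; they are not taken from
your data.

```r
# Hypothetical toy data: `ngram` and `freq` are assumed column names.
df <- data.frame(
  ngram = c("the", "the cat", "the cat sat", "the dog", "a cat sat on"),
  freq  = c(120L, 15L, 4L, 9L, 2L),
  stringsAsFactors = FALSE
)

# The leading words to search for (also an assumption).
prefix <- "the cat"

# Code tried: keep the rows whose n-gram starts with the prefix.
# startsWith() compares fixed strings (no regular expression involved).
hits <- df[startsWith(df$ngram, prefix), ]

# Desired result: the matching rows and their summed frequency.
print(hits)
total <- sum(hits$freq)
print(total)
```

Note that startsWith() needs R 3.3.0 or later; on older versions,
substr(df$ngram, 1, nchar(prefix)) == prefix does the same comparison. A
plain prefix test would also match "the cats", so state in your example
whether matches must end on a word boundary.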


Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)


On Sun, Apr 10, 2016 at 12:27 PM, Fabien Tarrade
<fabien.tarrade at gmail.com> wrote:
> Hi Duncan,
>>
>> Didn't you post the same question yesterday?  Perhaps nobody answered
>> because your question is unanswerable.
>
> Sorry, I got an email saying that my message was waiting for approval, and
> when I looked at the forum I didn't see it, which is why I sent it again.
> This time I checked that the format of my message was text only. Sorry
> for the noise.
>>
>> You need to describe what the strings are like and what the patterns are
>> like if you want advice on speeding things up.
>
> My strings are 1-grams up to 5-grams (sequences of 1 word up to 5 words),
> and I am looking up the frequency, in my DF, of the strings that start with
> a given sequence of a few words.
>
> I guess these days it is standard to use DFs with millions of entries, so I
> was wondering how people do that in the fastest way.
>
> Thanks
> Cheers
> Fabien
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.