[R] what is the fastest way to search for a pattern in a data frame with a few million entries?
fabien.tarrade at gmail.com
Sun Apr 10 21:27:16 CEST 2016
> Didn't you post the same question yesterday? Perhaps nobody answered
> because your question is unanswerable.
Sorry, I got an email saying that my message was waiting for approval,
and when I looked at the forum I didn't see my message; this is why I
sent it again. This time I checked that the format of my message was
text only. Sorry for the noise.
> You need to describe what the strings are like and what the patterns
> are like if you want advice on speeding things up.
My strings are 1-grams up to 5-grams (sequences of 1 to 5 words), and I
am looking up the frequency, in my data frame, of the strings that start
with a given sequence of a few words.
I guess these days it is standard to use data frames with millions of
entries, so I was wondering how people do that in the fastest way.
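
To make the question concrete, here is a minimal sketch of the kind of
prefix lookup described above. It is only an illustration under stated
assumptions: the column names `ngram` and `freq`, the toy data, and the
helper names are hypothetical, and `startsWith()` requires base R 3.3.0
or later. The idea is to avoid the regex engine for fixed prefixes and to
narrow the candidate rows with an index on the first word.

```r
# Hypothetical n-gram table; the column names `ngram` and `freq` are assumptions.
ngrams <- data.frame(
  ngram = c("i", "i am", "i am here", "i am not", "you are", "you are here"),
  freq  = c(50L, 20L, 5L, 7L, 30L, 4L),
  stringsAsFactors = FALSE
)

# Fixed-prefix match: startsWith() compares plain prefixes and so avoids
# the regex machinery that grepl("^i am ", ...) would go through.
prefix_matches <- function(df, prefix) {
  df[startsWith(df$ngram, prefix), , drop = FALSE]
}

# Optional index: row numbers grouped by first word, built once up front.
first_word <- sub(" .*$", "", ngrams$ngram)
index <- split(seq_len(nrow(ngrams)), first_word)

# Indexed lookup: only rows sharing the prefix's first word are scanned.
# Assumes the prefix begins with a complete first word.
prefix_matches_indexed <- function(df, index, prefix) {
  key  <- sub(" .*$", "", prefix)
  rows <- index[[key]]
  if (is.null(rows)) return(df[0, , drop = FALSE])
  df[rows[startsWith(df$ngram[rows], prefix)], , drop = FALSE]
}

hits <- prefix_matches(ngrams, "i am ")
hits <- hits[order(-hits$freq), ]          # most frequent continuation first
hits_indexed <- prefix_matches_indexed(ngrams, index, "i am ")
```

Whether the index pays off depends on how many distinct first words the
data contains; for a one-off search the plain `startsWith()` scan may
already be sufficient.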
Dr Fabien Tarrade
Quantitative Analyst/Developer - Data Scientist
Senior data analyst specialised in the modelling, processing and
statistical treatment of data.
PhD in Physics, 10 years of experience as researcher at the forefront of
international scientific research.
Fascinated by finance and data modelling.
Email : contact at fabien-tarrade.eu
Web : www.fabien-tarrade.eu
Phone : +33 (0)6 14 78 70 90
LinkedIn <http://ch.linkedin.com/in/fabientarrade/> Twitter
<skype:fabtarhiggs?call> Xing <https://www.xing.com/profile/Fabien_Tarrade>