[R] Memory management in R

jim holtman jholtman at gmail.com
Fri Oct 8 19:30:59 CEST 2010


More specifics are needed: how long is the string, and what pattern
are you matching against?  It sounds like you might have a complex
pattern that causes a lot of backtracking when it is matched against
the string.  The O'Reilly book "Mastering Regular Expressions" might
help you understand what is happening.  If you can provide a better
example than just the error message, it would be helpful.
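
A minimal sketch of one thing to try, assuming the goal really is a
plain substring test as the original post describes.  The data below
are hypothetical (a repeated token similar to the one in the error
message); only the variable names fut_string and past_string come from
the post.  With fixed = TRUE, grepl() treats the pattern as a literal
string, so no regular expression is compiled and the regcomp step that
reported 'Out of memory' should not be reached.

```r
# Hypothetical data: a long string built from a repeated token,
# loosely modelled on the pattern shown in the error message.
token       <- "12653a6#"
past_string <- paste(rep(token, 5000), collapse = "")
fut_string  <- paste(rep(token, 2000), collapse = "")

# Report the sizes involved (the first thing asked for above).
len_past <- nchar(past_string)
len_fut  <- nchar(fut_string)

# Literal substring test: fixed = TRUE means the pattern is matched
# as-is, character by character, and is not compiled as a regex.
found <- grepl(fut_string, past_string, fixed = TRUE)

# Metacharacters lose their special meaning with fixed = TRUE:
# "." matches any character as a regex, but only "." literally here.
dot_as_regex   <- grepl("a.b", "axb")
dot_as_literal <- grepl("a.b", "axb", fixed = TRUE)
```

Whether this is appropriate depends on whether the real strings are
meant to be interpreted as patterns at all; if they are, the pattern
itself needs to be looked at.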

On Fri, Oct 8, 2010 at 1:11 PM, Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
> Dear All,
> I am experiencing some problems with a script of mine.
> It crashes with this message
>
> Error in grepl(fut_string, past_string) :
>  invalid regular expression
> '12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12653a6#12
> Calls: entropy_estimate_hash -> total_entropy_lz -> entropy_lz -> grepl
> In addition: Warning message:
> In grepl(fut_string, past_string) : regcomp error:  'Out of memory'
> Execution halted
>
> To make a long story short, I use some functions which eventually call grepl
> on very long strings to check whether a certain substring is part of a
> longer string.
> Now, the script technically works (it never crashes when I run it on a
> smaller dataset) and the problem does not seem to be RAM memory (I have
> several GB of RAM on my machine and its consumption never shoots up so my
> machine never resorts to swap memory).
> So (though I am not an expert) it looks like the problem is some limitation
> of grepl or R memory management.
> Any idea about how I could tackle this problem or how I can profile my code
> to fix it?  (It really seems to me that I have to find a way to allow R
> to process longer strings.)
> Any suggestion is appreciated.
> Cheers
>
> Lorenzo
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


