[Rd] string-length limitations

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Jul 12 16:38:22 CEST 2006

On Wed, 12 Jul 2006, jake wilson wrote:

> Hi,
> I'm attempting to "glm" a formula - something that's not caused problems in 
> the past.  I've used formulas of the form
>     formula( "dependent-variable~independent-variables" )
> where the independent variable string is of the form:
>     "indvar1+indvar2+...+indvarN"

Why the quotes?  I think that is your problem.

> Now, however, our independent variable strings are quite long (hundreds of 
> variables) - R dies with an "input buffer overflow" error.  

It is normal to use a call such as glm(y ~ ., data=mydata) to avoid such 
formulae: the '.' stands for all the other columns of mydata.
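
A minimal sketch of that idiom, assuming a data frame whose columns are the 
response y and the predictors only.  The data here are simulated, and the 
names mydata, n and p are purely illustrative:

```r
# Simulated data: n observations on p predictors plus a response y.
set.seed(1)
n <- 200
p <- 50
mydata <- as.data.frame(matrix(rnorm(n * p), n, p))
names(mydata) <- paste("indvar", 1:p, sep = "")
mydata$y <- rnorm(n)

# '.' expands to every column of mydata other than the response,
# so no long formula string is ever typed or parsed.
fit <- glm(y ~ ., data = mydata)
length(coef(fit))  # intercept plus one coefficient per predictor
```

If only some columns are wanted, subsetting the data frame first keeps the 
formula short.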

>  I've tried writing out the code to files and sourcing them, as well as 
> building the strings incrementally in R, but these have not worked 
> either.  I have come to believe there is a maximum length for char 
> strings - some sort of fundamental limitation.  Is there such a 
> max-length and, if so, is there a way I can work with long strings of 
> the sort referenced above?

The limit on the length of a character string is 2^31 - 1 bytes, which is 
not relevant here.

Your message is coming from the parser, and suggests that it is trying to 
parse a piece of text longer than MAXELTSIZE bytes.  The latter depends on 
the platform (unstated: do see the posting guide) and is often 8192 bytes.  
So there is a limit on the length of quoted strings which can be input.
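
To illustrate the distinction, assuming the overflow arises only when literal 
text is parsed: a string assembled at run time does not pass through the 
parser's input buffer, so it can be far longer than a few thousand bytes.  A 
small sketch (the length 100000 is arbitrary):

```r
# Build a long string at run time rather than typing it as a literal.
long <- paste(rep("x", 100000), collapse = "")

# The string is held as an ordinary character vector of length one.
nchar(long)
```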

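However, what is wrong with, say,

# build "indvar1+indvar2+...+indvar1000" at run time
tmp <- paste(paste("indvar", 1:1000, sep=""), collapse="+")
tmp <- paste("y ~", tmp)
# parse the text and evaluate it to obtain a formula object
form <- eval(parse(text=tmp))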
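
Two alternatives that avoid an explicit eval(parse()) can be sketched as 
follows, assuming as above that the response is called y and the predictors 
indvar1, ..., indvar1000.  as.formula() converts the string directly, and 
reformulate() builds the formula from a vector of term labels; the object 
names vars, form1 and form2 are illustrative only:

```r
# Predictor names, generated rather than typed.
vars <- paste("indvar", 1:1000, sep = "")

# Option 1: paste the right-hand side together and convert the string.
form1 <- as.formula(paste("y ~", paste(vars, collapse = "+")))

# Option 2: let reformulate() assemble the formula from the term labels.
form2 <- reformulate(vars, response = "y")

class(form1)
class(form2)
```

Either object can then be passed to glm() together with the data frame 
holding the variables.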


Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
