[Rd] string-length limitations

Thomas Lumley tlumley at u.washington.edu
Wed Jul 12 16:37:29 CEST 2006


On Wed, 12 Jul 2006, jake wilson wrote:
> I'm attempting to "glm" a formula - something that's not caused problems in
> the past.  I've used formulas of the form
>
>    formula( "dependent-variable~independent-variables" )
>
> where the independent variable string is of the form:
>
>    "indvar1+indvar2+...+indvarN"
>
> Now, however, our independent variable strings are quite long (hundreds of
> variables) - R dies with an "input buffer overflow" error.  I've tried
> writing out the code to files and sourcing them, as well as building the
> strings incrementally in R, but these have not worked either.  I have come
> to believe there is a maximum length for char strings - some sort of
> fundamental limitation.  Is there such a max-length and, if so, is there a
> way I can work with long strings of the sort referenced above?
>

How long are the strings, and where does the error occur? Calling
traceback() right after the error will tell you where.
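
In an interactive session you would just call traceback() after the
error. If the code runs from a script, something like the following may
help capture the call stack instead. It is only a sketch: f() and g() are
hypothetical stand-ins for the real glm() call, and the error message is
made up for illustration.

```r
# Hypothetical stand-ins for the failing code; replace g() with the real call.
f <- function() stop("stand-in error")
g <- function() f()

calls <- NULL
res <- tryCatch(
  # The calling handler runs while the failing frames are still on the stack,
  # so sys.calls() records the chain of calls that led to the error.
  withCallingHandlers(g(), error = function(e) calls <<- sys.calls()),
  # The exiting handler then turns the error into its message.
  error = function(e) conditionMessage(e)
)

print(res)
print(calls)
```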

With
fn <- function(n) {
  rhs <- paste("xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", 1:n, collapse = "+", sep = "")
  formula(paste("y", rhs, sep = "~"))
}

I can run terms(fn(500)) with no problems. This is a string of more than
15500 characters, and it produces a terms object over a megabyte in size.
This suggests that it isn't a string-length problem, unless you really
want formulas larger than this.
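
If you want to check these figures on your own machine, a minimal sketch
along these lines should do. long_formula_string() is a hypothetical
helper name, and the 31-character "x" prefix is an assumption chosen to
match the example above.

```r
# Hypothetical helper: build the formula as a plain string so that its
# length can be inspected before it is converted to a formula.
long_formula_string <- function(n, prefix = strrep("x", 31)) {
  rhs <- paste(prefix, 1:n, sep = "", collapse = "+")
  paste("y", rhs, sep = "~")
}

s <- long_formula_string(500)
print(nchar(s))                          # total length, incl. indices and "+"

tt <- terms(formula(s))
print(length(attr(tt, "term.labels")))   # number of terms on the right side
print(object.size(tt), units = "Mb")     # memory used by the terms object
```

If this runs cleanly for a string as long as yours, the limit you are
hitting is probably somewhere other than the string itself.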

 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


