[R] a question of substitute

Adrian Dusa dusa.adrian at gmail.com
Thu Jan 11 09:39:33 CET 2007


Dear Prof. Ripley,

Thank you for this extensive explanation. It looks like my first solution is 
similar to (b): creating new variables inside the wrapper (and a new 'data' 
if 'data' is not missing).
This course is only introductory, with simple models, and I point students 
to each test separately if they want anything more complicated.

I'm looking forward to the release of the 2.5.0 version.
Best regards,
Adrian


On Thursday 11 January 2007 03:08, Prof Brian Ripley wrote:
> The 'Right Thing' is for oneway.test() to allow a variable for the first
> argument, and I have altered it in R-patched and R-devel to do so. So if
> your students can make use of R-patched that would be the best solution.
> If not, perhaps you could make a copy of oneway.test from R-patched
> available to them.  Normally I would worry about namespace issues, but it
> seems unlikely they would matter here: if they did assignInNamespace is
> likely to work to insert the fix.
>
> Grothendieck's suggestions are steps towards a morass: they may work in
> simple cases but can make more complicated ones worse (such as looking for
> 'data' in the wrong place).  These model fitting functions have rather
> precise requirements for where they look for their components:
>
>     'data'
>     the environment of 'formula'
>     the environment of the caller
>
> and that includes where they look for 'data'.  It is easy to use
> substitute or such to make a literal formula out of 'formula', but doing
> so changes its environment.  So one needs to either
>
> (a) fix up an environment within which to evaluate the modified call that
> emulates the scoping rules or
>
> (b) create a new 'data' that has references to all the variables needed,
> and just call the function with the new 'formula' and new 'data'.
>
> At first sight model.frame() looks the way to do (b), but it is not, since
> if there are function calls in the formula (e.g. log()) the model frame
> includes the derived variables and not the original ones.  There are
> workarounds (e.g. in glmmPQL), like using all.vars, creating a formula
> from that, setting its environment to that of the original function and
> then calling model.frame.
>
> This comes up often enough that I have contemplated adding a solution to
> (b) to the stats package.
>
> Doing either of these right is really pretty complicated, and not
> something to dash off code in a fairly quick reply (or even to check that
> the code in glmmPQL was general enough to be applicable).
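
For reference, the workaround mentioned above for (b) (all.vars, a formula
built from it, the original environment, then model.frame) might be sketched
roughly as below. The name oneway_wrapper and its arguments are made up for
illustration, and this is not the code from glmmPQL. It assumes a version of R
in which oneway.test() accepts a variable holding the formula, it uses the
environment of the user's formula as the "original" environment, and it does
not handle '.' in the formula or a 'subset' argument.

```r
# Hypothetical wrapper: the name and arguments are assumptions, not from the thread.
# Strategy (b): gather the ORIGINAL variables named in the formula into a new
# data frame, then call the test with the user's formula and that data frame.
oneway_wrapper <- function(formula, data = NULL, var.equal = FALSE) {
  # all.vars() returns the underlying names, e.g. "y" rather than "log(y + 10)".
  vars <- all.vars(formula)

  # One-sided formula that mentions each variable once, untransformed.
  vars_formula <- reformulate(vars)

  # Keep the scoping of the user's formula, so that variables missing from
  # 'data' are looked up where the user's formula would look for them.
  environment(vars_formula) <- environment(formula)

  # na.pass leaves missing values for oneway.test() itself to deal with.
  new_data <- model.frame(vars_formula, data = data, na.action = na.pass)

  oneway.test(formula, data = new_data, var.equal = var.equal)
}

# Usage: 'y' lives in the calling environment, only 'g' is in the data frame.
set.seed(1)
y <- c(rnorm(10, 0), rnorm(10, 1), rnorm(10, 2))
d <- data.frame(g = gl(3, 10))
f <- log(y + 10) ~ g
res <- oneway_wrapper(f, data = d)
```

Because new_data holds y and g themselves rather than log(y + 10), the
transformation in the formula is applied only once, inside oneway.test().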

-- 
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd
050025 Bucharest sector 5
Romania
Tel./Fax: +40 21 3126618 \
          +40 21 3120210 / int.101


