[R] premature evaluation of symbols. Is that the way to describe this problem?

Paul Johnson pauljohn32 at gmail.com
Thu Apr 10 16:59:47 CEST 2014


Dear eveRybody

In the package rockchalk, I have functions that take regressions and
make tables and plots.  Sometimes I'll combine a model with various
arguments and pass the resulting list around to be executed, usually
using do.call.  While debugging a function on a particularly large
dataset, I noticed that I've caused an inconvenience for myself. I get
the correct calculations, but the arguments to the functions are not
symbols when the function call happens. They are fully "written out".

So the debugger does not see function(x), it sees
function(c(1,3,1,3,45,2,4,2....).  Debugging begins with a
comprehensive listing of the elements in every object, which freezes
Emacs/ESS and generally ruins my quality of life.

I don't know the right words for this, but it seems to me that object
names are parsed and evaluated before they need to be. The function
calls contain the evaluated arguments, not just the symbols for them.
I have some guesses about fixes and wonder what you think. I also
wonder whether fixing this problem might make my functions faster and
more efficient in general, because I would no longer be passing
gigantic collections of numbers through the function call.  I know, I
don't have all the right words here.

I've built a toy example that illustrates the problem; you tell me
the words for it.

## Paul Johnson 2014-04-10
dat <- data.frame(x = rnorm(50),y = rnorm(50))
m1 <- lm(y ~ x, dat)

myRegFit <- function(model, nd) predict(model, nd)

mySpecialFeature <- function(model, ci){
    pargs <- list(model, model.frame(model)[1:3, ])
    res <- do.call(myRegFit, pargs)
    print(res)
}

mySpecialFeature (m1)

debug(myRegFit)

mySpecialFeature (m1)

Note that when the debugger enters, the whole model's structure is
"splatted" into the function call.

> dat <- data.frame(x = rnorm(50),y = rnorm(50))
> m1 <- lm(y ~ x, dat)
> myRegFit <-function(model, nd) predict(model, nd)
> mySpecialFeature <- function(model, ci){
+     pargs <- list(model, model.frame(model)[1:3, ])
+     res <- do.call(myRegFit, pargs)
+     print(res)
+ }
> mySpecialFeature (m1)
          1           2           3
-0.04755431  0.35162844 -0.11715522
> debug(myRegFit)
> mySpecialFeature (m1)
debugging in: (function (model, nd)
predict(model, nd))(list(coefficients = c(0.0636305741709566,
-0.177786836929453), residuals = c(-0.0152803151982162, -0.885875162659858,
-1.23645319405006, -1.77900639571943, -1.9952045397527, 1.38150266407176,
-2.27403449262599, 0.0367524776530579, -0.881037818467492, -1.10816713568432,
-0.55749829201921, -0.372526253742828, -0.353208893775679, 0.531708523456415,
-0.43187124865558, 1.03973431972897, 0.849170115617157, 1.11227803262189,
0.47216440383252, 0.920060697785203, -0.374672861268964, 2.94683565121636,
0.514112041811711, -0.52321362055969, -0.0412387814196237, 0.983863448669766,
0.534230127442599, -0.869960511196742, 1.90586406082412, -1.84705932449576,
0.806425475391075, 1.90939977897903, 0.41030042787483, 0.994503041407507,
0.715719209301158, -0.538096591457249, -0.482411304681239, 0.0323998214753804,
0.551162374882342, -0.618989357027834, 1.08996565055366, -0.697423620816604,
1.38170655971013, 1.55752893685726, -0.0929258405664267, -1.00210610433922,
-1.51879925258188, -1.57050250989563, -1.06868502360026, 0.458860605094578
), effects = c(-0.661274203468574, -1.07255577360914, -1.36123824096605,
-1.71875465303308, -1.8806486154155, 1.38636416232103, -2.29259163100096,
0.153263315278269, -0.950879523052079, -0.963705724647863, -0.62175976245114,
-0.423965256680951, -0.350662885659068, 0.469175412149025, -0.400505083448679,
1.03440116252973, 0.878739280288923, 1.16574672001397, 0.358222935858666,
1.00514946836967, -0.316592303881481, 2.8507611924072, 0.573209391002668,
-0.393720180068215, 0.0971873363200073, 1.23818281311352, 0.449576129222722,
-0.929511151618747, 1.97922180250824, -1.80820009744905, 0.877996855335966,
1.86871623414376, 0.226023354471842, 0.814892815951223, 0.821400980265047,
-0.536299037896556, -0.358703204255386, 0.105714598012197, 0.543301738010905,
-0.643659132172249, 1.26412624281219, -0.808498804261978, 1.32273956796476,
1.57655585529458, 0.022266343917185, -1.20958321975888, -1.52288584310647,
-1.60904879400386, -1.08384090772898, 0.567729611277337), rank = 2L,
    fitted.values = c(-0.0475543078742839, 0.351628437239565,
    -0.117155221544445, 0.165041007767092, 0.247859323684996,
    0.0805663573239046, 0.0448510221995737, 0.250840726246576,
    -0.0333621333428176, 0.293467635417987, -0.0248518201238109,
    -0.00529650906918994, 0.0770350456001708, -0.0222159311803766,
    0.120988140963125, 0.0650186744394092, 0.118247568257686,
    0.154696293986828, -0.100617877288265, 0.202919505525394,
    0.161729772710581, -0.0733692278375946, 0.163280463324874,
    0.270640256833884, 0.284263319806951, 0.461009994308065,
    -0.0559520919143943, -0.0176674192887905, 0.185028727541523,
    0.132415672762266, 0.182304379869478, 0.011106443703975,
    -0.207885424899216, -0.200768100305762, 0.234325514404888,
    0.0758935912226656, 0.261817140331961, 0.184963202179859,
    0.0611640613395693, 0.0355287514706238, 0.338761314434586,
    -0.0962465590881959, -0.0167773073486601, 0.102169781012095,
    0.248829672417071, -0.243267385378769, 0.0669197905143708,
    0.0143659410153998, 0.0500382129484238, 0.239186308642765
    ), assign = 0:1, qr = list(qr = c(-7.07106781186548, 0.14142135623731,
    0.14142135623731, 0.14142135623731, 0.14142135623731, 0.14142135623731,
    0.14142135623731, 0.14142135623731, 0.14142135623731, 0.14142135623731,
    0.14142135623731, 0.14142135623731, 0.14142135623731, 0.14142135623731,
    0.14142135623731, 0.14142135623731, 0.14142135623731, 0.14142135623731,
    0.14142135623731, 0.14142135623731, 0.14142135623731, 0.14142135623731,
    0.14142135623731, 0.14142135623731, 0.14142135623731, 0.14142135623731,
    0.14142135623731, 0.14142135623731, 0.14142135623731, 0.14142135623731,
    0.14142135623731, 0.14142135623731, 0.14142135623731, 0.14142135623731,
    0.14142135623731, 0.14142135623731, 0.14142135623731, 0.14142135623731,
    0.14142135623731, 0.14142135623731, 0.14142135623731, 0.14142135623731,
    0.14142135623731, 0.14142135623731, 0.14142135623731, 0.14142135623731,
    0.14142135623731, 0.14142135623731, 0.14142135623731, 0.14142135623731,
    1.18871623033411, 6.0328188078105, -0.18012556382541, 0.0829807810893617,
    0.160196640586552, 0.00422063404457838, -0.0290786460517367,
    0.162976358560143, -0.102000873004284, 0.202719661631434,
    -0.0940662616503503, -0.0758338195379766, 0.000928206918060759,
    -0.0916086841440873, 0.0419079829301299, -0.0102752861369684,
    0.0393528032622073, 0.0733358618834786, -0.164706930442335,
    0.118296891160581, 0.0798935429820433, -0.139301585451296,
    0.0813393331707195, 0.181436499351408, 0.194137995449112,
    0.358928189908056, -0.123062676154681, -0.0873678679520487,
    0.101616380529727, 0.0525624701654998, 0.0990763283113914,
    -0.0605404863829379, -0.264718090934247, -0.258082235933901,
    0.147578360387876, -0.000136030863872082, 0.173210245091236,
    0.101555287798443, -0.0138691440925964, -0.0377702879768747,
    0.244949334093302, -0.160631321222164, -0.0865379699066212,
    0.0243626389824618, 0.161101347601284, -0.297706548364757,
    -0.0085027759125684, -0.0575014860888491, -0.0242423560643218,
    0.152110333789614), qraux = c(1.14142135623731, 1.25694602752555
    ), pivot = 1:2, tol = 1e-07, rank = 2L), df.residual = 48L,
    xlevels = list(), call = lm(formula = y ~ x, data = dat),
    terms = y ~ x, model = list(y = c(-0.0628346230725001, -0.534246725420292,
    -1.3536084155945, -1.61396538795233, -1.7473452160677, 1.46206902139567,
    -2.22918347042642, 0.287593203899634, -0.91439995181031,
    -0.81469950026633, -0.582350112143021, -0.377822762812018,
    -0.276173848175508, 0.509492592276038, -0.310883107692455,
    1.10475299416838, 0.967417683874843, 1.26697432660872, 0.371546526544255,
    1.1229802033106, -0.212943088558383, 2.87346642337877, 0.677392505136584,
    -0.252573363725805, 0.243024538387328, 1.44487344297783,
    0.478278035528205, -0.887627930485533, 2.09089278836564,
    -1.71464365173349, 0.988729855260553, 1.920506222683, 0.202415002975614,
    0.793734941101744, 0.950044723706047, -0.462203000234584,
    -0.220594164349278, 0.21736302365524, 0.612326436221911,
    -0.58346060555721, 1.42872696498824, -0.793670179904799,
    1.36492925236147, 1.65969871786935, 0.155903831850644, -1.24537348971799,
    -1.45187946206751, -1.55613656888023, -1.01864681065184,
    0.698046913737343), x = c(0.6253830934028, -1.6199054330602,
    1.01686828360155, -0.570404622454562, -1.03623391189047,
    -0.0952589260568722, 0.105629597194727, -1.05300344676195,
    0.545556179461482, -1.29276759301495, 0.497688106852802,
    0.387695087164958, -0.0753963097646713, 0.482861987051367,
    -0.322619873230144, -0.00780766614911678, -0.307204937272165,
    -0.512218572469498, 0.9238504621374, -0.783460315510917,
    -0.551779874336542, 0.770584619056546, -0.560502064578936,
    -1.16437013132232, -1.24099595586789, -2.23514534034255,
    0.672618221633596, 0.457277911367569, -0.682829817253222,
    -0.38689646421126, -0.667506142457625, 0.295433179273129,
    1.52719967214396, 1.48716676129197, -0.960110113785672, -0.0689759560578443,
    -1.11474262990373, -0.682461255874912, 0.0138734277181953,
    0.158064698071454, -1.54753155529058, 0.899263050180665,
    0.452271286830549, -0.216771992273148, -1.0416918453845,
    1.7262130585714, -0.0185008991679137, 0.277099441142009,
    0.0764531359986249, -0.987450688160168))), list(y = c(-0.0628346230725001,
-0.534246725420292, -1.3536084155945), x = c(0.6253830934028,
-1.6199054330602, 1.01686828360155)))
debug: predict(model, nd)
Browse[2]>
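
The same effect can be seen without the debugger. This is only a
minimal sketch: showCall is a hypothetical helper (not part of
rockchalk) that returns the call as R recorded it, via sys.call().
Called directly, the recorded call holds symbols; called through
do.call with a list of objects, it holds the objects themselves.

```r
## Hypothetical helper, for illustration only: returns the recorded call.
showCall <- function(model, nd) sys.call()

dat <- data.frame(x = rnorm(50), y = rnorm(50))
m1 <- lm(y ~ x, dat)

## Direct call: the arguments are recorded as symbols.
direct <- showCall(m1, dat)
print(direct)
print(is.name(direct[[2]]))

## Through do.call with a list of objects: the fitted model itself
## is embedded in the call, which is what the debugger then prints.
viaDoCall <- do.call(showCall, list(m1, dat))
print(is.name(viaDoCall[[2]]))
print(class(viaDoCall[[2]]))
```

I print only the class of the embedded argument here, because
printing viaDoCall itself would list every number again.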


I wish the debug output looked more like what R's own lm gives. Note
that the contents of "dat" are not splatted into the middle of the
function call.

> m1 <- lm(y ~ x, dat)
debugging in: lm(y ~ x, dat)
debug: {
    ret.x <- x
    ret.y <- y
    cl <- match.call()
    mf <- match.call(expand.dots = FALSE)
    m <- match(c("formula", "data", "subset",


I've been reading quite a while on this question, testing lots of
ideas. The quote function seems to work, but I worry about how the R
runtime environment finds all the pieces if I pass them through this
way.

mySpecialFeature <- function(model, ci){
    pargs <- list(quote(model), quote(model.frame(model)[1:3, ]))
    res <- do.call(myRegFit, pargs)
    print(res)
}

See, that fixes it:

> mySpecialFeature (m1)
debugging in: (function (model, nd)
predict(model, nd))(model, model.frame(model)[1:3, ])
debug: predict(model, nd)
Browse[2]> c
exiting from: (function (model, nd)
predict(model, nd))(model, model.frame(model)[1:3, ])
          1           2           3
-0.04755431  0.35162844 -0.11715522
>

Are there dangers in this I don't know about?
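
One possible danger I can sketch myself, under assumptions: the quoted
pieces are only evaluated when the call runs, in the environment given
by do.call's envir argument (by default, the frame do.call is called
from). If the list is passed to some other function first, the name
"model" may not be found there. runElsewhere and runWithEnv below are
hypothetical names for illustration, and the sketch assumes no object
called "model" exists in the workspace.

```r
myRegFit <- function(model, nd) predict(model, nd)

dat <- data.frame(x = rnorm(50), y = rnorm(50))
m1 <- lm(y ~ x, dat)

## Quoted argument list, built the same way as in the fix above.
pargs <- list(quote(model), quote(model.frame(model)[1:3, ]))

## Hypothetical runner: nothing called 'model' is visible from its
## frame (assuming the workspace has no such object), so the call fails.
runElsewhere <- function(args) do.call(myRegFit, args)
res1 <- try(runElsewhere(pargs), silent = TRUE)
failed <- inherits(res1, "try-error")
print(failed)

## Hypothetical runner that is told where the quoted pieces live.
runWithEnv <- function(args, env) do.call(myRegFit, args, envir = env)
env <- list2env(list(model = m1))
res2 <- runWithEnv(pargs, env)
print(res2)
```

So if the quoted list travels, it seems the environment has to travel
with it, or the do.call has to stay in the function where the names
are defined.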




-- 
Paul E. Johnson
Professor, Political Science      Assoc. Director
1541 Lilac Lane, Room 504      Center for Research Methods
University of Kansas                 University of Kansas
http://pj.freefaculty.org               http://quant.ku.edu



