[Rd] Is it a good idea or even possible to redefine attach?

Grant Rettke gcr at wisdomandwonder.com
Sun Aug 10 16:13:55 CEST 2014


Thank you for that pleasant and concise explanation!

I will keep at it.
Grant Rettke | ACM, ASA, FSF, IEEE, SIAM
gcr at wisdomandwonder.com | http://www.wisdomandwonder.com/
“Wisdom begins in wonder.” --Socrates
((λ (x) (x x)) (λ (x) (x x)))
“Life has become immeasurably better since I have been forced to stop
taking it seriously.” --Thompson


On Tue, Aug 5, 2014 at 7:54 PM, Winston Chang <winstonchang1 at gmail.com> wrote:
> On Tue, Aug 5, 2014 at 4:37 PM, Grant Rettke <gcr at wisdomandwonder.com> wrote:
>>
>> That is delightful.
>>
>> When I run it like this:
>> • Start R
>> • Nothing in .Rprofile
>> • Paste in your code
>> ╭────
>> │ gcrenv <- new.env()
>> │ gcrenv$attach.old <- attach
>> │ gcrenv$attach <- function(...){stop("NEVER USE ATTACH")}
>> │ base::attach(gcrenv, name="gcr", warn.conflicts = FALSE)
>> ╰────
>> • I get exactly what is expected, I think
>> ╭────
>> │ search()
>> ╰────
>> ╭────
>> │  [1] ".GlobalEnv"        "gcr"               "ESSR"
>> │  [4] "package:stats"     "package:graphics"  "package:grDevices"
>> │  [7] "package:utils"     "package:datasets"  "package:methods"
>> │ [10] "Autoloads"         "package:base"
>> ╰────
>>
>> Just to be sure:
>> • Is that what is expected?
>> • I am surprised because I thought that `gcr' would come first before
>>   `.GlobalEnv'
>>   • Perhaps I misunderstand, as `.GlobalEnv' is actually the "REPL"?
>>
>> My goal is to move that to my .Rprofile so that it is "always run" and I
>> can forget about it more or less.
>>
>> Reading [this] I felt like `.First' would be the right place to put it,
>> but then read further to find that packages are only loaded /after/
>> `.First' has completed.  Curious, I tried it just to be sure. I am now
>> :).
>>
>> This is the .Rprofile file:
>>
>> ╭────
>> │ cat(".Rprofile: Setting CMU repository\n")
>> │ r = getOption("repos")
>> │ r["CRAN"] = "http://lib.stat.cmu.edu/R/CRAN/"
>> │ options(repos = r)
>> │ rm(r)
>> │ .First <- function() {
>> │    «same code as above»
>> │ }
>> ╰────
>>
>> (I included the repository load, and understand it should not impact
>> things here)
>>
>> This is run after normal startup of R:
>>
>> ╭────
>> │ search()
>> ╰────
>> ╭────
>> │  [1] ".GlobalEnv"        "package:stats"     "package:graphics"
>> │  [4] "package:grDevices" "package:utils"     "package:datasets"
>> │  [7] "gcr"               "package:methods"   "Autoloads"
>> │ [10] "package:base"
>> ╰────
>>
>> When I read this, I read it as:
>> • My rebind of `attach' occurs
>> • Then all of the packages are loaded and they are referring to
>>   my-rebound `attach'
>> • That is a problem because it *will* break package code
>> • Clearly `.Rprofile' is the wrong place for me to put that code.
>>
>
> That order for search path should actually be fine. To understand why,
> you first have to know the difference between the _binding_
> environment for an object, and the _enclosing_ environment for a
> function.
>
> The binding environment is where you can find an object. For example,
> in the global env, you have a bunch of bindings (we often call them
> variables), that point to various objects - vectors, data frames,
> other environments, etc.
>
> The enclosing environment for a function is where the function "runs
> in" when it's called.
>
> Most R objects have just a binding environment (a variable or
> reference that points to the object); functions also have an enclosing
> environment. These two environments aren't necessarily the same.
>
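
A minimal sketch of the binding/enclosing distinction described above, assuming a fresh R session. The names `holder` and `f` are made up for illustration and appear nowhere in the original thread.

```r
# Hypothetical names (holder, f) used only for this sketch.
holder <- new.env()
f <- function() "hello"
holder$f <- f    # bind the same function under a name in a second environment

# Binding environment: where the name can be looked up.
exists("f", envir = holder, inherits = FALSE)      # TRUE

# Enclosing environment: where the function was created. It is unchanged
# by the extra binding in `holder`.
identical(environment(holder$f), environment(f))   # TRUE

# For a package function the two differ: `alarm` is found through the
# package environment, but it is enclosed by the utils namespace.
environmentName(environment(utils::alarm))         # "utils"
```

Binding a function somewhere new does not change where it looks up the names it uses; only its enclosing environment decides that.
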
> When you run search(), it shows the set of environments where R will
> look for an object of a given name, when you run stuff at the console
> (and are in the global env). The trick is that, although you can find
> a function (it is bound) in one of these _package_ environments, that
> function runs in (is enclosed by) a different environment: the
> corresponding _namespace_ environment.
>
> The way that a namespace environment is set up with the arrangement of
> its ancestor environments, it will find the base namespace version of
> `attach` before it finds yours, even if your personal gcr environment
> comes early in the search path.
>
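
One way to see that arrangement of ancestor environments is to walk the parent chain starting from a namespace. This is only a sketch: `env_chain` is a hypothetical helper written for this note, and the exact search-path entries in the middle of the chain depend on what is attached in your session.

```r
# env_chain is a hypothetical helper: it collects the names of an
# environment and all of its ancestors, ending at the empty environment.
env_chain <- function(env) {
  out <- character()
  repeat {
    out <- c(out, environmentName(env))
    if (identical(env, emptyenv())) break
    env <- parent.env(env)
  }
  out
}

# Start from the namespace that encloses functions such as utils::alarm.
chain <- env_chain(asNamespace("utils"))
print(chain)

# Position of the base namespace vs. the global environment in the chain.
match("base", chain)
match("R_GlobalEnv", chain)
```

The base namespace should appear ahead of `R_GlobalEnv` in `chain`, and everything on the search path comes after that, which is why package code reaches base's `attach` before anything you attach yourself.
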
> =========================
> # Here's an example to illustrate. The `utils::alarm` function calls
> # `cat`, which is in base.
>
> alarm
> # function ()
> # {
> #     cat("\a")
> #     flush.console()
> # }
> # <environment: namespace:utils>
>
>
> # Running it makes the screen flash or beep
> alarm()
> # [screen flashes]
>
>
> # We'll put a replacement version of cat in the search path,
> # between utils and base
> my_stuff <- new.env()
> my_stuff$cat <- function(...) stop("Tried to call cat")
> base::attach(my_stuff, pos=length(search()) - 1, name="my_stuff")
>
> search()
> #  [1] ".GlobalEnv"        "tools:rstudio"     "package:stats"
> #  [4] "package:graphics"  "package:grDevices" "package:utils"
> #  [7] "package:datasets"  "package:methods"   "my_stuff"
> # [10] "Autoloads"         "package:base"
>
> # Calling cat from the console gives the error, as expected
> cat()
> # Error in cat() : Tried to call cat
>
> # But when we run alarm(), it still gets the real version of `cat()`,
> # because it finds the original base namespace version of cat
> # before it finds yours.
> alarm()
> # [screen flashes]
>
> ==========================
>
> You can even alter package environments without affecting the
> corresponding namespace environment. The exception to the package and
> namespace environments being distinct is the base environment; change
> one and you change the other. (I just realized this and have to
> retract my earlier statement about the behavior being different if
> you change attach in the base package env vs. the base namespace env.)
>
> -Winston


