[R] Unexpected interference between dplyr and plm

David Winsemius dwinsemius at comcast.net
Tue Nov 29 18:39:42 CET 2016


> On Nov 29, 2016, at 6:52 AM, Sarah Goslee <sarah.goslee at gmail.com> wrote:
> 
> Hi,
> 
> It shouldn't be entirely unexpected: when I load dplyr, I get a series
> of messages telling me that certain functions are masked.
> 
> 
> The following object is masked from ‘package:plm’:
> 
>    between
> 
> The following objects are masked from ‘package:stats’:
> 
>    filter, lag
> 
> The following objects are masked from ‘package:base’:
> 
>    intersect, setdiff, setequal, union
> 
> 
> You can see the search path that R uses when looking for a function or
> other object here:
> 
> In your example, it should look like this:
> 
>> search()
> [1] ".GlobalEnv"        "package:dplyr"     "package:plm"       "package:Formula"
> [5] "package:stats"     "package:graphics"  "package:grDevices" "package:utils"
> [9] "package:datasets"  "package:vimcom"    "package:setwidth"  "package:colorout"
> [13] "package:methods"   "Autoloads"         "package:base"
> 
> 
> So R is searching the local environment, then dplyr, and then farther
> down the list, stats, which is where the lag function comes from (see
> above warning).
> 
> Once you know where the desired function comes from you can specify
> its namespace:

The other option would be to load dplyr first (which would give the warning that stats::lag was masked) and then load plm afterwards (which should give a further warning that dplyr::lag is masked). The plm::lag function will then be found first.
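
The load-order effect described above can be sketched with base R alone. The environments `pkg_a` and `pkg_b` below are hypothetical stand-ins for two packages that both define `lag`; they are not dplyr or plm themselves, and the sketch only illustrates how the search path resolves a name, not what either package actually exports.

```r
# Hypothetical stand-ins for two packages that both define lag().
pkg_a <- new.env()
pkg_b <- new.env()
assign("lag", function(x, ...) "pkg_a version", envir = pkg_a)
assign("lag", function(x, ...) "pkg_b version", envir = pkg_b)

# attach() places each environment at position 2 of the search path,
# as library() does, so the one attached last is searched first.
attach(pkg_a, name = "pkg_a", warn.conflicts = FALSE)
attach(pkg_b, name = "pkg_b", warn.conflicts = FALSE)

first_found <- lag(1:3)        # resolves to the pkg_b definition
search_head <- search()[2:3]   # the two most recently attached entries

# An explicit lookup does not depend on the attach order.
explicit <- get("lag", envir = pkg_a)(1:3)

# Clean up the search path again.
detach("pkg_b")
detach("pkg_a")
```

Swapping the two attach() calls reverses which definition an unqualified `lag()` call finds, while the explicit lookup keeps returning the same function either way.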

-- 
David.
> 
> 
> summary(plm(y~lagx, data = df, index = c("i", "t")))
> summary(plm(y~stats::lag(x, 1), data = df, index = c("i", "t")))
> 
> If you weren't paying attention to the warning messages at package
> load, you can also use the getAnywhere function to find out:
> 
>> getAnywhere(lag)
> 2 differing objects matching ‘lag’ were found
> in the following places
>  package:dplyr
>  package:stats
>  namespace:dplyr
>  namespace:stats
> 
> 
> Sarah
> 
> 
> On Tue, Nov 29, 2016 at 9:36 AM, Constantin Weiser <weiserc at hhu.de> wrote:
>> Hello,
>> 
>> I'm struggling with an unexpected interference between the two packages
>> dplyr and plm, or to be more concrete with the "lag(x, ...)" function of
>> both packages.
>> 
>> If dplyr is attached, the plm function no longer uses the appropriate
>> lag() function, which accounts for the panel structure.
>> 
>> The following code demonstrates the unexpected behaviour:
>> 
>> ## starting from a new R-Session (plm and dplyr unloaded) ##
>> 
>>  ## generate dataset
>>  set.seed(4711)
>>  df <- data.frame(
>>          i = rep(1:10, each = 4),
>>          t = rep(1:4, times = 10),
>>          y = rnorm(40),
>>          x = rnorm(40)
>>  )
>>  ## manually generated lagged variable
>>  df$lagx <- c(NA, df$x[-40])
>>  df$lagx[df$t == 1] <- NA
>> 
>> 
>> require(plm)
>> summary(plm(y~lagx, data = df, index = c("i", "t")))
>> summary(plm(y~lag(x, 1), data = df, index = c("i", "t")))
>> # > this result is expected
>> 
>> require(dplyr)
>> summary(plm(y~lagx, data = df, index = c("i", "t")))
>> summary(plm(y~lag(x, 1), data = df, index = c("i", "t")))
>> # > this result is unexpected
>> 
>> Is there a way to force R to use the "correct" lag-function? (or at the
>> devel-level to harmonise both functions)
>> 
>> Thank you very much in advance for your answer
>> 
>> Yours
>> Constantin
>> 
> 
> -- 
> Sarah Goslee
> http://www.functionaldiversity.org
> 

David Winsemius
Alameda, CA, USA


