[R] Unexpected interference between dplyr and plm
David Winsemius
dwinsemius at comcast.net
Tue Nov 29 18:39:42 CET 2016
> On Nov 29, 2016, at 6:52 AM, Sarah Goslee <sarah.goslee at gmail.com> wrote:
>
> Hi,
>
> It shouldn't be entirely unexpected: when I load dplyr, I get a series
> of messages telling me that certain functions are masked.
>
>
> The following object is masked from ‘package:plm’:
>
> between
>
> The following objects are masked from ‘package:stats’:
>
> filter, lag
>
> The following objects are masked from ‘package:base’:
>
> intersect, setdiff, setequal, union
>
>
> You can see the search path that R uses when looking for a function or
> other object with search().
>
> In your example, it should look like this:
>
>> search()
>  [1] ".GlobalEnv"        "package:dplyr"     "package:plm"       "package:Formula"
>  [5] "package:stats"     "package:graphics"  "package:grDevices" "package:utils"
>  [9] "package:datasets"  "package:vimcom"    "package:setwidth"  "package:colorout"
> [13] "package:methods"   "Autoloads"         "package:base"
>
>
> So R searches the global environment, then dplyr, where it finds
> dplyr's lag and stops, before it ever reaches stats farther down the
> list, which is where the lag function you want comes from (see the
> masking messages above).
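
A quick way to confirm which lag a bare call will pick up is to ask R directly. This is a minimal sketch using only base R; the comments describe what to look for with dplyr attached, not guaranteed output for any particular session.

```r
# Entries on the search path that define an object named "lag", in lookup
# order. With dplyr attached as above, "package:dplyr" should be listed
# before "package:stats".
hits <- find("lag")
print(hits)

# Name of the environment (normally a package namespace) that owns the
# function a bare lag() call resolves to from the prompt.
owner <- environmentName(environment(lag))
print(owner)
```

If the first entry of hits is not the package you expected, the call needs to be qualified or the load order changed.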
>
> Once you know where the desired function comes from you can specify
> its namespace:
The other option would be to load dplyr first (which would give the warning that stats::lag was masked) and then later load plm (which should give a further warning that dplyr::lag is masked). Then the plm::lag function will be found first.
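
The load-order rule can be reproduced without either package. The sketch below is only an illustration: pkg_first and pkg_second are made-up stand-ins attached with base R's attach(), not real packages. Whether the same trick helps with plm depends on the installed plm version attaching a lag of its own (the load messages quoted above list only `between` as masked from plm), so it is worth running find("lag") after loading both.

```r
# Two stand-ins for packages that both define lag(); the names are hypothetical.
pkg_first  <- list(lag = function(x, ...) "lag from pkg_first")
pkg_second <- list(lag = function(x, ...) "lag from pkg_second")

# attach() inserts at position 2 of the search path, so whatever is
# attached last is searched first.
attach(pkg_first,  name = "pkg_first",  warn.conflicts = FALSE)
attach(pkg_second, name = "pkg_second", warn.conflicts = FALSE)

order_found <- find("lag")             # search-path order for "lag"
bare_call   <- lag(1:3)                # uses the most recently attached lag
qualified   <- stats::lag(ts(1:3), 1)  # :: bypasses the search path entirely

print(order_found)
print(bare_call)

# Clean up so the stand-ins do not linger on the search path.
detach("pkg_second", character.only = TRUE)
detach("pkg_first",  character.only = TRUE)
```

The qualified call still reaches stats::lag regardless of what has been attached, which is why naming the namespace is the more robust of the two options.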
--
David.
>
>
> summary(plm(y~lagx, data = df, index = c("i", "t")))
> summary(plm(y~stats::lag(x, 1), data = df, index = c("i", "t")))
>
> If you weren't paying attention to the warning messages at package
> load, you can also use the getAnywhere function to find out:
>
>> getAnywhere(lag)
> 2 differing objects matching ‘lag’ were found
> in the following places
> package:dplyr
> package:stats
> namespace:dplyr
> namespace:stats
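
Another way to recover the masking information after the fact, sketched here with base R's conflicts(), is to list every name that is defined in more than one place on the search path. The filtering step and the variable names are just for illustration.

```r
# All names defined in more than one attached entry, grouped by entry.
masked <- conflicts(detail = TRUE)

# Keep only the entries that define "lag". With dplyr attached this should
# include both package:dplyr and package:stats; in a session without any
# masking of lag it is empty.
has_lag <- vapply(masked, function(nms) "lag" %in% nms, logical(1))
lag_entries <- names(masked)[has_lag]
print(lag_entries)
```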
>
>
> Sarah
>
>
> On Tue, Nov 29, 2016 at 9:36 AM, Constantin Weiser <weiserc at hhu.de> wrote:
>> Hello,
>>
>> I'm struggling with an unexpected interference between the two packages
>> dplyr and plm, or to be more concrete with the "lag(x, ...)" function of
>> both packages.
>>
>> If dplyr is attached, the plm function no longer uses the appropriate
>> lag() function, which accounts for the panel structure.
>>
>> The following code demonstrates the unexpected behaviour:
>>
>> ## starting from a new R-Session (plm and dplyr unloaded) ##
>>
>> ## generate dataset
>> set.seed(4711)
>> df <- data.frame(
>>   i = rep(1:10, each = 4),
>>   t = rep(1:4, times = 10),
>>   y = rnorm(40),
>>   x = rnorm(40)
>> )
>> ## manually generated lagged variable
>> df$lagx <- c(NA, df$x[-40])
>> df$lagx[df$t == 1] <- NA
>>
>>
>> require(plm)
>> summary(plm(y~lagx, data = df, index = c("i", "t")))
>> summary(plm(y~lag(x, 1), data = df, index = c("i", "t")))
>> # > this result is expected
>>
>> require(dplyr)
>> summary(plm(y~lagx, data = df, index = c("i", "t")))
>> summary(plm(y~lag(x, 1), data = df, index = c("i", "t")))
>> # > this result is unexpected
>>
>> Is there a way to force R to use the "correct" lag-function? (or at the
>> devel-level to harmonise both functions)
>>
>> Thank you very much in advance for your answer
>>
>> Yours
>> Constantin
>>
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius
Alameda, CA, USA