[BioC] IRanges: interaction with model fitting
Kasper Daniel Hansen
kasperdanielhansen at gmail.com
Sat Aug 21 21:54:16 CEST 2010
Martin
Thanks a lot for the diagnose, it was really helpful. While I have
not investigated this in detail, this sugests that any model fitting
package without a NAMESPACE will be adversely affected by IRanges.
It seems that a (fairly) standard use of eval is broken by this
(again, assuming no NAMESPACE). And the error message is pretty
arcane and it took me a while to suspect IRanges. And eval is
(correctly I guess) not listed as one of the masked functions when you
load IRanges.
While I think it makes sense to have NAMESPACEs everywhere, this will
probably not be realistic. How important is it to have the function
be called eval? Would it make sense to use some inspiration from
cbind2 and use the name eval2? That might be a more workable fix -
again that depends on how important it is to use the name eval (which
I have not investigated).
Thanks again,
Kasper
On Fri, Aug 20, 2010 at 8:32 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> On 08/20/2010 02:38 PM, Kasper Daniel Hansen wrote:
>> locfit is a package from CRAN for doing local regression (think
>> loess). It does not have a NAMESPACE, and it uses what to my casual
>> eyes look like a pretty standard model setup. But there is a strange
>> thing happening when IRanges is loaded. An example is best:
>>
>> First get some data and define two functions that essentially does the same:
>>
>> library(locfit)
>> utils::data(anorexia, package = "MASS")
>> myData <- anorexia
>> tmp <- function(dat) {
>> anorex.1 <- glm(Postwt ~ lp(Prewt),
>> family = "gaussian", data = dat)
>> }
>> tmp1 <- function(dat1) {
>> tmp2 <- function(dat2) {
>> anorex.1 <- locfit(Postwt ~ lp(Prewt),
>> family = gaussian, data = dat2)
>>
>> }
>> tmp2(dat2 = dat1)
>> }
>>
>> tmp(myData)
>> tmp1(myData)
>>
>> This works great. Now load IRanges and do the same
>>
>> library(IRanges)
>>> tmp(myData)
>> Error in terms.formula(formula, data = data) :
>> 'data' argument is of the wrong type
>>
>> Enter a frame number, or 0 to exit
>>
>> 1: tmp(myData)
>> 2: locfit(Postwt ~ lp(Prewt), family = "gaussian", data = dat)
>> 3: eval(m, sys.frame(sys.parent()))
>> 4: eval(m, sys.frame(sys.parent()))
>> 5: eval(expr, envir, enclos)
>> 6: model.frame(formula = Postwt ~ lp(Prewt), data = dat)
>> 7: model.frame.default(formula = Postwt ~ lp(Prewt), data = dat)
>> 8: terms(formula, data = data)
>> 9: terms.formula(formula, data = data)
>>
>> Selection: 0
>>> tmp1(myData)
>> Error in inherits(x, "data.frame") : object 'dat2' not found
>>
>> Enter a frame number, or 0 to exit
>>
>> 1: tmp1(myData)
>> 2: tmp2(dat2 = dat1)
>> 3: locfit(Postwt ~ lp(Prewt), family = gaussian, data = dat2)
>> 4: eval(m, sys.frame(sys.parent()))
>> 5: eval(m, sys.frame(sys.parent()))
>> 6: eval(expr, envir, enclos)
>> 7: model.frame(formula = Postwt ~ lp(Prewt), data = dat2)
>> 8: model.frame.default(formula = Postwt ~ lp(Prewt), data = dat2)
>> 9: is.data.frame(data)
>> 10: inherits(x, "data.frame")
>>
>> Selection: 0
>>
>> Clearly something seems wrong. The error messages for tmp and tmp1
>> are different. Perhaps the lacking NAMESPACE for locfit makes a
>> difference. And locfit sets up the design matrix using a slightly
>> different set of commands, that to my eye mostly look like an older
>> paradigm.
>>
>> Any idea on what happens and why this is affected by IRanges?
>
> It traces to IRanges' promotion of eval to a generic. A workaround is
>
> library(IRanges)
> eval = base::eval
>
> Creating the generic changes the value of 'envir' (the lazy evaluation
> of sys.frame(sys.parent())) in the call, as shown with
>
>> trace(eval, signature=c("ANY", "ANY"), tracer=quote(print(ls(envir))))
>> tmp1(anorexia) ## 'locfit' envir
> Tracing eval(m, sys.frame(sys.parent())) on entry
> [1] "base" "cens" "data" "formula" "geth" "i" "lfproc"
> [8] "m" "subset" "Terms" "weights" "z"
> Error in inherits(x, "data.frame") : object 'dat2' not found
>
> verus
>
>> eval = base::eval
>> trace(eval, tracer=quote(print(ls(envir))))
> [1] "eval"
>> tmp1(anorexia) ## 'tmp2' envir
> Tracing eval(m, sys.frame(sys.parent())) on entry
> [1] "dat2"
>
> I don't think there is a robust way around this, but maybe others have a
> good idea?
>
> Martin
>>
>> Kasper
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> --
> Martin Morgan
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793
>
More information about the Bioconductor
mailing list