[Bioc-devel] [Fwd: Re: ifnotfound in mget]

Francois Pepin fpepin at cs.mcgill.ca
Fri Sep 1 21:06:22 CEST 2006


> > is there a particular reason why there is no "ifnotfound = NA" argument
> > in most bioconductor functions regarding the chip annotations
> > (findLargest in Category being one example)?
> 
> Isn't findLargest in the genefilter package?  Anyhow, I'll venture an
> answer regarding ifnotfound behavior.

Yes. I also work with R 2.1, where it was still in Category.

> > For example, hgug4112a (human whole genome array) has 43931
> > features, while the annotation package knows about 41000. The missing
> > ones include the control probes as well as some truly obscure probes
> > that are almost unannotatable.

As a note, I was mistaken and the 41k in the annotation package is
indeed correct. There are 41000 unique non-control probes on the chip.
So the issue here is only with control probes and not nearly as serious
as I thought.

> While frustrating, doing careful filtering is an important step of
> many analyses.  And for findLargest, I guess you are asking for
> 'na.rm', not ifnotfound.

Yes, filtering is essential for many analyses. Not doing any can be a
useful sanity check in many cases, for example in differential
expression or looking for batch effects.

The issue with ifnotfound is that its default value calls stop() as soon
as an unknown probe is in the mix:
> args(mget)
function (x, envir, mode = "any",
    ifnotfound = list(function(x)
        stop(paste("value for '", x, "' not found", sep = ""),
             call. = FALSE)),
    inherits = FALSE)
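
To make the difference concrete, here is a minimal sketch using a plain
environment as a hypothetical stand-in for an annotation environment. The
probe names and Entrez Gene IDs are made up for illustration and do not come
from any real annotation package.

```r
# Hypothetical stand-in for a probe -> Entrez Gene annotation environment.
annot <- new.env()
assign("probe_1", "1001", envir = annot)
assign("probe_2", "1002", envir = annot)

# "control_1" plays the role of a control probe missing from the annotation.
probes <- c("probe_1", "control_1", "probe_2")

# Without ifnotfound, mget() signals an error at the first unknown name.
# tryCatch() turns that error into its message so the script keeps running.
res_default <- tryCatch(
  mget(probes, envir = annot),
  error = function(e) conditionMessage(e)
)

# With ifnotfound = NA, unknown names simply get NA in the result list.
res_na <- mget(probes, envir = annot, ifnotfound = NA)
```

With ifnotfound = NA, the control probe looks like any other probe that does
not map to anything, which is the behaviour I am asking for.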

findLargest already handles probes with no matching Entrez Gene ID quite
well, so I do not see how adding na.rm would help in this case.

I would like the control probes to be treated as probes that do not map to
anything, instead of as evil things that make code crash.

I'd agree it's probably not very high on the list of priorities, as the
workaround is pretty trivial (i.e. filter out the control probes).
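
For reference, the workaround can be sketched as below. Again, the
environment and probe names are hypothetical placeholders, not taken from a
real annotation package; the idea is only to drop names the environment does
not know before calling mget().

```r
# Hypothetical stand-in for a probe -> Entrez Gene annotation environment.
annot <- new.env()
assign("probe_1", "1001", envir = annot)
assign("probe_2", "1002", envir = annot)

# Probe list from the chip, including a control probe with no annotation.
probes <- c("probe_1", "control_1", "probe_2")

# Keep only the probes that exist as keys in the annotation environment.
known <- probes[probes %in% ls(annot, all.names = TRUE)]

# mget() no longer hits an unknown name, so the default ifnotfound is never used.
res <- mget(known, envir = annot)
```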

Francois


