[Bioc-devel] RFC: eSet with two color data
Seth Falcon
sfalcon at fhcrc.org
Mon Mar 26 16:19:10 CEST 2007
Wolfgang Huber <huber at ebi.ac.uk> writes:
> Thanks, these are good points. Both options are equivalent, it seems.
> Either would work, and if there is a volunteer to implement B, that
> would be great.
In short, Martin Morgan has some fairly concrete ideas (even code) for
an option B variant. We decided it is too close to the release to put
this in now, but will add it to devel as soon as the release branch is
cut (+/- Martin's time to get to it).
> Just to note - computational efficiency is very important, but I don't
> think that this current question is one of the bottlenecks in the
> overall workflows, so an investment here may not bring many returns:
>
> Computing the log-Ratios is an important operation, but typically this
> is done once in the lifetime of a dataset, and perhaps the best way to
> think of this is to have a function logRatio() that takes a two-colour
> ExpressionSet and returns an M-value ExpressionSet (similar for
> log-Product). The computational overhead of doing something like
>
> x[,idxGreen] - x[,idxRed]
>
> versus
>
> x[[1]] - x[[2]]
>
> once or a few times is not large, compared to many other things we do
> with ExpressionSets, and I don't think it would by itself justify a
> lot of new infrastructure.
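To make the two expressions quoted above concrete, here is a minimal sketch in base R using plain matrices rather than real ExpressionSet objects. The toy values, the side-by-side column layout, the list-of-matrices layout, and the body of logRatio() are all assumptions for illustration; only the name logRatio() and the two subtraction expressions come from the thread.

```r
# Toy log2 intensities: 3 features x 2 arrays (values are made up).
green <- matrix(c(8, 9, 10, 7, 9, 11), nrow = 3,
                dimnames = list(paste0("f", 1:3), c("a1", "a2")))
red   <- matrix(c(7, 9, 12, 7, 8, 10), nrow = 3,
                dimnames = dimnames(green))

# Assumed layout 1: both channels side by side in a single matrix,
# with index vectors recording which columns belong to which channel.
combined <- cbind(green, red)
idxGreen <- 1:2
idxRed   <- 3:4
mCombined <- combined[, idxGreen] - combined[, idxRed]

# Assumed layout 2: one matrix per channel, held in a list.
channels <- list(green = green, red = red)
mList <- channels[[1]] - channels[[2]]

# Hypothetical logRatio() helper for layout 2; a real version would
# take a two-colour ExpressionSet and return an M-value ExpressionSet.
logRatio <- function(ch) ch[[1]] - ch[[2]]

# Both layouts yield the same matrix of M-values.
identical(mCombined, mList)
identical(logRatio(channels), mList)
```

Either way the subtraction itself is a single vectorised operation; the layouts differ mainly in the bookkeeping needed to keep the channels paired.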
Thinking through option A, it also requires infrastructure because it
breaks the subsetting model in the other direction. If a user does:
x[1:3, ]
what should happen? With the current code, they would get something
that would not be a valid n-colour set and probably would not be
desired in this context. Since both options require some subset
coding, I think going for the option that suggests more efficiency
is best.
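The subsetting concern can be sketched as follows. The message does not spell out how option A lays out its data, so the entry table and the validity rule below are hypothetical: each entry is assumed to carry an array label and a channel label, and a set counts as valid only if every array present has both channels.

```r
# Hypothetical layout: four hybridisations stored as eight entries,
# each labelled with the array it came from and its channel.
entries <- data.frame(
  array   = rep(paste0("array", 1:4), each = 2),
  channel = rep(c("green", "red"), times = 4)
)

# Hypothetical validity rule: every array must contribute exactly
# one green and one red entry.
isValidTwoColour <- function(e) {
  counts <- table(e$array, e$channel)
  all(counts == 1)
}

full   <- isValidTwoColour(entries)          # all arrays are paired
subset <- isValidTwoColour(entries[1:3, ])   # array2 loses its red entry
```

Under these assumptions a plain positional subset splits a green/red pair, which is why a channel-aware subsetting method would be needed whichever option is chosen.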
+ seth
--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org