[Bioc-devel] RFC: eSet with two color data

Seth Falcon sfalcon at fhcrc.org
Mon Mar 26 16:19:10 CEST 2007


Wolfgang Huber <huber at ebi.ac.uk> writes:

> Thanks, these are good points. Both options are equivalent, it seems;
> that would work, and if there is a volunteer to implement B, that would
> be great.

In short, Martin Morgan has some fairly concrete ideas (even code) for
an option B variant.  We decided it is too close to the release to put
this in now, but will add it to devel as soon as the release branch is
cut (+/- Martin's time to get to it).

> Just to note - computational efficiency is very important, but I don't
> think that this current question is one of the bottlenecks in the
> overall workflows, so an investment here may not bring many returns:
>
> Computing the log-Ratios is an important operation, but typically this
> is done once in the lifetime of a dataset, and perhaps the best way to
> think of this is to have a function logRatio() that takes a two-colour
> ExpressionSet and returns an M-value ExpressionSet (similar for
> log-Product). The computational overhead of doing something like
>
> 	x[,idxGreen] - x[,idxRed]
>
> versus
>
> 	x[[1]] - x[[2]]
>
> once or a few times is not large compared to many other things we do
> with ExpressionSets, and I don't think it would by itself justify a lot
> of new infrastructure.
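
To make the two quoted expressions concrete, here is a minimal sketch on plain base R matrices rather than real ExpressionSet objects. The toy data, the two layouts, the values of idxGreen and idxRed, and the body of logRatio() are all assumptions made for illustration; nothing here is an existing Biobase function or the layout either option would actually use.

```r
# Hypothetical toy data: 4 features, 2 arrays, log2 intensities per channel.
set.seed(1)
green <- matrix(log2(runif(8, 100, 1000)), nrow = 4,
                dimnames = list(paste0("f", 1:4), c("a1", "a2")))
red   <- matrix(log2(runif(8, 100, 1000)), nrow = 4,
                dimnames = dimnames(green))

# Assumed layout 1: one wide matrix, channels told apart by column indices.
xWide    <- cbind(green, red)
idxGreen <- 1:2
idxRed   <- 3:4

# Assumed layout 2: a list holding one matrix per channel.
xList <- list(green, red)

# Hypothetical helper, named after the logRatio() idea in the quoted text.
logRatio <- function(g, r) g - r

mWide <- logRatio(xWide[, idxGreen], xWide[, idxRed])
mList <- logRatio(xList[[1]], xList[[2]])

# Both layouts yield the same matrix of M-values.
stopifnot(isTRUE(all.equal(mWide, mList)))
```

Under these assumptions the two forms differ only in how the channel is looked up, not in the arithmetic.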

Thinking through option A, it also requires infrastructure because it
breaks the subsetting model in the other direction.  If a user does:

    x[1:3, ]

what should happen? With the current code, they would get something
that is not a valid n-colour set and is probably not what they wanted.
Since both options require some custom subsetting code, I think going
for the option that promises better efficiency is best.
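
As a rough illustration of what "subset coding" could look like, the sketch below applies the same row and column indices to every channel, so the channels stay aligned after subsetting. It uses a plain list of matrices; the container and the subsetChannels() helper are hypothetical and are not tied to either option discussed in this thread or to any existing Biobase method.

```r
# Hypothetical two-channel container: a list with one matrix per channel.
set.seed(2)
dn <- list(paste0("f", 1:4), paste0("a", 1:3))
chan <- list(
  green = matrix(rnorm(12), nrow = 4, dimnames = dn),
  red   = matrix(rnorm(12), nrow = 4, dimnames = dn)
)

# Hypothetical helper: subset every channel with the same indices.
# drop = FALSE keeps each result a matrix even for a single row or column.
subsetChannels <- function(x, i, j) {
  lapply(x, function(m) m[i, j, drop = FALSE])
}

sub <- subsetChannels(chan, 1:3, 1:2)

# Every channel keeps the same features and arrays.
stopifnot(identical(dimnames(sub$green), dimnames(sub$red)))
```

A real implementation would also have to keep phenotype and feature annotation in step with the assay data, which this sketch ignores.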

+ seth

-- 
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org


