[Bioc-devel] RFC: xy2i and i2xy in *cdf packages
Seth Falcon
sfalcon at fhcrc.org
Wed Apr 11 22:25:22 CEST 2007
Kasper Daniel Hansen <khansen at stat.Berkeley.EDU> writes:
> On Apr 11, 2007, at 11:55 AM, Seth Falcon wrote:
>
>> Kasper Daniel Hansen <khansen at stat.Berkeley.EDU> writes:
>>> How about putting them into a namespace and not exporting them (that might
>>> be what Jim is thinking of). There is also the small matter that the
>>> chip dimensions are already stored in the AffyBatch objects, so now they
>>> will be stored twice... this opens up some consistency issues. But
>>> then again, that will likely not be a problem.
>>
>> I think some of the fuss over the name space and masking issues is a
>> bit misguided. The whole point of name spaces is to allow packages to
>> define symbols with the same name and give users and package
>> developers a nice way to disambiguate. At some point, we will need to
>> grow up and use these mechanisms.
>>
>> I think we should proceed as follows for the upcoming release:
>>
>> 1. Add deprecation warnings to xy2i and i2xy that are defined in the
>> cdf packages. The message should tell users to use the functions
>> available in the affy package instead.
>>
>> 2. Add dimension info to the cdf packages. This should have been
>> there in the first place. To avoid further whining about name
>> space issues, I propose that we use a special name in the <chip>cdf
>> environment object. Something like:
>>
>> hgu95av2cdf[["CHIP_DIMS"]]
>>
>> This avoids symbol collision at the package level and it seems
>> fairly safe to bet that there will not be any probe set IDs named
>> "CHIP_DIMS".
>
> I would not do this. A major use of the cdf environment is to do what
> is essentially an apply over get(NAME, hgu95av2cdf). I would expect
> that introducing a new "virtual" probeset would break
> a lot of existing code. I know I have quite a bit of existing code
> that would break if I needed to do something like "do this for all
> probesets, but start by removing one which is special".
Very good point.
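To make the concern concrete, here is a minimal sketch using a toy environment rather than the real hgu95av2cdf. The probe set IDs, the index values, the 640 x 640 dimensions, and the helper count_pairs are all made up for illustration; only the "CHIP_DIMS" key comes from the proposal above.

```r
# Toy stand-in for a <chip>cdf environment: each probe set ID maps to a
# two-column matrix of PM/MM indices. All names and values are invented.
toycdf <- new.env()
assign("1000_at", cbind(pm = c(1L, 2L), mm = c(641L, 642L)), envir = toycdf)
assign("1001_at", cbind(pm = c(3L, 4L), mm = c(643L, 644L)), envir = toycdf)

# Typical "apply over all probe sets" code (hypothetical helper):
# count the probe pairs in each probe set.
count_pairs <- function(env) {
    sapply(ls(env), function(id) nrow(get(id, envir = env)))
}
before <- count_pairs(toycdf)

# Add a special non-probe-set entry, as in the CHIP_DIMS proposal.
# The dimensions here are an assumption used only for illustration.
assign("CHIP_DIMS", c(nrow = 640L, ncol = 640L), envir = toycdf)

# nrow() of a plain vector is NULL, so sapply() can no longer simplify
# its result to a vector and returns a list instead.
after <- count_pairs(toycdf)

# Existing code would have to filter the special entry out explicitly.
ids <- setdiff(ls(toycdf), "CHIP_DIMS")
```

In this sketch the same loop returns an integer vector before the extra entry is added and a list afterwards, which is the kind of silent change that would ripple through downstream code.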
> What about just putting in a new data object in the CDF package, perhaps
> hgu95av2dim
> This could be a vector or a list with nrow/ncol components.
>
> Or perhaps, to hint at other metadata (why not add things like species
> name and so on), use
> hgu95av2chipmetadata
> Then we could always subsequently add species names, etc.
Sure. Since we already have hgu95av2cdf, I think I prefer just adding
what we need for now: hgu95av2dim.
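A rough sketch of how such an object might look and be used, assuming a named integer vector with nrow/ncol components, zero-based (x, y) coordinates, and one-based indices that run along x first. The 640 x 640 values and the helper names xy2i_dims and i2xy_dims are assumptions for illustration, not existing functions in affy or the cdf packages.

```r
# Hypothetical stand-in for the proposed data object; the real values
# would ship with the CDF package.
hgu95av2dim <- c(nrow = 640L, ncol = 640L)

# Convert zero-based (x, y) chip coordinates to a one-based linear index,
# assuming indices increase along x first (x is the column).
xy2i_dims <- function(x, y, dims) {
    y * dims[["ncol"]] + x + 1
}

# Inverse mapping: one-based linear index back to zero-based (x, y).
i2xy_dims <- function(i, dims) {
    cbind(x = (i - 1) %% dims[["ncol"]],
          y = (i - 1) %/% dims[["ncol"]])
}

idx <- xy2i_dims(x = 3, y = 2, dims = hgu95av2dim)
xy  <- i2xy_dims(idx, hgu95av2dim)
```

Passing the dimensions in explicitly means a single pair of conversion functions could serve every chip type, instead of each cdf package carrying its own copy.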
Longer term I think there is a good possibility that we will put the
cdf and probe data together with the annotation into a single
SQLite-based package with a happier OOPy design, etc.
> To add complete confusion I would like to point out that from a
> certain perspective, metadata for the chip should not really be put
> into the CDF package. We already have several CDF packages for a
> given Affy chip, but the metadata is really consistent across these
> packages. A better place would be the probe packages since that info
> is supposed to be probeset-definition independent. Of course, using
> the probe package would be a major change, so I am not sure it is a
> practical solution (also, I believe some chips do not have a probe
> package).
Which may be an argument against putting everything together in the
future... or just something we need to keep in mind.
+ seth
--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org