[Bioc-devel] RFC: xy2i and i2xy in *cdf packages

Seth Falcon sfalcon at fhcrc.org
Wed Apr 11 22:25:22 CEST 2007


Kasper Daniel Hansen <khansen at stat.Berkeley.EDU> writes:

> On Apr 11, 2007, at 11:55 AM, Seth Falcon wrote:
>
>> Kasper Daniel Hansen <khansen at stat.Berkeley.EDU> writes:
>>> How about putting them into a namespace and not exporting them (that might
>>> be what Jim is thinking of). There is also the little thing that the
>>> chip dimensions are also stored in the AffyBatch objects so now they
>>> will be stored twice... this opens up some consistency issues. But
>>> then again, that will likely not be a problem.
>>
>> I think some of the fuss over the name space and masking issues is a
>> bit misguided.  The whole point of name spaces is to allow packages to
>> define symbols with the same name and give users and package
>> developers a nice way to disambiguate.  At some point, we will need to
>> grow up and use these mechanisms.
>>
>> I think we should proceed as follows for the upcoming release:
>>
>> 1. Add deprecation warnings to xy2i and i2xy that are defined in the
>>    cdf packages.  The message should tell users to use the functions
>>    available in the affy package instead.
>>
>> 2. Add dimension info to the cdf packages.  This should have been
>>    there in the first place.  To avoid further whining about name
>>    space issues, I propose that we use a special name in the <chip>cdf
>>    environment object.  Something like:
>>
>>        hgu95av2cdf[["CHIP_DIMS"]]
>>
>>    This avoids symbol collision at the package level and it seems
>>    fairly safe to bet that there will not be any probe set IDs named
>>    "CHIP_DIMS".
>
> I would not do this. A major use of the cdf environment is to do what
> is essentially an apply over get(NAME, hgu95av2cdf). I would expect
> that introducing a new "virtual" probeset would break
> a lot of existing code. I know I have quite a bit of existing code
> that would break if I needed to do something like "do this for all
> probesets, but start by removing one which is special".

Very good point.
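
To make the concern concrete, here is a minimal sketch using a toy
environment as a stand-in for a <chip>cdf environment. The probe set
names, index values, and the count_pm helper are all made up for
illustration; nothing here is taken from an actual cdf package.

```r
# Toy stand-in for a <chip>cdf environment (hypothetical contents).
toycdf <- new.env()
assign("1000_at", cbind(pm = c(1L, 2L), mm = c(3L, 4L)), envir = toycdf)
assign("1001_at", cbind(pm = c(5L, 6L), mm = c(7L, 8L)), envir = toycdf)

# Typical "apply over every probe set" code: number of probe pairs each.
count_pm <- function(env) {
  sapply(ls(env), function(nm) nrow(get(nm, envir = env)))
}
before <- count_pm(toycdf)  # named integer vector, one entry per probe set

# Add the special entry proposed above (dimension values are illustrative).
assign("CHIP_DIMS", c(nrow = 640L, ncol = 640L), envir = toycdf)

# The same loop now also visits "CHIP_DIMS". nrow() of a plain vector is
# NULL, so sapply() can no longer simplify and returns a list instead.
after <- count_pm(toycdf)
```

Any caller that expects one well-formed result per probe set would have
to filter out the special name first.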

> What about just putting in a new data object in the CDF package, perhaps
>   hgu95av2dim
> This could be a vector or a list with nrow/ncol components.
>
> Or perhaps to hint at other meta data (why not add stuff like species
> name and so on), and use
>   hgu95av2chipmetadata
> Then we could always subsequently add species names, etc.

Sure.  Since we already have hgu95av2cdf, I think I prefer just adding
what we need for now: hgu95av2dim.
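
A rough sketch of how such an object might look and be used, assuming a
named integer vector. The dimension values, the helper names
xy_to_index and index_to_xy, and the 1-based x/y convention are
assumptions made for illustration, not the existing xy2i/i2xy code.

```r
# Hypothetical data object that would ship next to hgu95av2cdf.
# The 640 x 640 values are illustrative.
hgu95av2dim <- c(nrow = 640L, ncol = 640L)

# Generic converters that take the chip dimensions as an argument instead
# of hard-coding them per package. x indexes columns, y indexes rows,
# both assumed to start at 1.
xy_to_index <- function(x, y, dim) {
  x + dim[["ncol"]] * (y - 1L)
}
index_to_xy <- function(i, dim) {
  cbind(x = (i - 1L) %% dim[["ncol"]] + 1L,
        y = (i - 1L) %/% dim[["ncol"]] + 1L)
}

i <- xy_to_index(3L, 2L, hgu95av2dim)
xy <- index_to_xy(i, hgu95av2dim)  # one-row matrix with columns x and y
```

Because the object lives outside the cdf environment, code that loops
over ls(hgu95av2cdf) is unaffected.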

Longer term I think there is a good possibility that we will put the
cdf and probe data together with the annotation into a single
SQLite-based package with a happier OOPy design, etc.

> To add complete confusion I would like to point out that from a
> certain perspective, metadata for the chip should not really be put
> into the CDF package. We already have several CDF packages for a
> given Affy chip, but the metadata is really consistent across these
> packages. A better place would be the probe packages since that info
> is supposed to be probeset-definition independent. Of course, using
> the probe package would be a major change, so I am not sure it is a
> practical solution (also, I believe some chips do not have a probe
> package).

Which may be an argument against putting everything together in the
future... or just something we need to keep in mind.

+ seth

-- 
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org


