[BioC] The metadata in an Affymetrix CDF
Steve Lianoglou
mailinglist.honeypot at gmail.com
Tue May 27 18:34:29 CEST 2008
Hello Bioconductors,
I've made a CDF using the makePlatformDesign package for a Drosophila
tiling array, and I'm curious whether anyone can point me to documentation
that explains what some of the CDF columns refer to.
For instance, the resulting cdf file/package has two columns that I'm
not sure how to interpret.
1) "feature_ID" : This is a vector of ints. It looks like it is meant
to group the perfect match and mismatch pairs together as a "unit."
> head(cdf$feature_ID)
[1] 1 1 535 535 1781830 1781830
Constructing a table of the feature_IDs shows that every element in this
vector has a frequency of 2, which reinforces my suspicion that it just
groups perfect match/mismatch pairs.
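For reference, the frequency check described above can be sketched roughly as follows. The `cdf` data frame here is only a toy stand-in built from the six values printed above (an assumption for illustration); the real object has many more rows and columns.

```r
# Toy stand-in for the real CDF object (hypothetical; only the six
# feature_ID values shown in the head() output above are used).
cdf <- data.frame(feature_ID = c(1, 1, 535, 535, 1781830, 1781830))

# Count how many times each feature_ID occurs.
id_counts <- table(cdf$feature_ID)

# Tabulate the counts themselves: if the only entry is named "2",
# every feature_ID appears exactly twice.
count_distribution <- table(id_counts)
print(count_distribution)

# TRUE when every feature_ID occurs exactly twice.
all_pairs <- all(id_counts == 2)
print(all_pairs)
```

If `all_pairs` is TRUE on the full object, that is consistent with (but does not prove) the idea that each feature_ID labels one perfect match/mismatch pair.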
2) "feature_set_name" : It also looks like this is performing some
grouping function, although the frequency of the elements in "feature
sets" varies from 1 to 10. Are these feature_sets just "probe
sets" (grouping a bunch of probes to a single transcript), or ... ?
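The same kind of tabulation gives the spread of group sizes for feature_set_name. This is a minimal sketch on made-up data: the set names "setA", "setB" and "setC" are hypothetical placeholders, not values from the actual CDF.

```r
# Toy stand-in with hypothetical feature set names.
cdf <- data.frame(
  feature_set_name = c("setA", "setB", "setB", "setB", "setC", "setC"),
  stringsAsFactors = FALSE
)

# Number of rows (features) belonging to each feature set.
set_sizes <- table(cdf$feature_set_name)

# Smallest and largest set size.
size_range <- range(as.vector(set_sizes))
print(size_range)

# How many sets there are of each size.
size_distribution <- table(as.vector(set_sizes))
print(size_distribution)
```

On the real object, `size_range` would show the 1 to 10 spread mentioned above, and `size_distribution` would show how common each set size is.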
I don't think I need this in my analysis, as I'm re-BLASTing my probes
to get updated coordinates and whatnot, but I'm just curious as to
what's going on there. I think this info would also be helpful for
other people who are doing different types of analyses.
I apologize if this information is already available elsewhere, but I
haven't run across it.
Thanks,
-steve