[BioC] The metadata in an Affymetrix CDF

Steve Lianoglou mailinglist.honeypot at gmail.com
Tue May 27 18:34:29 CEST 2008


Hello Bioconductors,

I've made a CDF using the makePlatformDesign package for a Drosophila  
tiling array, and I'm curious whether anyone can point me to documentation  
that explains what some of the columns in the CDF refer to.

For instance, the resulting cdf file/package has two columns that I'm  
not sure how to interpret.

1) "feature_ID" : This is a vector of ints. It looks like it is meant  
to group the perfect match and mismatch pairs together as a "unit."

 > head(cdf$feature_ID)
[1]       1       1     535     535 1781830 1781830

Tabulating the feature_IDs shows that every value in this vector occurs  
exactly twice, which reinforces my suspicion that it just groups  
perfect match/mismatch pairs.
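
For reference, the check I ran can be sketched roughly as follows. The small data frame here is only a toy stand-in for my real cdf object (the values are made up for illustration); only the "feature_ID" column name comes from the actual package.

```r
# Toy stand-in for the real cdf object (hypothetical values)
cdf <- data.frame(
  feature_ID = c(1, 1, 535, 535, 1781830, 1781830)
)

# Number of rows sharing each feature_ID
id_counts <- table(cdf$feature_ID)

# Distribution of those counts: how many IDs occur once, twice, etc.
count_distribution <- table(as.vector(id_counts))
print(count_distribution)

# TRUE when every feature_ID occurs exactly twice
all_pairs <- all(id_counts == 2)
print(all_pairs)
```

On the real object the same two calls should show whether any feature_ID breaks the "exactly two rows" pattern.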

2) "feature_set_name" : This also looks like it performs some  
grouping function, although the number of rows per feature set  
varies from 1 to 10. Are these feature sets just "probe  
sets" (grouping a bunch of probes to a single transcript), or ... ?
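
A rough sketch of how the set sizes can be inspected, again on a toy data frame rather than my real cdf (the set names and values below are hypothetical; only the two column names are from the package):

```r
# Toy stand-in for the real cdf object (hypothetical values)
cdf <- data.frame(
  feature_ID       = c(1, 1, 535, 535, 1781830, 1781830),
  feature_set_name = c("setA", "setA", "setA", "setA", "setB", "setB"),
  stringsAsFactors = FALSE
)

# Number of rows in each feature set
set_sizes <- table(cdf$feature_set_name)
print(range(as.vector(set_sizes)))

# Number of distinct feature_IDs in each feature set
ids_per_set <- tapply(cdf$feature_ID, cdf$feature_set_name,
                      function(x) length(unique(x)))
print(ids_per_set)
```

Comparing the row counts with the distinct-ID counts per set is one way to see whether a set is simply a collection of feature_ID pairs.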

I don't think I need this in my analysis, as I'm reblasting my probes  
to get updated coordinates and what not, but I'm just curious as to  
what's going on there. I think this info would also be helpful for  
other people who are doing different types of analyses.

I apologize if this information is already available elsewhere, but I  
haven't run across it.

Thanks,
-steve


