[BioC] HT HG-U133+ PM Array Plate: cdf and probe packages discrepancies
James W. MacDonald
jmacdon at med.umich.edu
Fri Jul 10 15:16:06 CEST 2009
Hi Marianne,
The cdf and probe packages we supply are simply repackaged versions of the
original Affy data. We don't add or remove anything, so any
discrepancies come from differences in the data we get from Affy.
There are several chips for which the probe and cdf data are not
consistent, although AFAIK the differences always involve control probes,
so they are not critical.
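
As a rough illustration (not the actual package contents), one way to check
that a cdf/probe mismatch is confined to control probesets is to build one
"index.probeset" key per probe on each side, take the set difference, and
test whether every affected probeset name starts with AFFX. The sketch below
uses small made-up stand-ins; cdf_list, probe_df and all values in them are
hypothetical.

```r
# Toy stand-ins for the real packages (hypothetical data, not from Affymetrix).
# cdf_list mimics a cdf environment: one matrix of probe indices per probeset.
cdf_list <- list(
  "1007_s_at"       = cbind(pm = c(101, 102, 103), mm = NA),
  "AFFX-r2-TagA_at" = cbind(pm = c(201, 202, 203), mm = NA)
)

# probe_df mimics a probe package table, with x/y already converted to an index.
probe_df <- data.frame(
  index          = c(101, 102, 103, 201),
  Probe.Set.Name = c("1007_s_at", "1007_s_at", "1007_s_at", "AFFX-r2-TagA_at"),
  stringsAsFactors = FALSE
)

# One "index.probeset" key per probe on each side.
cdf_keys <- unlist(lapply(names(cdf_list), function(ps) {
  paste(cdf_list[[ps]][, "pm"], ps, sep = ".")
}))
probe_keys <- paste(probe_df$index, probe_df$Probe.Set.Name, sep = ".")

# Probes present in the cdf data but absent from the probe data,
# reduced to the names of the probesets they belong to.
only_in_cdf  <- setdiff(cdf_keys, probe_keys)
missing_sets <- unique(sub("^[0-9]+\\.", "", only_in_cdf))

# TRUE if every affected probeset is an AFFX control probeset.
all_controls <- all(grepl("^AFFX", missing_sets, ignore.case = TRUE))
```

With the real packages, the same check would be applied to keys built from
the cdf environment and the probe data frame, along the lines of the code
quoted below.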
Best,
Jim
Marianne Tuefferd wrote:
> Dear list,
>
> I am trying to analyze Affymetrix HT HG-U133+ PM Array Plate data. I found
> some discrepancies between the cdf and probe packages: for some
> probesets, the cdf package contains more probes than the probe package.
> These probesets appear to be control probesets, but is this expected? (I
> did not find any such difference for the HG-U133Plus2 array.)
>
> Thanks a lot for your help
>
> Kind regards
>
> Marianne
>
>> sizePSinCDFnotinProbe
> AFFX-NONSPECIFICGC10_AT AFFX-NONSPECIFICGC11_AT AFFX-NONSPECIFICGC12_AT
>                     952                     960                     973
> AFFX-NONSPECIFICGC13_AT AFFX-NONSPECIFICGC14_AT AFFX-NONSPECIFICGC15_AT
>                     968                     960                     949
> AFFX-NONSPECIFICGC16_AT AFFX-NONSPECIFICGC17_AT AFFX-NONSPECIFICGC18_AT
>                     963                     942                     912
> AFFX-NONSPECIFICGC19_AT AFFX-NONSPECIFICGC20_AT AFFX-NONSPECIFICGC21_AT
>                     849                     813                     697
> AFFX-NONSPECIFICGC22_AT AFFX-NONSPECIFICGC23_AT AFFX-NONSPECIFICGC24_AT
>                     585                     407                     268
> AFFX-NONSPECIFICGC25_AT  AFFX-NONSPECIFICGC3_AT  AFFX-NONSPECIFICGC4_AT
>                       9                      25                     322
>  AFFX-NONSPECIFICGC5_AT  AFFX-NONSPECIFICGC6_AT  AFFX-NONSPECIFICGC7_AT
>                     703                     873                     914
>  AFFX-NONSPECIFICGC8_AT  AFFX-NONSPECIFICGC9_AT         AFFX-R2-TAGA_AT
>                     940                     959                      11
>         AFFX-R2-TAGB_AT         AFFX-R2-TAGC_AT         AFFX-R2-TAGD_AT
>                      11                      11                      11
>         AFFX-R2-TAGE_AT         AFFX-R2-TAGF_AT         AFFX-R2-TAGG_AT
>                      11                      11                      11
>         AFFX-R2-TAGH_AT      AFFX-R2-TAGIN-3_AT      AFFX-R2-TAGIN-5_AT
>                      11                      11                      11
>      AFFX-R2-TAGIN-M_AT       AFFX-R2-TAGJ-3_AT       AFFX-R2-TAGJ-5_AT
>                      11                      11                      11
>       AFFX-R2-TAGO-3_AT       AFFX-R2-TAGO-5_AT       AFFX-R2-TAGQ-3_AT
>                      11                      11                      11
>       AFFX-R2-TAGQ-5_AT
>                      11
>
>> unlist(lapply(PRinfoPSinCDFnotinProbe_spl, nrow))
> AFFX-NONSPECIFICGC10_AT AFFX-NONSPECIFICGC11_AT AFFX-NONSPECIFICGC12_AT
>                       1                       1                       1
> AFFX-NONSPECIFICGC13_AT AFFX-NONSPECIFICGC14_AT AFFX-NONSPECIFICGC15_AT
>                       1                       1                       1
> AFFX-NONSPECIFICGC16_AT AFFX-NONSPECIFICGC17_AT AFFX-NONSPECIFICGC18_AT
>                       1                       1                       1
> AFFX-NONSPECIFICGC19_AT AFFX-NONSPECIFICGC20_AT AFFX-NONSPECIFICGC21_AT
>                       1                       1                       1
> AFFX-NONSPECIFICGC22_AT AFFX-NONSPECIFICGC23_AT AFFX-NONSPECIFICGC24_AT
>                       1                       1                       1
> AFFX-NONSPECIFICGC25_AT  AFFX-NONSPECIFICGC3_AT  AFFX-NONSPECIFICGC4_AT
>                       1                       1                       1
>  AFFX-NONSPECIFICGC5_AT  AFFX-NONSPECIFICGC6_AT  AFFX-NONSPECIFICGC7_AT
>                       1                       1                       1
>  AFFX-NONSPECIFICGC8_AT  AFFX-NONSPECIFICGC9_AT         AFFX-R2-TAGA_AT
>                       1                       1                       1
>         AFFX-R2-TAGB_AT         AFFX-R2-TAGC_AT         AFFX-R2-TAGD_AT
>                       1                       1                       1
>         AFFX-R2-TAGE_AT         AFFX-R2-TAGF_AT         AFFX-R2-TAGG_AT
>                       1                       1                       1
>         AFFX-R2-TAGH_AT      AFFX-R2-TAGIN-3_AT      AFFX-R2-TAGIN-5_AT
>                       1                       1                       1
>      AFFX-R2-TAGIN-M_AT       AFFX-R2-TAGJ-3_AT       AFFX-R2-TAGJ-5_AT
>                       1                       1                       1
>       AFFX-R2-TAGO-3_AT       AFFX-R2-TAGO-5_AT       AFFX-R2-TAGQ-3_AT
>                       1                       1                       1
>       AFFX-R2-TAGQ-5_AT
>                       1
>
> The corresponding code is below:
>
> library(affy)
> library(hthgu133pluspmcdf)
> library(hthgu133pluspmprobe)
>
> PSn <- ls(hthgu133pluspmcdf)
> PSHT <- mget(PSn, hthgu133pluspmcdf)
> names(PSHT) <- toupper(names(PSHT))
> cdfInfo <- unlist(lapply(PSHT, function(el){el[,1]}))
> cdfInfo <- paste(cdfInfo, sub("_AT\\w*$", "_AT", names(cdfInfo)),
>                  sep = ".")
> PSn <- toupper(PSn)
>
> HTprobe <- as.data.frame(hthgu133pluspmprobe)
> HTprobe$abs <- xy2indices(HTprobe$x, HTprobe$y, nr = 744)
> HTprobe$Probe.Set.Name <- toupper(HTprobe$Probe.Set.Name)
> ProbeInfo <- paste(HTprobe$abs, HTprobe$Probe.Set.Name, sep = ".")
>
> length(unlist(lapply(PSHT, function(el){el[,1]}))) ==
>   length(HTprobe$abs) ## FLAG!!
> length(intersect(ProbeInfo, cdfInfo))
> length(setdiff(ProbeInfo, cdfInfo))
> length(setdiff(cdfInfo, ProbeInfo))
> ## in common 519200 probe absolute positions
>
> PSlocinCDFnotinProbe <- setdiff(cdfInfo, ProbeInfo)
> PSinCDFnotinProbe <- unique(sub("^.*\\.", "", PSlocinCDFnotinProbe))
> sizePSinCDFnotinProbe <- listLen(PSHT[PSinCDFnotinProbe])/2
> names(sizePSinCDFnotinProbe) <- PSinCDFnotinProbe
>
> PRinfoPSinCDFnotinProbe <-
>   HTprobe[HTprobe$Probe.Set.Name %in% PSinCDFnotinProbe, ]
> PRinfoPSinCDFnotinProbe_spl <- split(PRinfoPSinCDFnotinProbe,
>                                      PRinfoPSinCDFnotinProbe$Probe.Set.Name)
> unlist(lapply(PRinfoPSinCDFnotinProbe_spl, nrow))
>
>
>
> PS: my sessionInfo is:
>
>> sessionInfo()
> R version 2.9.0 (2009-04-17)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] hthgu133pluspmprobe_2.4.0 AnnotationDbi_1.6.1
> [3] hthgu133pluspmcdf_2.4.0   affy_1.22.0
> [5] Biobase_2.4.1
>
> loaded via a namespace (and not attached):
> [1] affyio_1.12.0        DBI_0.2-4            preprocessCore_1.6.0
> [4] RSQLite_0.7-1        tools_2.9.0
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826