[BioC] Integrating hgu133a and hgu133a2

Robert Gentleman rgentlem at fhcrc.org
Wed May 7 17:44:54 CEST 2008


Hi Chintanu,
   The likely reason your post got no reply is that this issue has been 
discussed many times. Searching the mailing list archive will let you 
read those discussions and decide what you want to do.
   My view is that you should not combine the arrays this way, but rather 
normalize like with like (each chip type on its own) and then use 
appropriate statistical models to integrate the resulting data sets. 
Others hold different opinions, and many alternatives have been suggested.
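
   A minimal sketch of that idea in base R, using simulated data only. The
probeset IDs, the median-centering step, the `group` factor, and the helper
names (`center_arrays`, `fit_one`) are all assumptions made for illustration;
real data would be preprocessed per chip type with whatever method you
prefer, and the model would reflect your actual design.

```r
set.seed(1)

# Simulated log2 expression matrices (probesets x samples), one per chip type.
# Sample counts mirror the post (14 and 6); the IDs are hypothetical.
ids_a <- paste0("ps", 1:60)
ids_b <- paste0("ps", 11:70)
expr_a <- matrix(rnorm(60 * 14, mean = 8), nrow = 60,
                 dimnames = list(ids_a, paste0("A", 1:14)))
expr_b <- matrix(rnorm(60 * 6, mean = 9), nrow = 60,
                 dimnames = list(ids_b, paste0("B", 1:6)))

# Step 1: normalize within each chip type. Per-array median centering is a
# stand-in here for a real preprocessing method.
center_arrays <- function(x) sweep(x, 2, apply(x, 2, median))
norm_a <- center_arrays(expr_a)
norm_b <- center_arrays(expr_b)

# Step 2: keep only the probesets present on both chip types.
common <- intersect(rownames(norm_a), rownames(norm_b))
combined <- cbind(norm_a[common, ], norm_b[common, ])

# Step 3: model the chip type explicitly next to the effect of interest.
platform <- factor(rep(c("hgu133a", "hgu133a2"),
                       c(ncol(norm_a), ncol(norm_b))))
group <- factor(rep(c("ctrl", "trt"), length.out = ncol(combined)))  # hypothetical

# Per-probeset linear model; returns the group effect adjusted for platform.
fit_one <- function(y) coef(lm(y ~ group + platform))["grouptrt"]
effects <- apply(combined, 1, fit_one)
```

   The point of the design is that the chip type enters the model as a
covariate instead of being hidden by forcing both array types into one
AffyBatch.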

   Robert


Chintanu wrote:
> Hi All,
> 
> I'm repeating my earlier post below, as it didn't receive any reply.
> If you have worked with the mixture CDF environment, I would appreciate it
> if you could give me some direction, please.
> 
> --------
> I'm trying to combine hgu133a and hgu133a2 for a single downstream analysis.
> This is my first attempt, and I recently came across the idea of the
> mixture CDF environment. I went down this road on the assumption that it is
> perhaps "the" option I have, but something appears to be going wrong
> (as shown below).
> 
> Any suggestions and advice would indeed be valuable.
> 
> Thanks.
> 
> Chintanu
> 
> 
> 
>> sessionInfo()
> R version 2.7.0 (2008-04-22)
> i386-pc-mingw32
> 
> locale:
> 
> attached base packages:
> [1] splines   tools     stats     graphics  grDevices utils     datasets  methods   base
> 
> other attached packages:
>  [1] hgu133acdf_2.2.0     hgu133aprobe_2.2.0   hgu133a2probe_2.2.0  hgu133a2cdf_2.2.0    matchprobes_1.12.0   MergeMaid_2.12.0     MASS_7.2-41
>  [8] survival_2.34-1      affy_1.18.0          preprocessCore_1.2.0 affyio_1.8.0         Biobase_2.0.0
> 
> 
>> cdfName(Data_A)                              # Data_A is ReadAffy() of a set of CEL files
> [1] "HG-U133A"
> 
>> cdfName(Data_B)                              # Data_B is ReadAffy() of another set of CEL files
> [1] "HG-U133A_2"
> 
> 
>> comBatch <- combineAffyBatch(list(Data_A, Data_B), c("hgu133aprobe",
> "hgu133a2probe"),
> newcdf = "hgu133aa2")
> package:hgu133aprobe    hgu133aprobe
> package:hgu133a2probe   hgu133a2probe
> 241837 unique probes in common
> 
>> Data_C=comBatch$dat
>> hgu133aa2cdf=comBatch$cdf
> 
>> "@"(Data_C, "cdfName", "hgu133aa2cdf")
> [1] "hgu133aa2"
>> "@"(Data_C, "cdfName")
> [1] "hgu133aa2"
> 
> 
>> class (Data_C)                                # Same for Data_A and Data_B
> [1] "AffyBatch"
> attr(,"package")
> [1] "affy"
> 
> 
> 
>> summary (Data_C)
>   Length     Class      Mode
>       20 AffyBatch        S4
> 
>> summary (Data_B)
>   Length     Class      Mode
>        6 AffyBatch        S4
> 
>> summary (Data_A)
>   Length     Class      Mode
>       14 AffyBatch        S4
> 
> 
>> show (Data_A)
> AffyBatch object
> size of arrays=712x712 features (8 kb)
> cdf=HG-U133A (22283 affyids)
> number of samples=14
> number of genes=22283
> annotation=hgu133a
> notes=
> 
>> show (Data_B)
> AffyBatch object
> size of arrays=732x732 features (8 kb)
> cdf=HG-U133A_2 (22277 affyids)
> number of samples=6
> number of genes=22277
> annotation=hgu133a2
> notes=
> 
> 
>> show (Data_C)
> AffyBatch object
> size of arrays=0x0 features (8 kb)
> cdf=hgu133aa2 (??? affyids)
> number of samples=20
> Error in getCdfInfo(object) :
>  Could not obtain CDF environment, problems encountered:
> Specified environment does not contain hgu133aa2
> Library - package hgu133aa2cdf not installed
> Data for package affy did not contain hgu133aa2cdf
> Bioconductor - hgu133aa2cdf not available
> In addition: Warning message:
> missing cdf environment! in show(AffyBatch)
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


