[Bioc-devel] BioC 2.5: Added scanDates slot to Biobase's eSet class

Patrick Aboyoun paboyoun at fhcrc.org
Thu Jun 18 09:11:24 CEST 2009


Laurent,
We had some immediate need for scan date information and rather than  
overbuild a system for managing metadata that we may or may not need,  
we opted to start simply and then build up as appropriate. There have
been some internal discussions about managing other metadata along
with scan dates, but nothing else has bubbled to the top yet. Your  
thoughts and design can help speed up this process. The class  
versioning system in Biobase supports iterative development and we can  
make further changes once we lock a design in place. One editorial
comment I have is that many designs are possible for a given need.
For example, the current class properly subsets the scanDates
information using "[" even though that information is not stored in
the phenoData (AnnotatedDataFrame) slot.
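
To make the "[" point concrete, here is a minimal sketch. It is not
Biobase's actual implementation: the ToySet class, its slot layout,
the sample names, and the date strings are all made up for
illustration. It only shows how a "[" method can keep a separate
per-sample slot in step with the phenotype table when subsetting on
columns.

```r
library(methods)

# Hypothetical toy container, NOT the real eSet definition.
setClass("ToySet",
         representation(exprs = "matrix",         # features x samples
                        pheno = "data.frame",     # one row per sample
                        scanDates = "character"), # one element per sample
         validity = function(object) {
           n <- ncol(object@exprs)
           if (nrow(object@pheno) != n || length(object@scanDates) != n)
             "pheno rows and scanDates length must equal the number of samples"
           else
             TRUE
         })

# Column (sample) subsetting applies the same index j to every
# per-sample component, wherever it happens to be stored.
setMethod("[", "ToySet", function(x, i, j, ..., drop = FALSE) {
  if (missing(i)) i <- seq_len(nrow(x@exprs))
  if (missing(j)) j <- seq_len(ncol(x@exprs))
  new("ToySet",
      exprs = x@exprs[i, j, drop = FALSE],
      pheno = x@pheno[j, , drop = FALSE],
      scanDates = x@scanDates[j])
})

# Made-up sample names and scan dates, for illustration only.
samples <- c("s1.cel", "s2.cel", "s3.cel")
toy <- new("ToySet",
           exprs = matrix(1:6, nrow = 2, dimnames = list(NULL, samples)),
           pheno = data.frame(group = c("ctl", "trt", "ctl"),
                              row.names = samples),
           scanDates = setNames(c("01/01/09 10:00:00",
                                  "01/02/09 11:00:00",
                                  "01/03/09 12:00:00"), samples))

sub <- toy[, c(1, 3)]
sub@scanDates   # scan dates for s1.cel and s3.cel only
```

The design choice here is that the bookkeeping lives in the "[" method
rather than in the storage location, so a dedicated slot and a
phenoData column can behave the same from the user's side.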


Cheers,
Patrick


Quoting Laurent Gautier <laurent at cbs.dtu.dk>:

> Hi Patrick,
>
> Storing the scan dates is indeed useful information, and it is nice to
> have it offered at the parsing stage.
> However, my first comment would be: does it justify a new slot in eSet?
>
> I have been storing scan dates for quite some time now, but opted for
> having them in the phenoData as it made more sense to me, both from an
> implementation standpoint and from a practical standpoint (as standard
> extraction of an eset-subset on columns with the "[" operator works).
>
> If something specific for scan dates is really wanted,
> would it make sense to provide it by extending AnnotatedDataFrame?
>
> In my opinion, the stage at which the data are extracted (in this
> case, when parsing the files coming out of the image analysis) should
> not dictate where the data are stored.
> In fact, it might make for a nice(r) workflow if the function
> reading raw array data could return an eSet-inheriting instance and a
> phenoData with information such as dates and file names. I am working
> on a workflow that in fact gets much more data from the header (I
> suppose I will contribute it once I have enough time to wrap it up).
>
>
> Just a few thoughts,
>
>
>
> L.
>
>
>
>
>
> Patrick Aboyoun wrote:
>> Dear Bioconductor developers,
>> The Biocore group has just committed a change to the BioC 2.5 code   
>> line (Biobase version 2.5.3) to support the use of microarray scan   
>> date in statistical analyses by adding a scanDates slot to   
>> Biobase's eSet class. This information can be retrieved and set
>> using the new scanDates and scanDates<- functions, respectively.
>> The scanDates slot is designed to hold a character vector whose
>> length equals the number of samples, with one character element for
>> each sample. (See help(scanDates) for more information.)
>>
>> In this first round of check-ins we have added support for this new
>> slot to affy functions such as ReadAffy, and we will be working
>> towards adding this information for other microarray platforms as
>> well.
>>
>> This change involved bumping the eSet version number from 1.1.0 to
>> 1.2.0 in the Biobase class definition. To minimize the impact of
>> this change, the Biobase methods support both the current eSet
>> version 1.2.0 and old serialized 1.1.0 objects, so eSet-derived
>> objects do not need to be passed through updateObject before use in
>> other functions. We have also tested and version-bumped (and patched
>> where needed) the following packages that create eSet-derived
>> classes to minimize any package build issues: ACME, beadarray,
>> beadarraySNP, cellHTS2, CGHbase, codelink, crlmm, GeneRegionScan,
>> GGBase, maDB, oligoClasses, ontoTools, puma, rMAT, SNPchip, and
>> spkTools.
>>
>> Below is a demonstration of the new functionality. If you encounter  
>>  any issues related to this change, please e-mail this list so the   
>> community can monitor the change.
>>
>> - The Biocore Team
>>
>>
>>> suppressMessages(library(affy))
>>> example(ReadAffy)
>>
>> RdAffy> if(require(affydata)){
>> RdAffy+      celpath <- system.file("celfiles", package="affydata")
>> RdAffy+      fns <- list.celfiles(path=celpath,full.names=TRUE)
>> RdAffy+
>> RdAffy+      cat("Reading files:\n",paste(fns,collapse="\n"),"\n")
>> RdAffy+      ##read a binary celfile
>> RdAffy+      abatch <- ReadAffy(filenames=fns[1])
>> RdAffy+      ##read a text celfile
>> RdAffy+      abatch <- ReadAffy(filenames=fns[2])
>> RdAffy+      ##read all files in that dir
>> RdAffy+      abatch <- ReadAffy(celfile.path=celpath)
>> RdAffy+ }
>> Loading required package: affydata
>> Reading files:
>> /Library/Frameworks/R.framework/Versions/2.10/Resources/library/affydata/celfiles/binary.cel   
>> /Library/Frameworks/R.framework/Versions/2.10/Resources/library/affydata/celfiles/text.cel
>>> scanDates(abatch)
>>        binary.cel            text.cel
>> "01/23/04 14:30:57" "08/29/03 15:12:30"
>>> sessionInfo()
>> R version 2.10.0 Under development (unstable) (2009-06-12 r48755)
>> i386-apple-darwin9.6.0
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] affydata_1.11.6 affy_1.23.2     Biobase_2.5.3
>>
>> loaded via a namespace (and not attached):
>> [1] affyio_1.13.3        preprocessCore_1.7.4 tools_2.10.0
>>
>> _______________________________________________
>> Bioc-devel at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel


