[BioC] scan date information
Mark Cowley
m.cowley at garvan.org.au
Tue Sep 22 03:17:20 CEST 2009
Hi,
These scripts work OK for me on OSX (only on TXT CEL files, not the
latest binary ones). I haven't gotten around to writing a version that
uses the Fusion SDK.
Mark
celDate.sh
#!/bin/bash
#
# Determine the date that the CEL file was created, from the CEL file
# header
# eg "06/05/08 12:05:36"
#
# Mark Cowley, 2008-07-28
#
grep -m1 -a '^DatHeader' "$@" | grep -E -o '[0-9]{2}/[0-9]{2}/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'
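
When several files are passed at once, the pipeline above prints only the dates, so a file with no DatHeader line is silently skipped and the output no longer lines up with the input. A minimal sketch that keeps each file name next to its date follows. The function name cel_dates and the NA placeholder are illustrative choices, not part of the original script, and it assumes text-format CEL files as above.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post): print "<file><TAB><date>"
# for each text-format CEL file, or "NA" when no DatHeader date is found.
cel_dates() {
    local f d
    for f in "$@"; do
        # First DatHeader line only, then the first MM/DD/YY HH:MM:SS stamp in it.
        # "|| true" keeps the loop going if the pipeline reports a failure.
        d=$(grep -m1 -a '^DatHeader' "$f" |
            grep -E -o '[0-9]{2}/[0-9]{2}/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}' |
            head -n 1) || true
        printf '%s\t%s\n' "$f" "${d:-NA}"
    done
}
```

For example, `cel_dates *.CEL` prints one tab-separated line per file.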
-- or --
celDate.R
# Extract the CEL file creation date stamp from within the CEL file
# header.
#
# Mark Cowley, 2008-07-29
celDate <- function(files) {
    stopifnot( all(file.exists(files)) )
    # shQuote() protects file names containing spaces or shell metacharacters
    files <- paste(shQuote(files), collapse=" ")
    cmd <- paste("grep -m1 -a '^DatHeader'", files,
        "| egrep -o '[0-9]{2}/[0-9]{2}/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'")
    dates <- system(cmd, intern=TRUE)
    dates
}
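
For the newer binary CEL files, one option that avoids the Fusion SDK is to strip the NUL bytes and search for the affymetrix-scan-date tag that Rob quotes below. This is only a sketch: it assumes the tag and its value are stored as 16-bit characters (a NUL byte before each ASCII letter), which would also explain why grep needed the dots in d.a.t.e., and the function name cel_scan_date is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post): print the ISO scan
# timestamp found after the "affymetrix-scan-date" tag in a binary CEL file.
# Assumption: strings are stored as 16-bit characters, so deleting NUL bytes
# leaves plain ASCII text that grep can match.
cel_scan_date() {
    LC_ALL=C tr -d '\000' < "$1" |
        # Allow a few non-text bytes (e.g. a length field) between tag and value.
        LC_ALL=C grep -a -o -E 'affymetrix-scan-date.{0,8}[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z' |
        head -n 1 |
        # Keep only the timestamp itself.
        LC_ALL=C grep -o -E '[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$'
}
```

The function exits with a non-zero status and prints nothing when no tag is found, so callers can distinguish text-format files and fall back to the DatHeader approach.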
-----------------------------------------------------
Mark Cowley, PhD
Peter Wills Bioinformatics Centre
Garvan Institute of Medical Research, Sydney, Australia
-----------------------------------------------------
On 22/09/2009, at 8:42 AM, Rob Dunne wrote:
> Thanks for that.
>
> I am not using the development version yet but I will look out for
> the new slot.
>
> Saroj, your method doesn't work for me, perhaps your cel file is
> ascii?
> strings SB_20D.CEL | grep DatHeader
>
> However, I have found
> $ grep --text d.a.t.e. SB_20D.CEL
> text/plainaffymetrix-scan-date(2008-04-03T04:45:53Z
>
> the "--text" option makes grep read a binary file as though it were
> text. I am not sure why I need the dots in date.
>
> Bye
> Rob
>
>
>
> Patrick Aboyoun wrote:
>> Robert,
>> The answer depends on which version of R and BioC you are using. If
>> you are using R <= 2.9, BioC <= 2.4, you will need to devise your
>> own method; one such method was given by Saroj. If you are using
>> R-devel and BioC 2.5 (devel), the eSet abstract class and its derived
>> classes such as ExpressionSet contain a new slot called
>> protocolData that contains an AnnotatedDataFrame object. This slot
>> is to be populated by metadata contained in microarray data files.
>> In BioC 2.5 (devel) the read.affybatch from affy and read.celfile
>> from affyio add a ScanDate column to the protocolData slot with the
>> metadata you are looking for.
>> Cheers,
>> Patrick
>> Saroj K Mohapatra wrote:
>>> Hi Rob:
>>>
>>> I have a file called _16.CEL. I want to find out the date
>>> information in its header. The following gives me:
>>>
>>> $ strings _16.CEL | grep DatHeader
>>> DatHeader=[2..65534] _16:CLS=7365 RWS=7365 XIN=1 YIN=1
>>> VE=30 2.0 10/27/06 10:57:45 50207590 M10
>>>
>>> I find a date 10/27/06. Is this what you are looking for?
>>>
>>> Best,
>>>
>>> Saroj
>>>
>>>
>>> Robert Dunne wrote:
>>>> Hi List,
>>>>
>>>> I apologise for what may be a very simple question. How can I
>>>> retrieve the scan date information from cel files?
>>>>
>>>> I can find the information using some editors, kate under linux
>>>> shows
>>>> "a f f y m e t r i x - s c a n - d a t e ( 2 0 0 8 - 0 4 - 0 3"
>>>> but I can't find it at all using vi or emacs. I suppose this is
>>>> something to do with encoding.
>>>> Also "string file.cel | grep "d a t e"" does not work.
>>>>
>>>> I have tried the affxparser library but
>>>> readCelHeader("file.cel")
>>>> does not pick up the date.
>>>>
>>>> Unfortunately in many experiments the scan date turns out to be
>>>> the major effect.
>>>>
>>>> Bye
>>>> Rob
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>