[BioC] Affy's .DTT format?
Kasper Daniel Hansen
khansen at stat.Berkeley.EDU
Tue May 23 07:27:37 CEST 2006
On May 22, 2006, at 4:30 PM, Kasper Daniel Hansen wrote:
> On May 22, 2006, at 8:56 AM, Jenny Drnevich wrote:
>
>> Hi everyone,
>>
>> Our core facility was told by our Affy rep they should switch to the .DTT
>> flat archive file instead of the CAB format as the way to store data and to
>> disseminate it to researchers. However, this file format is basically
>> useless right now without Affy's Data Transfer Tool and GCOS software; to
>> get the .CEL files, I had to use the Data Transfer Tool to import the .DTT
>> file into my local GCOS database, then use it again to export the flat .CEL
>> files. I've talked with our core, and instead they are going to give out
>> the .CEL, .CHP and .DAT flat files, at least for now. I noticed that Affy
>> does have a DTT SDK, and wondered if anyone within Bioconductor was
>> planning on/working on being able to import data from the .DTT format?
>> I'm not sure what our rep was thinking, because I don't think any analysis
>> software out there can take .DTT files (yet?).
>
> Hi Jenny
>
> The SDK we are interfacing to in affxparser contains methods for
> parsing DTT files. I did not really know of this format until this
> morning, and I am still trying to figure out whether it really makes
> sense to support. Basically it seems like a way to bundle cel, chp
> and dat files, and stuff I have read on the Affy website seems to
> indicate that it primarily makes sense for people wanting to share
> _all_ data and subsequently use it in some of Affy's own software.
> Eg. they provide both "flat" cel files and dtt-bundled cel files for
> some of their SNP data. It also seems like there exists both a dtt
> format and a "flat" dtt format - whatever the difference is. I will
> try to investigate a bit (and if you hear something please write).
>
> Adding support for the DTT file to affxparser would be "simple" but
> probably take considerable time since that part of the SDK depends on
> external libraries we would then have to figure out how to deploy
> etc. My first impression is that it is not worth the effort to do
> this (unless we are forced :)
>
> My final verdict will depend on my investigation, and - of course -
> whether there is a large interest in this...
Some investigation and an email to devnet at affy uncovered that the big
DTT bundle is in fact just a zip file containing EXP, CEL and CHP
files for each array. There seems to be an extra base file containing
the CDF file. DAT files are not included.
The DTT format I was referring to is an XML-based format describing
some details about the experiment in a MAGE-ML-like structure.
So I will not do anything about adding parsing capability - just use
your favorite zip program. It is not worth the time and effort to
parse DTT files using the SDK.
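
For anyone who would rather script the extraction than use a zip program, here is a minimal sketch in Python. It assumes, as described above, that the big DTT bundle is an ordinary zip archive holding EXP, CEL and CHP files (plus a CDF in the base file). The function name extract_dtt and the extension list are made up for illustration and are not part of any Affymetrix tool or of affxparser.

```python
import zipfile
from pathlib import Path

# Extensions expected inside a DTT bundle according to the description above.
# This list is an assumption; adjust it if your bundles contain other files.
ARRAY_EXTENSIONS = {".exp", ".cel", ".chp", ".cdf"}


def extract_dtt(dtt_path, out_dir):
    """Extract array files from a DTT bundle, assuming it is a plain zip archive.

    Returns the sorted list of archive member names that were extracted.
    """
    dtt_path = Path(dtt_path)
    out_dir = Path(out_dir)

    if not zipfile.is_zipfile(dtt_path):
        raise ValueError(f"{dtt_path} does not look like a zip archive")

    extracted = []
    with zipfile.ZipFile(dtt_path) as bundle:
        # testzip() returns the name of the first member that fails its
        # CRC check, or None if all members read back cleanly.
        bad_member = bundle.testzip()
        if bad_member is not None:
            raise ValueError(f"corrupted member in {dtt_path}: {bad_member}")

        for name in bundle.namelist():
            # Keep only the array files; skip anything else in the bundle.
            if Path(name).suffix.lower() in ARRAY_EXTENSIONS:
                bundle.extract(name, out_dir)
                extracted.append(name)

    return sorted(extracted)
```

The extracted .CEL files can then be read as usual, for instance with affy or affxparser in R. The CRC check is there so that a damaged bundle fails loudly rather than yielding truncated files.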
As a small curiosity, some of the DTT files for the 100K SNP arrays
available on Affy's site are corrupted - look at the 500K ones instead.
/Kasper
> /Kasper
>
>> Cheers,
>> Jenny
>>
>> Jenny Drnevich, Ph.D.
>>
>> Functional Genomics Bioinformatics Specialist
>> W.M. Keck Center for Comparative and Functional Genomics
>> Roy J. Carver Biotechnology Center
>> University of Illinois, Urbana-Champaign
>>
>> 330 ERML
>> 1201 W. Gregory Dr.
>> Urbana, IL 61801
>> USA
>>
>> ph: 217-244-7355
>> fax: 217-265-5066
>> e-mail: drnevich at uiuc.edu
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor