[BioC] Integrating Codelink data with bioconductor (using affyand
Diego Díez Ruiz
ddiez at iib.uam.es
Mon Apr 25 15:15:22 CEST 2005
Gordon Smyth wrote:
> At 09:35 PM 25/04/2005, Diego Díez Ruiz wrote:
>> Dear Gordon,
>> Thanks for your response. I will use the data as before, but which do
>> you think would affect the normalization process more: some points
>> assigned NA values, or some points with lower A values because one of
>> the intensities was assigned a value of, say, 0.01?
> Unless you're doing much more than I think you are, you must avoid NAs
> at all costs. If you have to live with low intensities, then so be it.
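The two options in the question can be compared on a toy example. This is only a minimal sketch in R: the intensity vectors are made up, and the 0.01 floor is the value mentioned in the question, not a recommendation.

```r
# Hypothetical raw intensities for the same five probes on two arrays;
# the zero stands in for a spot at or below background.
x1 <- c(120, 0, 85, 300, 40)
x2 <- c(100, 60, 90, 280, 35)

# Option 1: mark the non-positive intensity as missing.
x1.na <- ifelse(x1 > 0, x1, NA)

# Option 2: replace it with a small positive floor.
x1.floor <- pmax(x1, 0.01)

# M and A values between the two arrays, as used in an MA-plot.
ma <- function(a, b) {
  cbind(M = log2(a) - log2(b), A = (log2(a) + log2(b)) / 2)
}

ma.na <- ma(x1.na, x2)
ma.floor <- ma(x1.floor, x2)

# With NA the probe has no M or A value at all; with the floor it
# keeps a finite, but very low, A value.
sum(is.na(ma.na[, "A"]))
ma.floor[2, "A"]
```

In the floored version every probe still has finite M and A values, which is the property the advice above cares about.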
>> I will let you see my class definition and parser, of course. This is
>> really the first time I have used classes and stored everything as an
>> R package, so I thought that the best way to make something usable
>> quickly, without having to read "Writing R Extensions" completely, was
>> to learn from other packages (that is one of the great things about
>> open source :). Of course I will have to read it one day.
>> 1. The parser reads exported txt files from the Codelink software.
> I've never seen Codelink output, but my understanding is that it is
> essentially just ImaGene output. Is that not correct?
I've never seen ImaGene output. This is the header and column names from
CodeLink Expression Analysis 184.108.40.206054
CNIC Report for Slide (T00241792)
PRODUCT Human Whole Genome
Sample Name Array 1 Sample001
Median Array 1 86,6547470092773
Report( 1 ): 310105-Person
Idx Array Sample_name Probe_name Annotation_PIN Annotation_NCBI_Acc
Annotation_NCBI_NID Annotation_LocusLink Annotation_OGS
Annotation_UniGene Annotation_ENSEMBL Probe_type Feature_id
Raw_intensity Normalized_intensity Quality_flag Signal_strength
Logical_row Logical_col Center_X Center_Y Spot_mean Spot_median
Spot_stdev Spot_area Spot_diameter Spot_noise_level Bkgd_mean
Bkgd_median Bkgd_stdev Bkgd_area Annotation_Molecular_Function
Annotation_Cytoband Annotation_HS_Homology Annotation_MM_Homology
The header can be fewer than 10 rows (custom) and the columns can be
customized (for example, in my own data I leave out the Annotation_* and
Description columns). I'm not sure if in this example there are all the
>> It works fine with 3 different chips so I think it should work fine
>> with all types. A problem is that the exported text data have custom
>> fields (you can choose among all fields, including Raw_intensity,
>> Median_foreground, etc.), so it is possible to find files where
>> some fields were not exported. I know that it is possible to export as
>> XML but I haven't tried that yet.
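One way to cope with a header of variable length and with customized columns can be sketched as follows. This is not the parser discussed in this thread: `readCodelinkText` is a hypothetical helper, and it assumes a tab-delimited export, a column-name row that starts with "Idx", and decimal commas as in the Median line of the example header. The sample lines, probe names and flag values are made up.

```r
# Hypothetical helper: split an exported file into free-form header
# and data table, and fail early if a required column was not exported.
readCodelinkText <- function(lines,
                             required = c("Probe_name", "Raw_intensity")) {
  # Locate the column-name row instead of assuming a fixed header length.
  start <- grep("^Idx\t", lines)
  if (length(start) == 0)
    stop("no column-name row starting with 'Idx' found")
  header <- lines[seq_len(start[1] - 1)]
  tab <- read.delim(text = lines[start[1]:length(lines)],
                    header = TRUE, sep = "\t", dec = ",",
                    stringsAsFactors = FALSE)
  absent <- setdiff(required, names(tab))
  if (length(absent) > 0)
    stop("missing required column(s): ", paste(absent, collapse = ", "))
  list(header = header, data = tab)
}

# Made-up input with a two-line header and only four exported columns.
example <- c(
  "PRODUCT\tHuman Whole Genome",
  "Sample Name\tArray 1\tSample001",
  "Idx\tProbe_name\tRaw_intensity\tQuality_flag",
  "1\tprobeA\t86,65\tG",
  "2\tprobeB\t12,10\tL"
)
res <- readCodelinkText(example)
res$data$Raw_intensity
```

For a real file the lines would come from `readLines()`, and the `required` vector would list whichever fields are treated as must-have.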
>> 2. The class definition is very simple. I based it on RGList and used
>> almost all the redefinitions of dim(), as.matrix(), etc. that you use in
>> limma. I also based the subsetting system on the one used for AffyBatch
>> objects in affy. A Codelink object stores 3 matrices as a list: one of
>> intensities, one of flags and a last one with probe name and probe type.
>> I currently name them the "val", "flags" and "info" slots, but I don't
>> think those names are appropriate, so this week I want to import all
>> possible fields and name them as they are called in the exported files.
>> I will probably also check which fields are present and warn or raise
>> an error if a *must have* field is missing.
>> When I have clearer and cleaner code I will have no problem
>> letting you see it.
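The structure described in item 2 could look roughly like the following. This is only a sketch under assumptions, not the actual class: it uses a plain S3 list with the "val", "flags" and "info" names quoted above, and the constructor `newCodelink` and the example values are hypothetical.

```r
# Hypothetical constructor: three matrices stored as a classed list.
newCodelink <- function(val, flags, info) {
  stopifnot(is.matrix(val), is.matrix(flags), is.matrix(info),
            identical(dim(val), dim(flags)),
            nrow(info) == nrow(val))
  structure(list(val = val, flags = flags, info = info),
            class = "Codelink")
}

# Report the dimensions of the intensity matrix (probes x arrays).
dim.Codelink <- function(x) dim(x$val)

# Return the intensity matrix itself.
as.matrix.Codelink <- function(x, ...) x$val

# Subset probes (i) and arrays (j) consistently across all components;
# the probe information has one row per probe, so only i applies to it.
"[.Codelink" <- function(x, i, j, ...) {
  if (missing(i)) i <- seq_len(nrow(x$val))
  if (missing(j)) j <- seq_len(ncol(x$val))
  newCodelink(val = x$val[i, j, drop = FALSE],
              flags = x$flags[i, j, drop = FALSE],
              info = x$info[i, , drop = FALSE])
}

# Made-up data: three probes on two arrays.
val <- matrix(c(10, 20, 30, 40, 50, 60), nrow = 3,
              dimnames = list(NULL, c("array1", "array2")))
flags <- matrix("G", nrow = 3, ncol = 2)
info <- cbind(Probe_name = c("p1", "p2", "p3"),
              Probe_type = c("DISCOVERY", "DISCOVERY", "POSITIVE"))

cl <- newCodelink(val, flags, info)
dim(cl)
sub <- cl[1:2, 2]
dim(sub)
```

Keeping `drop = FALSE` in the subsetting method means a single-array subset is still a matrix, so `dim()` and `as.matrix()` keep working on the result.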