[BioC] Difference between number of probes and number of data rows using 'oligo' on Affy miRNA v3.0 arrays
James W. MacDonald
jmacdon at uw.edu
Mon Feb 25 15:34:29 CET 2013
Hi Vicky,
On 2/24/2013 8:03 PM, Vicky Fan wrote:
> Dear all,
> I am using the 'oligo' package to process data from Affymetrix miRNA v3.0 arrays. When I extract the probe names as follows, I get 243982 probes:
>
>
>> library(oligo)
>> celFiles<- list.celfiles()
>> rawData<- read.celfiles(celFiles)
>> pNames<- probeNames(rawData)
>> exprs.rawData<- exprs(rawData)
>
>
> However, extracting the data itself gives me a different number of rows:
>
>
>
>> length(pNames)
> [1] 243982
>
>> dim(exprs.rawData)
> [1] 292681 6
>
> I’ve verified that this result occurs using the sample CEL files from the Affymetrix website here (although there is a login required):
>
> http://www.affymetrix.com/Auth/support/downloads/demo_data/mirna_3_sample_data.zip
>
> Shouldn’t the number of probes in the CEL file be the same as the number of rows in the dataset? I’m aware that the exprs function is for objects of type eSet and that read.celfiles returns an ExpressionFeatureSet object, not an eSet object, so maybe this has something to do with the non-matching numbers.
There are a large number of probes around the perimeter of the array (as
well as some blocks of probes in the middle) that are primarily used for
aligning the scanner to the array. Since these probes don't measure
anything of interest (they are oligo-dT), they are not used in any
further calculations.
The difference arises because every feature on the array is scanned,
and all of those data are stored in the celfile, so the dimensions of
the raw data reflect these extra probes as well. Since they aren't used
for anything else, the probe names you extract cover only the probes
that are intended to measure various transcripts.
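If you want to see the two counts side by side, something like the
following should do it. This is only a rough sketch that assumes the CEL
files are in the working directory, as in your code; the variable names
nFeatures and nProbes are just mine for illustration, and the last line
reflects my expectation that pm() lines up with probeNames() rather than
something I have checked on this array type.

```r
library(oligo)

rawData <- read.celfiles(list.celfiles())

# Every feature scanned on the array, including the alignment ones
nFeatures <- nrow(exprs(rawData))

# Only the probes that are assigned to a probeset
nProbes <- length(probeNames(rawData))

# Features with no probeset assignment;
# with the numbers in your post: 292681 - 243982 = 48699
nFeatures - nProbes

# The PM intensity matrix should have one row per entry in probeNames()
dim(pm(rawData))
```

So if you need intensities that match up with the probe names, the
output of pm() is probably the thing to work with, rather than the full
exprs() matrix.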
Best,
Jim
>
> Regards,
> Vicky
>
> --
> Vicky Fan
> Research Programmer
> Bioinformatics Institute
> School of Biological Sciences
> University of Auckland
> Ph: 09 373 7599 x 83777
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099