[BioC] Limma using all probes on Affy chip

Mon Aug 8 05:37:22 CEST 2005

Hello,

     I'm sorry if this question went to the wrong place, but I would be ecstatic
if someone had the answer to it.  I've been through most of the Bioconductor
site and online workshops and have read the vignettes and function help pages
for all the relevant packages, but have been unable to find out how to use all
the probe cells on affy chips in differential expression analysis packages in
bioconductor.  My data contains two groups with only two replicates each (and I
am temporarily unable to come up with funds for the last two arrays), so it
would be very beneficial for me to be able to use all 11 (or 16, etc.) probes
in each array in order to get p-values, rather than just having two expression
values to use in each group.  Affymetrix's GCOS has the ability to perform
Wilcoxon on these values for each probe set, but 1. I want to also perform
quantile normalization on my data, and 2. I want to use empirical Bayes to
determine the significance of differential expression calls, and GCOS cannot
do those.  Also, GCOS can only compare two chips at a time.  Here is what I
know:
      The Limma package needs the data to be in an object of the class "exprSet"
with the samples as columns and the genes as rows.  If I use the affy package
and perform bg.correct() and normalize.AffyBatch.quantiles(), I can get my
processed probe values.  But now my probe values are lined up in columns, and I
want to treat each probe in a probeset as a separate sample (so that I can get
a p-value out of them), which means that I need the corresponding probes for
each gene in one row, not 11 different rows.  Since affy can find all the
probes for a probeset (evidenced by both pmindex() and the fact that it can
yield expression values), there should be a way to make matrices for all the
arrays with the probe sets (instead of expression values) as rows that I could
plug into limma.  I'm aware that some probe sets have more probes than others,
but would it work to fill in extra values as "NA" so that the matrix is "full"?
Can anyone point me in the right direction as to how to create this
matrix?  Finally, where is the phenoData file that I would need to turn the
matrix back into an exprSet after it goes through limma?  (Answering the
previous question will probably answer this as well.)  Sorry for the
long-windedness, and I would really be thankful for any help anyone could give.
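
To make the matrix layout I am describing concrete, here is a minimal sketch in
base R.  It uses made-up numbers rather than real chip data: pm_mat and
probe_set are hypothetical stand-ins for what I assume pm() and probeNames()
from the affy package would give me (probe intensities with arrays as columns,
and the probe set name for each probe row).  The helper pad_set() and the
column naming scheme are my own inventions for illustration.  It only shows
the NA-padding idea and is not a tested recipe for limma.

```r
# Toy stand-in for PM intensities: rows are probes, columns are arrays.
# With affy these would presumably come from pm(abatch) and
# probeNames(abatch); the values below are made up.
pm_mat <- matrix(
  c(10, 12, 11, 20, 22, 30, 31, 29, 33,
    11, 13, 10, 21, 25, 28, 30, 31, 35),
  ncol = 2,
  dimnames = list(NULL, c("array1", "array2"))
)
probe_set <- c("A", "A", "A", "B", "B", "C", "C", "C", "C")

# Row indices of the probes belonging to each probe set.
rows_by_set <- split(seq_along(probe_set), probe_set)
max_probes <- max(sapply(rows_by_set, length))

# Build one row per probe set: all probes of array1, then all probes of
# array2, with each array's block padded with NA up to max_probes.
pad_set <- function(idx) {
  block <- matrix(NA_real_, nrow = max_probes, ncol = ncol(pm_mat))
  block[seq_along(idx), ] <- pm_mat[idx, , drop = FALSE]
  as.vector(block)  # column-major: array1 probes first, then array2
}
wide <- t(sapply(rows_by_set, pad_set))
colnames(wide) <- paste(rep(colnames(pm_mat), each = max_probes),
                        rep(seq_len(max_probes), times = ncol(pm_mat)),
                        sep = ".p")

print(wide)
```

Each row of wide is one probe set and each column is one probe position on one
array, so shorter probe sets end in NA.  Whether limma handles such a matrix
sensibly is exactly what I am unsure about.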

Thank You,

David Young (from UNR)