On Wed, Jul 22, 2009 at 3:00 PM, Alison Waller <alison.waller@utoronto.ca> wrote:

> Dear Bioconductor list,
>
> I am analysing data from a custom Agilent array with 3600 spots using
> Limma.
>
> There are usually 3 probes for each gene (some genes have only one
> probe), and all probes are in duplicate.
>
> I would like to obtain an average M value for each gene.
>
> Examples of the spot IDs are below.
> D137-cbdb_A1587_1
> D137-cbdb_A1587_1
> D137-cbdb_A1587_2
> D137-cbdb_A1587_2
> D137-cbdb_A1587_3
> D137-cbdb_A1587_3
> D138-cbdb_A1594
> D138-cbdb_A1594
>
>
> One option I thought of was to adjust the GAL file to have identical IDs
> for all of the probes for the same gene and then use the avereps() function.
>
> ID      Name
>
> D137    D137-cbdb_A1587_1
> D137    D137-cbdb_A1587_1
> D137    D137-cbdb_A1587_2
> D137    D137-cbdb_A1587_2
> D137    D137-cbdb_A1587_3
> D137    D137-cbdb_A1587_3
> D138    D138-cbdb_A1594
> D138    D138-cbdb_A1594
>
> However, the avereps() function seems more suitable for actual duplicates.
> For probe sets I would like to use a weighted average in which probes whose
> intensities are further from the mean of the probe set are down-weighted
> (for example, the Tukey biweight).
>
> Does anyone have experience with similar arrays, or suggestions for an
> appropriate function?
>

While this sounds like a good idea, it has some significant disadvantages
over keeping the probes separate.  I think most folks would suggest that you
do your analyses at the probe level, as each probe is measuring the same
thing.  So, I would suggest summarizing only to the level of the probe and
not to the level of the gene.
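
If you still want a robust per-gene summary to compare against the
probe-level results, the down-weighting you describe can be sketched as a
one-step Tukey biweight. The following is only an illustration in plain
Python, not a limma function: the names tukey_biweight and
summarize_by_gene are hypothetical, the tuning constants c = 5 and
epsilon = 1e-4 are assumed defaults, the M values are made up, and the
gene ID is assumed to be the part of the spot ID before the first "-"
(as in the ID column of your GAL example). Note that it centres on the
median rather than the mean.

```python
from collections import defaultdict
from statistics import median


def tukey_biweight(values, c=5.0, epsilon=1e-4):
    """One-step Tukey biweight average of a non-empty list of numbers."""
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    centre = median(values)
    # Median absolute deviation: a robust estimate of spread.
    mad = median([abs(v - centre) for v in values])
    weights = []
    for v in values:
        # Scaled distance from the centre; epsilon avoids division by zero.
        u = (v - centre) / (c * mad + epsilon)
        # Bisquare weight: falls smoothly to zero for distant values.
        weights.append((1 - u * u) ** 2 if abs(u) <= 1 else 0.0)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def summarize_by_gene(spot_ids, m_values):
    """Group M values by gene prefix and summarize each group."""
    groups = defaultdict(list)
    for spot_id, m in zip(spot_ids, m_values):
        gene_id = spot_id.split("-", 1)[0]  # e.g. "D137"
        groups[gene_id].append(m)
    return {gene: tukey_biweight(ms) for gene, ms in groups.items()}


if __name__ == "__main__":
    # Spot IDs from the question; the M values are invented for illustration.
    spot_ids = [
        "D137-cbdb_A1587_1", "D137-cbdb_A1587_1",
        "D137-cbdb_A1587_2", "D137-cbdb_A1587_2",
        "D137-cbdb_A1587_3", "D137-cbdb_A1587_3",
        "D138-cbdb_A1594", "D138-cbdb_A1594",
    ]
    m_values = [1.0, 1.1, 0.9, 1.0, 1.05, 5.0, 0.4, 0.5]
    for gene, summary in sorted(summarize_by_gene(spot_ids, m_values).items()):
        print(gene, round(summary, 3))
```

With only three probes per gene the spread estimate is very rough, which
is one more reason to prefer the probe-level analysis.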

Sean
