On Fri, Jun 11, 2010 at 8:12 AM, Hernando Martínez <hernybiotec@gmail.com> wrote:

> Hello everyone, my name is Hernando, and I am new to R. I have a little
> problem that maybe you can help me with, as I have been looking through the
> packages with no success, and it shouldn't be very difficult to solve.
> I have a text file containing a list of genes, with expression values for
> each along a set of microarray experiments. Ex:
>
> geneID   sample1   sample2   ....
> gene1    45        58        ....
> gene1    43        63        ....
> gene2    32        21        ....
> ....     ....      ....      ....
>
> In this list, some genes are repeated, but with different values (as in
> the example). These repetitions come from different probes targeting the
> same gene.
> What I want is a new text file, but with each gene appearing only once, and
> with three possibilities for the expression values of repeated genes:
>
> - Each value (for each sample column) is the average of the repeated
> values (in the example, gene1 should be 44 in sample1 and 60.5 in
> sample2).
> - Instead of the average, the median.
> - The highest values.
>
> I would prefer the median or the average, but I don't know if getting the
> highest values is easier.
>
> I have seen the function "findLargest" in the "genefilter" package, but it
> works with probes, and my files are already converted to gene IDs.
>
>
Hi, Hernando.  Have a look at the aggregate() function.
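
As a rough sketch of how that might look, assuming the file is read into a
data frame whose first column is geneID and whose remaining columns are
numeric samples (the toy data, object names, and file names below are made
up for illustration):

```r
# Toy data mirroring the example in the question (column names are assumptions).
expr <- data.frame(
  geneID  = c("gene1", "gene1", "gene2"),
  sample1 = c(45, 43, 32),
  sample2 = c(58, 63, 21),
  stringsAsFactors = FALSE
)

# With a real file, something like this would read it (file name is hypothetical):
# expr <- read.delim("expression.txt", stringsAsFactors = FALSE)

# Collapse repeated genes: one row per geneID, mean of every sample column.
by_mean <- aggregate(expr[, -1], by = list(geneID = expr$geneID), FUN = mean)

# Swap FUN for median or max to get the other two summaries.
by_median <- aggregate(expr[, -1], by = list(geneID = expr$geneID), FUN = median)
by_max    <- aggregate(expr[, -1], by = list(geneID = expr$geneID), FUN = max)

# Write the result back out as a tab-delimited text file (hypothetical name):
# write.table(by_mean, "expression_by_gene.txt", sep = "\t",
#             quote = FALSE, row.names = FALSE)
```

So the average is no harder than the maximum; only FUN changes. If the real
data contain missing values, passing na.rm = TRUE as an extra argument to
aggregate() should forward it to FUN.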

Sean
