[BioC] aggregate_summarizing expression values over entrez gene ids
James W. MacDonald
jmacdon at med.umich.edu
Thu Nov 13 14:37:11 CET 2008
Hi Vanessa,
Vanessa Vermeirssen wrote:
> Hi,
>
> I have a dataframe containing RMA normalized and summarized expression
> values for affymetrix probesets, av.data.
> I have looked up the Entrez gene ids for the probesets in the annotation
> package, entrezids.
> Multiple probesets map of course to the same entrez id and I would like
> to combine these data into one row,
> by averaging the expression values for the same entrez ids over the
> different experiments.
> I tried the function "aggregate" to do this, but somehow it gives an
> error that the arguments are not of the same length, but they are...???
> How can I solve this or is there any other way to do this?
>
> See my code below...
>
> av.data <- read.table("humanGPL570avdata.txt", row.names = 1, sep =
> "\t", header = T, na.strings = "NA", fill = T)
> av.data[1:5,1:5]
>           X1_Schwann_p1 X1_Schwann_p3 X2_accumbens X2_adipose
> 1007_s_at      9.281857      9.340795     9.151775   8.319741
> 1053_at        7.000684      6.867318     4.633061   5.101534
> 117_at         6.007608      6.124562     5.425565   5.692270
> 121_at         6.543294      6.728119     7.651856   7.692947
> 1255_g_at      3.077289      2.989938     4.622865   2.955812
>           X2_adipose_omental
> 1007_s_at           7.909480
> 1053_at             4.509407
> 117_at              6.298798
> 121_at              7.598834
> 1255_g_at           3.040816
>
> probes <- ls(hgu133plus2ENTREZID)
> entrezids <- unlist(mget(probes,hgu133plus2ENTREZID))
> newdata <- data.frame(entrezids,av.data)
>
> sum <- aggregate(av.data,as.list(entrezids),mean)
> Error in FUN(X[[1L]], ...) : arguments must have same length
The problem here is that aggregate() needs its 'by' argument to be a list
of vectors, each as long as dim(av.data)[1] (the number of rows). What you
have given is a list of vectors, each of length one.
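A minimal sketch of that length requirement, using a small made-up data frame. The names toy and ids are hypothetical stand-ins for av.data and entrezids, not your real data:

```r
# Hypothetical toy data: 4 probesets, 2 samples
toy <- data.frame(s1 = c(1, 3, 5, 7), s2 = c(2, 4, 6, 8),
                  row.names = c("p1", "p2", "p3", "p4"))
# Hypothetical probeset -> Entrez Gene ID mapping (p1 and p2 share an ID)
ids <- c(p1 = "10", p2 = "10", p3 = "20", p4 = "30")

# Every element of 'by' should have nrow(toy) entries
by_bad   <- as.list(ids)
lens_bad <- sapply(by_bad, length)    # each element has length 1
all(lens_bad == nrow(toy))            # FALSE, so the lengths do not match

# The call therefore stops with an error; try() captures it
res <- try(aggregate(toy, by_bad, mean), silent = TRUE)
inherits(res, "try-error")            # TRUE
```

Checking sapply(by, length) against nrow() of the data frame is a quick way to see whether the grouping list has the shape aggregate() expects.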
The difference is between list() and as.list(). If you use
list(entrezids), you will get a list of length one, containing a vector
of length 54675.
If you use as.list(entrezids) you get a list of length 54675, each item
containing one Entrez Gene ID.
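The difference can be sketched on a small made-up example. Here toy and ids are hypothetical stand-ins for av.data and entrezids, and the sketch assumes the IDs are in the same order as the rows of the data frame:

```r
# Hypothetical toy data: 4 probesets, 2 samples
toy <- data.frame(s1 = c(1, 3, 5, 7), s2 = c(2, 4, 6, 8),
                  row.names = c("p1", "p2", "p3", "p4"))
# Hypothetical probeset -> Entrez Gene ID mapping (p1 and p2 share an ID)
ids <- c(p1 = "10", p2 = "10", p3 = "20", p4 = "30")

length(list(ids))         # 1: one list element ...
length(list(ids)[[1]])    # 4: ... holding the whole vector
length(as.list(ids))      # 4: one list element per ID

# With list(), 'by' has one grouping vector as long as nrow(toy),
# so aggregate() can average the rows that share an ID
avg <- aggregate(toy, list(entrez = ids), mean)
avg
```

The result should have one row per distinct ID, with the two rows for ID "10" averaged in each sample column.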
Does this make sense?
Best,
Jim
>
> > length(as.list(entrezids))
> [1] 54675
> > dim(av.data)
> [1] 54675 69
>
> sumdata <- aggregate(newdata,as.list(newdata$entrezids),mean)
> Error in FUN(X[[1L]], ...) : arguments must have same length
> > length(as.list(newdata$entrezids))
> [1] 54675
> > dim(newdata)
> [1] 54675 70
>
>
> Thank you so much!
> Vanessa
>
--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662