[BioC] Dist of exprSet object

Marco Blanchette mblanche at berkeley.edu
Thu Jul 27 19:54:27 CEST 2006


Can't find any info on rawMeans:

> ?rawMeans
No documentation for 'rawMeans' in specified packages and libraries:
you could try 'help.search("rawMeans")'
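
For reference, the function suggested in the quoted reply below is spelled rowMeans (with an "o"), which is why the lookup above finds nothing. What follows is only a minimal sketch of the two steps discussed in the thread: averaging replicates per gene, then splitting genes into ten rank-based bins with cut and split. It uses a simulated matrix as a stand-in for exprs(eset). The pairing of columns 1-2 and 3-4 as replicates is an assumption, since the thread does not say which arrays belong together, and the names m, decile and bins are illustrative.

```r
# Simulated stand-in for exprs(eset): 1000 genes x 4 arrays.
# ASSUMPTION: columns 1-2 are replicates of experiment 1,
# columns 3-4 are replicates of experiment 2.
set.seed(1)
m <- matrix(rnorm(4000, mean = 8), ncol = 4,
            dimnames = list(paste0("gene", 1:1000), paste0("array", 1:4)))

# Step 1: per-gene average over the replicates (note the spelling: rowMeans).
exp1 <- rowMeans(m[, 1:2])
exp2 <- rowMeans(m[, 3:4])

# Step 2: ten bins by rank. rank(-x) gives rank 1 to the most expressed
# gene, so bin 1 holds the top 10%, bin 2 the next 10%, and so on.
decile <- cut(rank(-exp1, ties.method = "first"), breaks = 10, labels = FALSE)
bins <- split(names(exp1), decile)

top10    <- bins[["1"]]   # top 10% most expressed
top10_20 <- bins[["2"]]   # 10% to 20% most expressed
```

With a real exprSet, replacing the simulated matrix by m <- exprs(eset) should be the only change needed, provided the replicate columns are adjusted to match the actual array order.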


On 7/27/06 2:00 AM, "Wolfgang Huber" <huber at ebi.ac.uk> wrote:

> Hi Marco,
> 
> 1) have a look at "rowMeans"
> 
> 2) have a look at the functions "cut" and "split"
> 
>  x = rnorm(100)
>  ct = cut(rank(x), 10)
>  sp = split(x, ct)
>  boxplot(sp)
> 
> 
> Cheers
> Wolfgang
> 
>> Hum... This exemplifies the love-hate relationship I have with R... Very
>> powerful, but very difficult to master...
>> 
>> One more issue. Each experiment is in duplicate (2 experiments, 2
>> replicates -> 4 arrays). My goal is to partition the genes by expression
>> into the top 10% most expressed, 10% to 20% most expressed, 20% to 30% most
>> expressed, and so on.
>> 
>> eset is my exprSet object containing the RMA-computed expression values for
>> each gene on the 4 arrays:
>>> eset 
>> Expression Set (exprSet) with
>>         18952 genes
>>         4 samples
>>                  phenoData object with 1 variables and 4 cases
>>          varLabels
>>                 sample: arbitrary numbering
>> 
>> So I need to:
>> 
>> 1) Get the average expression for each gene from the 2 replicates
>> Would you do:
>>> exp1 = iter(eset[,1,2], , mean)
>>> exp2 = iter(eset[,2,3], , mean)
>> 
>> Or is there a better way?
>> 
>> 2) Break down the distribution per 10% bin as in
>>> top10 = geneNames(eset)[(rank(exp1) >= 0*(length(exp1)/10) & rank(exp1) <
>> 1*(length(exp1)/10))]
>>> top10_20 = geneNames(eset)[(rank(exp1) >= 1*(length(exp1)/10) & rank(exp1) <
>> 2*(length(exp1)/10))]
>>> top20_30 = geneNames(eset)[(rank(exp1) >= 2*(length(exp1)/10) & rank(exp1) <
>> 3*(length(exp1)/10))]
>> 
>> Or is there a better way? [I'm pretty sure there is a more elegant R way than
>> that...]
>> 
>> Many thanks folks
>> 
>> Cheers,
>> 
>> Marco
>> 
>> 
>> On 7/26/06 4:05 PM, "Ben Bolstad" <bmb at bmbolstad.com> wrote:
>> 
>>> Actually you need affyPLM loaded to boxplot an exprSet. affy only
>>> provides the method for AffyBatch objects. Otherwise your example is
>>> correct.
>>> 
>>> Best,
>>> 
>>> Ben 
>>> 
>>> 
>>> eg .....
>>> 
>>>> library(affy)
>>> Loading required package: Biobase
>>> Loading required package: tools
>>> 
>>> Welcome to Bioconductor
>>> 
>>> 
>>>     Vignettes contain introductory material.
>>> 
>>>     To view, simply type 'openVignette()' or start with 'help(Biobase)'.
>>> 
>>>     For details on reading vignettes, see the openVignette help page.
>>> 
>>> 
>>> Loading required package: affyio
>>>> library(affydata)
>>>> data(Dilution)
>>>> eset <- rma(Dilution)
>>> Background correcting
>>> Normalizing
>>> Calculating Expression
>>>> boxplot(eset) # throws error
>>> Error in boxplot.default(eset) : invalid first argument
>>>> library(affyPLM)
>>> Loading required package: gcrma
>>> Loading required package: matchprobes
>>>> boxplot(eset) #works fine.
>>> 
>>> On Thu, 2006-07-27 at 10:58 +1200, Marcus Davy wrote:
>>>> P 17 of the vignette("affy").
>>>> 
>>>> e.g.
>>>> 
>>>> chipCols <- rainbow(ncol(exprs(affybatch.example)))
>>>> boxplot(affybatch.example, col=chipCols)
>>>> 
>>>> Marcus
>>>> 
>>>> 
>>>> On 7/27/06 10:40 AM, "Marco Blanchette" <mblanche at berkeley.edu> wrote:
>>>> 
>>>>> Thank you all,
>>>>> 
>>>>> Using biocLite to download the annotation fixed the problem.
>>>>> 
>>>>> Now, I am getting into a simpler R problem. I have an exprSet object of 4
>>>>> arrays:
>>>>>> eset
>>>>> Expression Set (exprSet) with
>>>>>         18952 genes
>>>>>         4 samples
>>>>>                  phenoData object with 1 variables and 4 cases
>>>>>          varLabels
>>>>>                 sample: arbitrary numbering
>>>>> 
>>>>> My goal is to draw a boxplot of the 4 different samples. Surely I can do:
>>>>>> boxplot(exprs(eset)[,1], exprs(eset)[,2], exprs(eset)[,3],
>>>>>         exprs(eset)[,4], col=c(2,3,4,5))
>>>>> 
>>>>> But is there an easier way to do this without having to subscript each
>>>>> individual column? [right now I have only 4, but when I have 20, I'll
>>>>> get bored quite rapidly]
>>>>> 
>>>>> Sorry if this sounds easy, I am still learning the basics of R
>>>>> 
>>>>> Marco
>>>>> ______________________________
>>>>> Marco Blanchette, Ph.D.
>>>>> 
>>>>> mblanche at uclink.berkeley.edu
>>>>> 
>>>>> Donald C. Rio's lab
>>>>> Department of Molecular and Cell Biology
>>>>> 16 Barker Hall
>>>>> University of California
>>>>> Berkeley, CA 94720-3204
>>>>> 
>>>>> Tel: (510) 642-1084
>>>>> Cell: (510) 847-0996
>>>>> Fax: (510) 642-6062
>>>> 
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> 
> 


Marco Blanchette, Ph.D.

mblanche at berkeley.edu

Donald C. Rio's lab
Department of Molecular and Cell Biology
16 Barker Hall
University of California
Berkeley, CA 94720-3204

Tel: (510) 642-1084
Cell: (510) 847-0996
Fax: (510) 642-6062


