[BioC] Dist of exprSet object

Wolfgang Huber huber at ebi.ac.uk
Thu Jul 27 11:00:01 CEST 2006


Hi Marco,

1) have a look at "rowMeans"

2) have a look at the functions "cut" and "split"

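 x = rnorm(100)          # example data
 ct = cut(rank(x), 10)   # 10 equal-width bins on the ranks, i.e. 10% bins
 sp = split(x, ct)       # list with the values of x falling in each bin
 boxplot(sp)             # one box per bin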
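
Applied to your exprSet, the two hints might combine roughly as in the sketch below. It is only an illustration under assumptions: I assume arrays 1-2 and 3-4 are the replicate pairs, and I use a simulated matrix "e" as a stand-in for exprs(eset) (the gene names g1, g2, ... are made up). Note that rank() is ascending, so the last bin holds the most highly expressed genes.

```r
set.seed(1)
# Stand-in for exprs(eset): 1000 hypothetical genes x 4 arrays
e = matrix(rnorm(4000, mean = 8), nrow = 1000,
           dimnames = list(paste("g", 1:1000, sep = ""), NULL))

# 1) average the replicates (assumption: columns 1-2 and 3-4 are the pairs)
exp1 = rowMeans(e[, 1:2])
exp2 = rowMeans(e[, 3:4])

# 2) ten bins on the ranks; labels = FALSE returns the bin number 1..10,
#    with bin 10 holding the top 10% most expressed genes
ct    = cut(rank(exp1), 10, labels = FALSE)
genes = split(names(exp1), ct)   # gene names in each bin
top10 = genes[[10]]              # names of the top 10% for experiment 1
```

With the real data you would presumably start from e = exprs(eset), so that the names are your probe set IDs, and boxplot(split(exp1, ct)) would show the distribution within each bin.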


Cheers
Wolfgang

> Hmm... This exemplifies the love-hate relationship I have with R: very
> powerful, but very difficult to master...
> 
> One more issue. Each experiment was done in duplicate (2 experiments, 2
> replicates -> 4 arrays). My goal is to partition the genes by expression
> level: the top 10% most expressed, the 10% to 20% most expressed, the 20% to
> 30% most expressed, and so on.
> 
> eset is my exprSet object containing the RMA-computed expression for each
> gene on the 4 arrays:
>> eset 
> Expression Set (exprSet) with
>         18952 genes
>         4 samples
>                  phenoData object with 1 variables and 4 cases
>          varLabels
>                 sample: arbitrary numbering
> 
> So I need to:
> 
> 1) Get the average expression for each gene from the 2 replicates
> Would you do:
>> exp1 = iter(eset[,1:2], , mean)
>> exp2 = iter(eset[,3:4], , mean)
> 
> Or is there a better way?
> 
> 2) Break down the distribution per 10% bin as in
>> top10 = geneNames(eset)[(rank(-exp1) >= 0*(length(exp1)/10) & rank(-exp1) <
> 1*(length(exp1)/10))]
>> top10_20 = geneNames(eset)[(rank(-exp1) >= 1*(length(exp1)/10) & rank(-exp1) <
> 2*(length(exp1)/10))]
>> top20_30 = geneNames(eset)[(rank(-exp1) >= 2*(length(exp1)/10) & rank(-exp1) <
> 3*(length(exp1)/10))]
> 
> Or is there a better way? [I'm pretty sure there is a more elegant R way
> than that...]
> 
> Many thanks, folks
> 
> Cheers,
> 
> Marco
> 
> 
> On 7/26/06 4:05 PM, "Ben Bolstad" <bmb at bmbolstad.com> wrote:
> 
>> Actually you need affyPLM loaded to boxplot an exprSet. affy only
>> provides the method for AffyBatch objects. Otherwise your example is
>> correct.
>>
>> Best,
>>
>> Ben 
>>
>>
>> eg .....
>>
>>> library(affy)
>> Loading required package: Biobase
>> Loading required package: tools
>>
>> Welcome to Bioconductor
>>
>>
>>     Vignettes contain introductory material.
>>
>>     To view, simply type 'openVignette()' or start with 'help(Biobase)'.
>>
>>     For details on reading vignettes, see the openVignette help page.
>>
>>
>> Loading required package: affyio
>>> library(affydata)
>>> data(Dilution)
>>> eset <- rma(Dilution)
>> Background correcting
>> Normalizing
>> Calculating Expression
>>> boxplot(eset) # throws error
>> Error in boxplot.default(eset) : invalid first argument
>>> library(affyPLM)
>> Loading required package: gcrma
>> Loading required package: matchprobes
>>> boxplot(eset) #works fine.
>>
>> On Thu, 2006-07-27 at 10:58 +1200, Marcus Davy wrote:
>>> See p. 17 of vignette("affy").
>>>
>>> e.g.
>>>
>>> chipCols <- rainbow(ncol(exprs(affybatch.example)))
>>> boxplot(affybatch.example, col=chipCols)
>>>
>>> Marcus
>>>
>>>
>>> On 7/27/06 10:40 AM, "Marco Blanchette" <mblanche at berkeley.edu> wrote:
>>>
>>>> Thank you all,
>>>>
>>>> Using biocLite to download the annotation fixed the problem.
>>>>
>>>> Now I am getting into a simpler R problem. I have an exprSet object of 4
>>>> arrays:
>>>>> eset
>>>> Expression Set (exprSet) with
>>>>         18952 genes
>>>>         4 samples
>>>>                  phenoData object with 1 variables and 4 cases
>>>>          varLabels
>>>>                 sample: arbitrary numbering
>>>>
>>>> My goal is to draw a boxplot of the 4 different samples. Surely I can do:
>>>>> boxplot(exprs(eset)[,1], exprs(eset)[,2], exprs(eset)[,3],
>>>> exprs(eset)[,4], col=c(2,3,4,5))
>>>>
>>>> But is there an easier way to do this without having to subscript each
>>>> individual column? [right now I have only 4, but when I have 20, I'll
>>>> get bored quite rapidly]
>>>>
>>>> Sorry if this sounds easy, I am still learning the basics of R
>>>>
>>>> Marco
>>>> ______________________________
>>>> Marco Blanchette, Ph.D.
>>>>
>>>> mblanche at uclink.berkeley.edu
>>>>
>>>> Donald C. Rio's lab
>>>> Department of Molecular and Cell Biology
>>>> 16 Barker Hall
>>>> University of California
>>>> Berkeley, CA 94720-3204
>>>>
>>>> Tel: (510) 642-1084
>>>> Cell: (510) 847-0996
>>>> Fax: (510) 642-6062
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> 


-- 
------------------------------------------------------------------
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber


