[BioC] Dist of exprSet object
Marco Blanchette
mblanche at berkeley.edu
Thu Jul 27 19:54:27 CEST 2006
Can't find any info on rawMeans:
> ?rawMeans
No documentation for 'rawMeans' in specified packages and libraries:
you could try 'help.search("rawMeans")'
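
For reference, the function named in the quoted reply below is spelled rowMeans (with an "o"). It is part of base R, so ?rowMeans should find its help page. Here is a minimal sketch of averaging replicate columns with it. It uses a made-up numeric matrix as a stand-in for exprs(eset), and it assumes that arrays 1-2 and arrays 3-4 are the replicate pairs, which the thread does not actually state.

```r
# Toy stand-in for exprs(eset): 6 genes x 4 arrays (values are made up)
set.seed(1)
m <- matrix(rnorm(24, mean = 8), nrow = 6,
            dimnames = list(paste0("gene", 1:6), paste0("array", 1:4)))

# ASSUMPTION: arrays 1-2 are replicates of experiment 1,
# arrays 3-4 are replicates of experiment 2
exp1 <- rowMeans(m[, 1:2])  # per-gene mean over the first replicate pair
exp2 <- rowMeans(m[, 3:4])  # per-gene mean over the second replicate pair
```

Both results are named numeric vectors with one entry per gene, so the gene names travel with the averages.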
On 7/27/06 2:00 AM, "Wolfgang Huber" <huber at ebi.ac.uk> wrote:
> Hi Marco,
>
> 1) have a look at "rowMeans"
>
> 2) have a look at the functions "cut" and "split"
>
> x = rnorm(100)
> ct = cut(rank(x), 10)
> sp = split(x, ct)
> boxplot(sp)
>
>
> Cheers
> Wolfgang
>
>> Hum... This exemplifies the love-hate relationship I have with R... Very
>> powerful, but very difficult to master...
>>
>> One more issue. Each experiment is in duplicate (2 experiments, 2
>> replicates -> 4 arrays). My goal is to partition the genes by expression
>> level into the top 10% most expressed, the 10% to 20% most expressed, the
>> 20% to 30% most expressed, and so on.
>>
>> eset is my exprSet object containing the rma computed expression for each
>> gene on the 4 arrays:
>> > eset
>> Expression Set (exprSet) with
>> 18952 genes
>> 4 samples
>> phenoData object with 1 variables and 4 cases
>> varLabels
>> sample: arbitrary numbering
>>
>> So I need to:
>>
>> 1) Get the average expression for each gene from the 2 replicates
>> Would you do:
>> > exp1 = iter(eset[,1,2], , mean)
>> > exp2 = iter(eset[,2,3], , mean)
>>
>> Or is there a better way?
>>
>> 2) Break down the distribution per 10% bin as in
>> > top10 = geneNames(eset)[(rank(exp1) >= 0*(length(exp1)/10) &
>>     rank(exp1) < 1*(length(exp1)/10))]
>> > top10_20 = geneNames(eset)[(rank(exp1) >= 1*(length(exp1)/10) &
>>     rank(exp1) < 2*(length(exp1)/10))]
>> > top20_30 = geneNames(eset)[(rank(exp1) >= 2*(length(exp1)/10) &
>>     rank(exp1) < 3*(length(exp1)/10))]
>>
>> Or is there a better way? [I'm pretty sure there is a more elegant R way than
>> that...]
>>
>> Many thanks folks
>>
>> Cheers,
>>
>> Marco
>>
>>
>> On 7/26/06 4:05 PM, "Ben Bolstad" <bmb at bmbolstad.com> wrote:
>>
>>> Actually you need affyPLM loaded to boxplot an exprSet. affy only
>>> provides the method for AffyBatch objects. Otherwise your example is
>>> correct.
>>>
>>> Best,
>>>
>>> Ben
>>>
>>>
>>> eg .....
>>>
>>> > library(affy)
>>> Loading required package: Biobase
>>> Loading required package: tools
>>>
>>> Welcome to Bioconductor
>>>
>>> Vignettes contain introductory material.
>>> To view, simply type 'openVignette()' or start with 'help(Biobase)'.
>>> For details on reading vignettes, see the openVignette help page.
>>>
>>> Loading required package: affyio
>>> > library(affydata)
>>> > data(Dilution)
>>> > eset <- rma(Dilution)
>>> Background correcting
>>> Normalizing
>>> Calculating Expression
>>> > boxplot(eset) # throws error
>>> Error in boxplot.default(eset) : invalid first argument
>>> > library(affyPLM)
>>> Loading required package: gcrma
>>> Loading required package: matchprobes
>>> > boxplot(eset) # works fine
>>>
>>> On Thu, 2006-07-27 at 10:58 +1200, Marcus Davy wrote:
>>>> P 17 of the vignette("affy").
>>>>
>>>> e.g.
>>>>
>>>> chipCols <- rainbow(ncol(exprs(affybatch.example)))
>>>> boxplot(affybatch.example, col=chipCols)
>>>>
>>>> Marcus
>>>>
>>>>
>>>> On 7/27/06 10:40 AM, "Marco Blanchette" <mblanche at berkeley.edu> wrote:
>>>>
>>>>> Thank you all,
>>>>>
>>>>> Using biocLite to download the annotation fixed the problem.
>>>>>
>>>>> Now I am getting into a simpler R problem. I have an exprSet object of 4
>>>>> arrays:
>>>>> > eset
>>>>> Expression Set (exprSet) with
>>>>> 18952 genes
>>>>> 4 samples
>>>>> phenoData object with 1 variables and 4 cases
>>>>> varLabels
>>>>> sample: arbitrary numbering
>>>>>
>>>>> My goal is to draw a boxplot of the 4 different samples. Surely I can do:
>>>>> > boxplot(exprs(eset)[,1], exprs(eset)[,2], exprs(eset)[,3],
>>>>>     exprs(eset)[,4], col=c(2,3,4,5))
>>>>>
>>>>> But is there an easier way to do it without having to subscript each
>>>>> individual column? [Right now I have only 4, but when I have 20, I'll
>>>>> get bored quite rapidly.]
>>>>>
>>>>> Sorry if this sounds easy, I am still learning the basics of R.
>>>>>
>>>>> Marco
>>>>> ______________________________
>>>>> Marco Blanchette, Ph.D.
>>>>>
>>>>> mblanche at uclink.berkeley.edu
>>>>>
>>>>> Donald C. Rio's lab
>>>>> Department of Molecular and Cell Biology
>>>>> 16 Barker Hall
>>>>> University of California
>>>>> Berkeley, CA 94720-3204
>>>>>
>>>>> Tel: (510) 642-1084
>>>>> Cell: (510) 847-0996
>>>>> Fax: (510) 642-6062
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
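
The cut/split suggestion quoted above can be adapted to return gene names per 10% bin. What follows is only a sketch under stated assumptions: avg is a made-up named vector standing in for the per-gene replicate averages, and the names r, bin and deciles are illustrative, not from the thread. Note that rank() gives rank 1 to the smallest value, so the values are negated here to put the most expressed genes in the first bin.

```r
# Toy stand-in for per-gene average expression (values are made up)
set.seed(2)
avg <- setNames(rnorm(100, mean = 8), paste0("gene", 1:100))

# Rank so that 1 = most expressed; ties.method = "first" keeps ranks unique
r <- rank(-avg, ties.method = "first")

# Ten equal-width rank intervals: bin 1 = top 10%, bin 2 = 10% to 20%, ...
bin <- cut(r, breaks = seq(0, length(avg), length.out = 11), labels = FALSE)

# List of gene-name vectors, one element per 10% bin
deciles <- split(names(avg), bin)
```

deciles[["1"]] then holds the names of the top 10% most expressed genes, deciles[["2"]] the next 10%, and so on. Splitting the values instead of the names, as in split(avg, bin), gives a list that can be passed straight to boxplot().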