[BioC] subset in XPS

Zhibin Lu zhbluweb at hotmail.com
Wed Jul 2 22:52:40 CEST 2008


Dear Christian,

I tried both methods and both of them worked well!

You may already know about this problem: when I loaded CEL files under Mac OS X 10.5/R 2.7.1/BioC 2.2 with the command
Data=import.data(scheme, "Data", celdir=".", celfiles=files)
it was very slow, and I also got the warning message "(WARNING: partial output only, ask package author to use Rprintf instead!)".
But it was fine when I ran the same command under Linux.

RMA normalization takes a lot of time. I know I can save the result with save.image() and use load() to continue the work next time, but just out of curiosity, is there a way to load an ExprTreeSet from a ROOT file, just as one can load a SchemeTreeSet and a DataTreeSet?
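
For reference, a minimal sketch of the save/load workaround mentioned above, storing only the one object instead of the whole workspace. The data frame is a stand-in for the real ExprTreeSet and the file name is made up; it assumes the object serializes like an ordinary R object, which the save.image() approach already relies on.

```r
# Stand-in for the normalized result; in a real session this would be the
# ExprTreeSet "data.rma" (assumption: it can be saved like any R object).
data.rma <- data.frame(UNIT_ID = 1:3, TestA1 = c(7.1, 8.4, 6.9))

# Hypothetical file name, placed in the session's temporary directory.
rda <- file.path(tempdir(), "data_rma.RData")

save(data.rma, file = rda)   # writes only this object, not the whole workspace

rm(data.rma)                 # simulate starting a new session
load(rda)                    # restores the object under its original name
str(data.rma)
```

Compared with save.image(), this keeps the saved file limited to the object that is expensive to recompute.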

Thanks so much for your help,

Zhibin

> Date: Mon, 30 Jun 2008 21:29:23 +0200
> From: cstrato at aon.at
> To: zhbluweb at hotmail.com
> CC: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] subset in XPS
>
> Dear Zhibin
>
> Meanwhile, I have uploaded a new version to BioC devel:
> http://bioconductor.org/packages/2.3/bioc/html/xps.html
> which simplifies your request as follows:
>
> 1. get expression values
>> value <- exprs(data.rma)
> 2. select treenames of choice (no extension necessary)
>> treenames <- c("TestA2", "TestB1")
> 3. make a copy of your object if you do not want to replace it
>> sub.rma <- data.rma
> 4. replace slot data with subset
>> exprs(sub.rma, treenames) <- value
> 5. check if the new ExprTreeSet is correct:
>> str(sub.rma)
>
> Best regards
> Christian
>
>
> Zhibin Lu wrote:
>> Dear Christian,
>>
>> Thanks so much for such a detailed explanation. I will try this when I
>> get to work next week, and I do not see why I could not follow the
>> directions.
>>
>> Thanks again and have a nice weekend,
>>
>> Zhibin
>>
>>> Date: Sat, 28 Jun 2008 15:46:26 +0200
>>> From: cstrato at aon.at
>>> To: zhbluweb at hotmail.com
>>> CC: bioconductor at stat.math.ethz.ch
>>> Subject: Re: [BioC] subset in XPS
>>>
>>> Dear Zhibin
>>>
>>> Since you have already done RMA you have now an ExprTreeSet,
>>> called e.g. "data.rma". You can see the structure with:
>>>> str(data.rma)
>>>
>>> Since there is currently no direct way to use only a
>>> subset of an ExprTreeSet, you can create a new object of
>>> class ExprTreeSet in the following way:
>>>
>>> 1. Make a subset of slot "data" which is a dataframe
>>> (assuming that you want to use samples 1,2,3,7,8,9):
>>>> subdata <- exprs(data.rma)
>>>> subdata <- subdata[,c(1:2,3:5, 9:11)]
>>> Please note that it is important to keep the first
>>> two columns.
>>>
>>> 2. Create a copy "sub.rma" of the object "data.rma"
>>>> sub.rma <- data.rma
>>>
>>> 3. Replace slot "data" with "subdata":
>>>> exprs(sub.rma) <- subdata
>>>
>>> For the moment you need to replace slots "treenames" and
>>> "numtrees", too, which I will change in the future to be
>>> done automatically.
>>>
>>> 4. Replace slot "treenames" with the names of your subset:
>>> a, create list containing the sub samples
>>>> subtrees <- unlist(treeNames(data.rma))
>>>> subtrees <- as.list(subtrees[c(1:3,7:9)])
>>> b, check if the names are correct:
>>>> subtrees
>>> c, replace slot "treenames":
>>>> sub.rma@treenames <- subtrees
>>>
>>> 5. Replace slot "numtrees" with the number of subsamples
>>>> sub.rma@numtrees <- length(subtrees)
>>>
>>> 6. Check if the new ExprTreeSet is correct:
>>>> str(sub.rma)
>>>
>>> Now you can use the new ExprTreeSet "sub.rma" as input for
>>> method unifilter:
>>>> rma.ufr <- unifilter(sub.rma, .......)
>>>
>>>
>>> If you want to take advantage of the advanced capabilities
>>> of package "limma", then you can create a Biobase class
>>> "ExpressionSet" containing only your 6 samples as described
>>> in Appendix A.3 of the vignette xps.pdf:
>>>
>>> 1. extract the normalized expression data:
>>>> subdata <- validData(data.rma)
>>>
>>> 2. Since "subdata" is a dataframe, simply create a subframe:
>>>> subdata <- subdata[,c(1:3,7:9)]
>>>
>>> 3. Create a Biobase class "ExpressionSet", called "subset"
>>>> subset <- new("ExpressionSet", exprs = as.matrix(subdata))
>>>
>>> Now you have an ExpressionSet ready for use with "limma".
>>>
>>> Please let me know if you succeeded with this info.
>>>
>>> Best regards
>>> Christian
>>> _._._._._._._._._._._._._._._._
>>> C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
>>> V.i.e.n.n.a A.u.s.t.r.i.a
>>> e.m.a.i.l: cstrato at aon.at
>>> _._._._._._._._._._._._._._._._
>>>
>>> Zhibin Lu wrote:
>>>> Hi,
>>>>
>>>> I am new to R/Bioconductor. I am using the xps package to analyze
>>>> Affymetrix Gene ST 1.0 data. After I load the CEL files into a
>>>> DataTreeSet and compute the expression levels with RMA, can I work on a
>>>> subset of the data? Say I have 12 samples. After RMA, can I work on just
>>>> 6 of them, divide them into two groups, and apply UniFilter to
>>>> just these 6?
>>>>
>>>> Thanks,
>>>>
>>>> Zhibin

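The column arithmetic in the quoted step 1 (keep the first two columns, then pick samples 1,2,3,7,8,9) can be sketched on a toy data frame. The column names "UNIT_ID", "UnitName" and "Sample1".."Sample12" are invented for this example and only stand in for what exprs(data.rma) returns; the sketch assumes two leading annotation columns followed by one column per sample, as described in the thread.

```r
# Toy stand-in for exprs(data.rma): two annotation columns followed by
# twelve sample columns. All names here are hypothetical.
set.seed(1)
value <- data.frame(UNIT_ID  = 1:4,
                    UnitName = paste0("probeset", 1:4),
                    matrix(rnorm(4 * 12), nrow = 4,
                           dimnames = list(NULL, paste0("Sample", 1:12))))

samples <- c(1:3, 7:9)                    # samples to keep

# Shift the sample indices by 2 to skip the annotation columns; this gives
# the same columns as c(1:2, 3:5, 9:11) in the quoted instructions.
subdata <- value[, c(1:2, samples + 2)]

colnames(subdata)
```

Writing the selection as "samples + 2" keeps the sample numbers visible, which makes it harder to drop the first two columns by accident.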


