[BioC] subset in XPS

Thu Jul 3 15:03:23 CEST 2008

Dear Christian,

You were right! When I ran R from Terminal under Mac OS 10.5, the CEL files loaded without any error, and the loading process was quite fast.
Thanks,

Zhibin

> Date: Thu, 3 Jul 2008 00:16:23 +0200
> From: cstrato at aon.at
> To: zhbluweb at hotmail.com
> CC: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] subset in XPS
>
> Dear Zhibin
>
> It is good to know that both methods worked for you.
>
> Regarding your problem with MacOS 10.5 I assume that you are using R.app?
>
> Please note that I do all my development on a MacBook Pro using MacOS
> 10.4.8 and currently R-2.7.1, and command "data<-import.data()" is as
> fast as on Linux w/o any output problems. However, I never use R.app but
> always start R from an xterm!
>
> I have just tested R.app and do not see a slowdown, however, I get some
> strange error messages. Maybe there are even more problems with R.app
> on MacOS 10.5, which I currently do not have.
>
> Since most of the output is from the C++ code, which can be used
> independently of R, I am not able to use Rprintf. I have tested my
> package on MacOS X, Linux and Windows XP, and if you use the command
> line, everything works fine on all three machines.
>
> I would appreciate it if you could try to run your data on your Mac using
> either Apple's Terminal or xterm (for xterm you need to install X11
> from the system CD first) and let me know if you still experience a
> slowdown.
>
> Regarding your second question: Since you can use save.image(), I have
> not yet implemented loading an ExprTreeSet; however, this is on my to-do list.
>
> Best regards
> Christian
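
As an illustration of the save/load workaround mentioned above, here is a minimal base-R sketch. It does not use xps: the data frame "result" is only a stand-in for an expensive result such as an RMA-normalized ExprTreeSet, and the temporary file path is an assumption made for the example.

```r
# Stand-in for an expensive result such as a normalized expression table.
result <- data.frame(name = c("u1", "u2"), level = c(7.5, 8.25))

# save() writes only the named objects; save.image() would write the
# whole workspace instead.
path <- tempfile(fileext = ".RData")
save(result, file = path)

# In a later session: load() restores objects under their original names.
# Loading into a fresh environment avoids overwriting existing objects.
env <- new.env()
loaded <- load(path, envir = env)
restored <- get("result", envir = env)

print(loaded)
print(identical(restored, result))
```

Saving a single named object with save() keeps the file small compared with save.image(), at the cost of having to remember which objects were stored.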
>
>
> Zhibin Lu wrote:
>> Dear Christian,
>>
>> I tried both methods and both of them worked well!
>>
>> Maybe you already know about this problem. When I loaded CEL files under Mac OS 10.5/R 2.7.1/BioC 2.2 with the command
>> Data=import.data(scheme, "Data", celdir=".", celfiles=files)
>> it was very very slow and I also got a warning message "(WARNING: partial output only, ask package author to use Rprintf instead!)".
>> But it was fine when I ran the same command under linux.
>>
>> RMA normalization takes a lot of time. I know I can save the result using save.image() and use load() to continue the work next time, but just out of curiosity, is there a way to load an ExprTreeSet from a root file, just as one can load a SchemeTreeSet and a DataTreeSet?
>>
>> Thanks so much for your help,
>>
>> Zhibin
>>
>>
>>> Date: Mon, 30 Jun 2008 21:29:23 +0200
>>> From: cstrato at aon.at
>>> To: zhbluweb at hotmail.com
>>> CC: bioconductor at stat.math.ethz.ch
>>> Subject: Re: [BioC] subset in XPS
>>>
>>> Dear Zhibin
>>>
>>> Meanwhile, I have uploaded a new version to BioC devel:
>>> http://bioconductor.org/packages/2.3/bioc/html/xps.html
>>> which simplifies your request as follows:
>>>
>>> 1. get expression values
>>>
>>> > value <- exprs(data.rma)
>>>
>>> 2. select treenames of choice (no extension necessary)
>>>
>>> > treenames <- c("TestA2", "TestB1")
>>>
>>> 3. make a copy of your object if you do not want to replace it
>>>
>>> > sub.rma <- data.rma
>>>
>>> 4. replace slot data with subset
>>>
>>> > exprs(sub.rma, treenames) <- value
>>>
>>> 5. check if the new ExprTreeSet is correct:
>>>
>>> > str(sub.rma)
>>>
>>> Best regards
>>> Christian
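
To illustrate the idea of selecting samples by tree name without the extension, here is a minimal base-R sketch. It is not the xps implementation. The toy data frame, the leading column names "id" and "name", and the ".mdp_LEVEL" column suffix are assumptions made for the example only.

```r
# Hypothetical layout: two leading annotation columns, then one column
# per tree, named as tree name plus an extension and a suffix.
toy <- data.frame(id = 1:3,
                  name = c("u1", "u2", "u3"),
                  TestA1.mdp_LEVEL = c(7.1, 8.2, 9.3),
                  TestA2.mdp_LEVEL = c(7.0, 8.1, 9.5),
                  TestB1.mdp_LEVEL = c(6.9, 8.4, 9.1),
                  TestB2.mdp_LEVEL = c(7.2, 8.0, 9.4))

# Tree names of choice, given without extension.
treenames <- c("TestA2", "TestB1")

# Strip everything from the first dot onward to recover the bare tree name.
bare <- sub("\\..*$", "", names(toy))

# Keep the two leading columns plus the columns matching the tree names.
keep <- c(1:2, match(treenames, bare))
stopifnot(!anyNA(keep))  # fail early if a tree name is not found

sub.toy <- toy[, keep]
print(names(sub.toy))
```

Matching on names rather than on positions means the selection still works if the order of the samples changes.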
>>>
>>>
>>> Zhibin Lu wrote:
>>>
>>>> Dear Christian,
>>>>
>>>> Thanks so much for such a detailed explanation. I will try this when I
>>>> get to work next week, and I do not see why I could not follow the
>>>> directions.
>>>>
>>>> Thanks again and have a nice weekend,
>>>>
>>>> Zhibin
>>>>
>>>>
>>>>> Date: Sat, 28 Jun 2008 15:46:26 +0200
>>>>> From: cstrato at aon.at
>>>>> To: zhbluweb at hotmail.com
>>>>> CC: bioconductor at stat.math.ethz.ch
>>>>> Subject: Re: [BioC] subset in XPS
>>>>>
>>>>> Dear Zhibin
>>>>>
>>>>> Since you have already done RMA you have now an ExprTreeSet,
>>>>> called e.g. "data.rma". You can see the structure with:
>>>>>
>>>>>> str(data.rma)
>>>>>>
>>>>> Since there is currently no direct way to work with only a
>>>>> subset of an ExprTreeSet, you can create a new object of class
>>>>> ExprTreeSet in the following way:
>>>>>
>>>>> 1. Make a subset of slot "data" which is a dataframe
>>>>> (assuming that you want to use samples 1,2,3,7,8,9):
>>>>>
>>>>>> subdata <- exprs(data.rma)
>>>>>> subdata <- subdata[,c(1:2,3:5, 9:11)]
>>>>>>
>>>>> Please note that it is important to keep the first
>>>>> two columns.
>>>>>
>>>>> 2. Create a copy "sub.rma" of object "data.rma"
>>>>>
>>>>>> sub.rma <- data.rma
>>>>>>
>>>>> 3. Replace slot "data" with "subdata":
>>>>>
>>>>>> exprs(sub.rma) <- subdata
>>>>>>
>>>>> For the moment you need to replace slots "treenames" and
>>>>> "numtrees", too, which I will change in the future to be
>>>>> done automatically.
>>>>>
>>>>> 4. Replace slot "treenames" with the names of your subset:
>>>>> a, create list containing the sub samples
>>>>>
>>>>>> subtrees <- unlist(treeNames(data.rma))
>>>>>> subtrees <- as.list(subtrees[c(1:3,7:9)])
>>>>>>
>>>>> b, check if the names are correct:
>>>>>
>>>>>> subtrees
>>>>>>
>>>>> c, replace slot "treenames":
>>>>>
>>>>>> sub.rma@treenames <- subtrees
>>>>>>
>>>>> 5. Replace slot "numtrees" with the number of subsamples
>>>>>
>>>>>> sub.rma@numtrees <- length(subtrees)
>>>>>>
>>>>> 6. Check if the new ExprTreeSet is correct:
>>>>>
>>>>>> str(sub.rma)
>>>>>>
>>>>> Now you can use the new ExprTreeSet "sub.rma" as input for
>>>>> method unifilter:
>>>>>
>>>>>> rma.ufr <- unifilter(sub.rma, .......)
>>>>>>
>>>>> If you want to take advantage of the advanced capabilities
>>>>> of package "limma", then you can create a Biobase class
>>>>> "ExpressionSet" containing only your 6 samples as described
>>>>> in Appendix A.3 of the vignette xps.pdf:
>>>>>
>>>>> 1. extract the normalized expression data:
>>>>>
>>>>>> subdata <- validData(data.rma)
>>>>>>
>>>>> 2. Since "subdata" is a dataframe, simply create a subframe:
>>>>>
>>>>>> subdata <- subdata[,c(1:3,7:9)]
>>>>>>
>>>>> 3. Create a Biobase class "ExpressionSet", called "subset"
>>>>>
>>>>>> subset <- new("ExpressionSet", exprs = as.matrix(subdata))
>>>>>>
>>>>> Now you have an ExpressionSet ready for use with "limma".
>>>>>
>>>>> Please let me know if you succeeded with this info.
>>>>>
>>>>> Best regards
>>>>> Christian
>>>>> _._._._._._._._._._._._._._._._
>>>>> C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
>>>>> V.i.e.n.n.a A.u.s.t.r.i.a
>>>>> e.m.a.i.l: cstrato at aon.at
>>>>> _._._._._._._._._._._._._._._._
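
The index bookkeeping in the manual procedure above (samples 1,2,3,7,8,9 become columns 3:5 and 9:11 of the data frame, but positions 1:3 and 7:9 of the tree names) can be sketched with a toy data frame in base R. This does not use xps. The leading column names "id" and "name" and the sample names are hypothetical placeholders.

```r
# Toy stand-in for the data frame returned by exprs(data.rma).
# "id" and "name" are hypothetical names for the two leading columns.
set.seed(42)
n.samples <- 12
toy <- data.frame(id = 1:5,
                  name = paste0("unit", 1:5),
                  matrix(rnorm(5 * n.samples), nrow = 5))
names(toy)[-(1:2)] <- paste0("Sample", 1:n.samples)

# Samples to keep, counted from 1 as in the experiment design.
samples <- c(1:3, 7:9)

# Sample i sits in column i + 2 because of the two leading columns.
cols <- c(1:2, samples + 2L)
subdata <- toy[, cols]

# The same sample indices select the matching tree names (no offset here).
treenames <- as.list(paste0("Sample", 1:n.samples))
subtrees <- as.list(unlist(treenames)[samples])

print(cols)
print(names(subdata))
print(unlist(subtrees))
```

Deriving both the column indices and the tree names from one "samples" vector keeps the data slot and the tree-name slot consistent with each other.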
>>>>>
>>>>> Zhibin Lu wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am new to R/Bioconductor. I am using the xps package to analyze
>>>>>> Affymetrix Gene ST 1.0 data. After I have loaded the CEL files into a
>>>>>> DataTreeSet and computed the expression levels with RMA, can I work on a
>>>>>> subset of the data? Say, I have 12 samples. After RMA, can I just work
>>>>>> on 6 of them, divide them into two groups, and apply UniFilter to
>>>>>> just these 6?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Zhibin
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioconductor mailing list
>>>>>> Bioconductor at stat.math.ethz.ch
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>> Search the archives:
>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>>>>
>>>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>
>>
>>
>
