[BioC] xps: root.profile on a subset of the data

cstrato cstrato at aon.at
Tue Aug 24 23:19:58 CEST 2010


Dear all,

As Daniel has suggested in a private mail, I will clean up the file 
names the way read.table() does, using the function make.names(). 
However, this means that names starting with a number will start with X 
instead, just like in read.table(). Furthermore, I will replace dots 
with underscores. For this reason I have uploaded the new version, 
"xps_1.9.5", only to the development branch.
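
To illustrate the effect on a name, here is a minimal sketch in plain R. 
The helper clean_cel_names() is hypothetical and only mimics the rule 
described above; it is not the code used inside xps, which may differ 
in detail.

```r
# Hypothetical helper, NOT part of xps: mimics the clean-up rule described above.
clean_cel_names <- function(x) {
  # make.names() turns invalid characters such as "(" and ")" into "."
  # and prepends "X" to names that start with a digit
  valid <- make.names(x)
  # replace every literal dot with an underscore
  gsub(".", "_", valid, fixed = TRUE)
}

celnames <- c("TestA1", "0309_CoC(3)41_ExH_PRC133")
print(clean_cel_names(celnames))
# [1] "TestA1"                    "X0309_CoC_3_41_ExH_PRC133"
```

A name that is already syntactically valid, such as "TestA1", passes 
through unchanged, so only the problematic names are affected.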

Best regards
Christian


On 8/23/10 8:00 PM, cstrato wrote:
> Dear Daniel,
>
> The reason for the error messages you get is that the names of your
> CEL-files contain parentheses, which cause the ROOT C++ class
> TParallelCoord to interpret the names as formulas, since TParallelCoord
> allows you to pass an expression.
>
> In my C++ code I am even taking advantage of this possibility and pass
> "varexpr=log(celname)" if parameter "as.log=TRUE".
>
> I am afraid that the only possibility to solve this problem is to change
> the names of the CEL-files when importing them as trees into the ROOT
> file, however you do NOT need to change the names of the original
> CEL-files.
>
> Since I know that many CEL-files contain strange names with many
> problematic characters (e.g.
> "42A#0214(12);06/23/99;MES-SA/Dx-batch#21089.CEL") I have implemented
> the possibility to use more informative names as aliases when importing
> the CEL-files.
>
> For example to confirm the error messages you get I have renamed two of
> the four test CEL-files (TestA1.CEL, TestA2.CEL, TestB1.CEL, TestB2.CEL)
> as follows:
>
> # first, import ROOT scheme file
>  > scheme.test3 <-
> root.scheme(paste(.path.package("xps"),"schemes/SchemeTest3.root",sep="/"))
>
> # import CEL files with new names
>  > celdir <- "/Volumes/CoreData/ROOT/rootdata/testAB/raw"
>  > celnames <- c("TestA1", "TestA2", "0309_CoC(3)41_ExH_PRC133",
> "0309_CoC(3)42_ExH_PRC134")
>  > data.test3 <- import.data(scheme.test3, "tmp_Test3", celdir=celdir,
> celnames=celnames)
>
> # this profile plot is ok
>  > root.profile(data.test3, treename=c("TestA1.cel", "TestA2.cel"))
>
> # this profile plot reproduces the error messages you get
>  > root.profile(data.test3, treename=c("0309_CoC(3)41_ExH_PRC133.cel",
> "0309_CoC(3)42_ExH_PRC134.cel"))
>
> # you can always extract the names of the original CEL-files
>  > rawCELName(data.test3, fullpath=FALSE)
> [1] "TestA1.CEL" "TestA2.CEL" "TestB1.CEL" "TestB2.CEL"
>
> I hope this helps you to understand the reason for the problem. My
> only advice is to change the names of the CEL-files during import
> (it is not possible to change the tree names once they are imported into
> a ROOT file).
>
> Best regards
> Christian
>
>
> On 8/23/10 11:32 AM, Daniel Brewer wrote:
>> Dear Christian,
>>
>> Many thanks for making some code changes, that's great. Unfortunately I
>> have tried it and it doesn't seem to work.
>>
>> I create a list of the tree nodes:
>>> samples<- unlist(treeNames(rootData))
>>
>> Then check to see if I can use the root graphics:
>>> root.image(rootData,treename = samples[1])
>>> root.profile(rootData,treename=samples[1])
>>
>> But when I try to do a root.profile with more in it I get an error:
>>> root.profile(rootData,treename=samples[1:2])
>> root [0]
>> Processing
>> /Users/dbrewer/Library/R/2.11/library/xps/rootsrc/macroDrawProfilePlot.C("/Users/dbrewer/Library/R/2.11/library/xps/libs/i386/xps.so","/Volumes/Datastore/ProstateCancerMap/QCFINAL/cancermapQC_cel.root","ProfilePlot","DataSet","0309_CoC(3)41_ExH_PRC133:0309_CoC(3)42_ExH_PRC134","cel","fInten","",0,0,1,1,1,800,600)...
>>
>> Warning in<TParallelCoord::TParallelCoord>: Call
>> tree->SetEstimate(tree->GetEntries()) to display all the tree variables
>> Error in<TTreeFormula::AnalyzeFunction>: We thought we had a function
>> but we dont (in 0309_CoC(3)41_ExH_PRC133.fInten)
>>
>> Error in<TTreeFormula::Compile>: Bad numerical expression :
>> "0309_CoC(3)41_ExH_PRC133.fInten"
>> Warning in<TParallelCoord::AddVariable>:
>> log(0309_CoC(3)41_ExH_PRC133.fInten) could not be evaluated
>> Error in<TTreeFormula::AnalyzeFunction>: We thought we had a function
>> but we dont (in 0309_CoC(3)42_ExH_PRC134.cel.fInten)
>>
>> Error in<TTreeFormula::Compile>: Bad numerical expression :
>> "0309_CoC(3)42_ExH_PRC134.cel.fInten"
>> Warning in<TParallelCoord::AddVariable>:
>> log(0309_CoC(3)42_ExH_PRC134.cel.fInten) could not be evaluated
>>
>> *** Break *** bus error
>> /Volumes/Datastore/ProstateCancerMap/QCFINAL/2278: No such file or
>> directory.
>> Attaching to process 2278.
>> Reading symbols for shared libraries . done
>> Reading symbols for shared libraries
>> .......................................... done
>> 0x94bc7189 in wait4 ()
>>
>> ========== STACKS OF ALL THREADS ==========
>>
>> Thread 1 (process 2278 thread 0x10b):
>> #0 0x94bc7189 in wait4 ()
>> #1 0x94bc4cd4 in system$UNIX2003 ()
>> #2 0x00906141 in TUnixSystem::StackTrace ()
>> #3 0x00909ac5 in TUnixSystem::DispatchSignals ()
>> #4 0x00909c38 in SigHandler ()
>> #5<signal handler called>
>> #6 0x048bd989 in TParallelCoordEditor::SetModel ()
>> #7 0x044ecf41 in TGedEditor::ConfigureGedFrames ()
>> #8 0x044eda3a in TGedEditor::SetModel ()
>> #9 0x04556eee in G__G__Ged_221_0_28 ()
>> #10 0x01077e27 in Cint::G__CallFunc::Execute ()
>> #11 0x008eec7f in TCint::CallFunc_Exec ()
>> #12 0x0086b002 in TQConnection::ExecuteMethod ()
>> #13 0x0086fd45 in TQObject::Emit ()
>> #14 0x00429bce in TCanvas::Selected ()
>> #15 0x048b1c62 in TParallelCoord::Draw ()
>> #16 0x03b44fd9 in XPlot::DrawParallelCoord ()
>> #17 0x03c9128f in G__xpsDict_564_0_22 ()
>> #18 0x010750c2 in Cint::G__ExceptionWrapper ()
>> #19 0x01148021 in G__execute_call ()
>> #20 0x011484ed in G__call_cppfunc ()
>> #21 0x0111bd4e in G__interpret_func ()
>> #22 0x01106d7b in G__getfunction ()
>> #23 0x0121b96b in G__getstructmem ()
>> #24 0x0121161b in G__getvariable ()
>> #25 0x010d3931 in G__getitem ()
>> #26 0x010d6839 in G__getexpr ()
>> #27 0x0117fc11 in G__exec_statement ()
>> #28 0x0111df95 in G__interpret_func ()
>> #29 0x011072f5 in G__getfunction ()
>> #30 0x010d3a74 in G__getitem ()
>> #31 0x010d6839 in G__getexpr ()
>> #32 0x010e9c79 in G__calc_internal ()
>> #33 0x0118e40c in G__process_cmd ()
>> #34 0x008f21c4 in TCint::ProcessLine ()
>> #35 0x008f0f1f in TCint::ProcessLineSynch ()
>> #36 0x008388b0 in TApplication::ExecuteFile ()
>> #37 0x0083759d in TApplication::ProcessLine ()
>> #38 0x00031799 in TRint::Run ()
>> #39 0x00001bae in main ()
>> Root> Function macroDrawProfilePlot() busy flag cleared
>>
>>
>>
>> installed.packages() indicates xps is version 1.8.3
>>
>> Thanks
>>
>> Dan
>>
>> On 22/08/2010 6:49 PM, cstrato wrote:
>>> Dear Daniel,
>>>
>>> Sorry my mistake again!
>>> After looking at my source code I realized that currently it is not
>>> possible to use a subset of trees only. Thus I have just uploaded to
>>> Bioconductor a new version "xps_1.8.3" which should solve the problem,
>>> and will be available within the next 1-2 days. You should now be able
>>> to use parameter "treename" to plot only a subset of trees.
>>>
>>> Please let me know if the new version solves your problem.
>>> Especially I am interested to know how many treenames you can pass to
>>> function root.profile() since there could be a limit on the number of
>>> characters you can pass to the root macro.
>>>
>>> Since you mention that there are too many arrays to look reasonable on
>>> one plot, you could also change parameter "w" from the default "w=800"
>>> to e.g. "w=4000", especially if you save the plot by setting e.g.
>>> "save.as='png'".
>>>
>>> Best regards
>>> Christian
>>>
>>>
>>> On 8/20/10 5:35 PM, Daniel Brewer wrote:
>>>> Hi Christian,
>>>>
>>>> I tried that, but it kicked up an error and only plotted one boxplot.
>>>> It was like "treename" could only take one parameter. Maybe I was doing
>>>> something wrong. I will have another go.
>>>>
>>>> Dan
>>>>
>>>> On 20/08/2010 4:06 PM, cstrato wrote:
>>>>> Dear Daniel,
>>>>>
>>>>> You can simply use parameter "treename" to plot only a subset of
>>>>> trees,
>>>>> see "?root.profile".
>>>>>
>>>>> Best regards
>>>>> Christian
>>>>> _._._._._._._._._._._._._._._._._._
>>>>> C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
>>>>> V.i.e.n.n.a A.u.s.t.r.i.a
>>>>> e.m.a.i.l: cstrato at aon.at
>>>>> _._._._._._._._._._._._._._._._._._
>>>>>
>>>>>
>>>>> On 8/20/10 4:47 PM, Daniel Brewer wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I am using xps to do some quality control on an Affymetrix exon array
>>>>>> experiment I am looking at. I am trying to use the ROOT graphics to
>>>>>> plot density boxplots of the raw intensities (using root.profile).
>>>>>> The
>>>>>> problem is that there are too many arrays to look reasonable on one
>>>>>> plot.
>>>>>> Is there a way to split up the dataset into smaller pieces and plot
>>>>>> them?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Dan
>>>>>>
>>>>
>>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>


