[BioC] subset in XPS

cstrato cstrato at aon.at
Fri Jul 4 17:43:14 CEST 2008


Dear Zhibin

Finally, I am able to confirm your results.  The problem appears already at:
 > str(Data.rma)

This is caused by one of the "features" of R.

Since I need to export the resulting table from the root file and import 
it again with:
 > ds <- read.table("Test.txt", header=TRUE, sep="\t", row.names=NULL);
 > head(ds)
R will check the column names with the function make.names(), which does 
the following: "The character "X" is prepended if necessary", which is 
the case if the name starts with a number!!!
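
A small sketch of that behaviour, using two of the column names from Zhibin's output quoted below. It assumes the exported header contains the names without the "X"; the variable names are only for illustration and nothing here is specific to xps:

```r
# Column names as they are assumed to appear in the exported table
tree.cols <- c("UNIT_ID", "UnitName", "100M_11.mdp_LEVEL", "110M_3.mdp_LEVEL")

# make.names() turns each entry into a syntactically valid R name
fixed <- make.names(tree.cols)
print(fixed)

# Only the names that start with a digit are altered
changed <- fixed != tree.cols
print(tree.cols[changed])
```

The first two names are already valid and stay as they are; the two names that start with a digit come back with a leading "X".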

Thus, I need to replace this line everywhere in my program with:
 > ds <- read.table("Test.txt", header=TRUE, check.names=FALSE, sep="\t", row.names=NULL);
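
To see the difference between the two calls without a root file, here is a minimal sketch. The temporary file and its contents are made up as a stand-in for the exported table; only the read.table() arguments are taken from the lines above:

```r
# Write a tiny tab-separated table whose header names start with digits
# (hypothetical stand-in for the table exported from the root file)
tmp <- tempfile(fileext = ".txt")
writeLines(c("UNIT_ID\tUnitName\t100M_11.mdp_LEVEL\t110M_3.mdp_LEVEL",
             "1\tprobeset_a\t7.1\t7.3",
             "2\tprobeset_b\t5.4\t5.6"), tmp)

# Default call: check.names=TRUE, so the header is passed through make.names()
ds.default <- read.table(tmp, header = TRUE, sep = "\t", row.names = NULL)

# With check.names=FALSE the header is kept exactly as written in the file
ds.raw <- read.table(tmp, header = TRUE, check.names = FALSE,
                     sep = "\t", row.names = NULL)

print(colnames(ds.default))
print(colnames(ds.raw))

# Remove the temporary file again
unlink(tmp)
```

Note that names kept this way are not syntactically valid, so such a column has to be accessed with backticks or by string, e.g. ds.raw[["100M_11.mdp_LEVEL"]].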

I will let you know once I have uploaded the new version. However, I 
need to do some testing first to make sure everything is ok.

Thank you for reporting this problem.

Best regards
Christian


Zhibin Lu wrote:
> Dear Christian,
>
> I tested the new package on my Linux box, so I compiled ROOT from source using the file 'root_v5.18.00.source.tar.gz'. Since I was using Linux, I think xps was also compiled from source. I followed your directions, and it seems that the function 'exprs' adds the extra X. Here is what I got:
>
>   
>> treenames<-treeNames(Data.rma)
>> treenames
>>     
> [[1]]
> [1] "10017_12.mdp"
>
> [[2]]
> [1] "100M_11.mdp"
>
> [[3]]
> [1] "11017_4.mdp"
>
> [[4]]
> [1] "110M_3.mdp"
>
> [[5]]
> [1] "11117_6.mdp"
>
> [[6]]
> [1] "111M_5.mdp"
>
> [[7]]
> [1] "9617_2.mdp"
>
> [[8]]
> [1] "96M_1.mdp"
>
> [[9]]
> [1] "9717_8.mdp"
>
> [[10]]
> [1] "97M_7.mdp"
>
> [[11]]
> [1] "9817_10.mdp"
>
> [[12]]
> [1] "98M_9.mdp"
>
>   
>> treenames=getTreeNames("rma_all.root")
>> treenames
>>     
>  [1] "10017_12.rbg" "10017_12.int" "100M_11.rbg"  "100M_11.int"  "11017_4.rbg" 
>  [6] "11017_4.int"  "110M_3.rbg"   "110M_3.int"   "11117_6.rbg"  "11117_6.int" 
> [11] "111M_5.rbg"   "111M_5.int"   "9617_2.rbg"   "9617_2.int"   "96M_1.rbg"   
> [16] "96M_1.int"    "9717_8.rbg"   "9717_8.int"   "97M_7.rbg"    "97M_7.int"   
> [21] "9817_10.rbg"  "9817_10.int"  "98M_9.rbg"    "98M_9.int"    "10017_12.cqu"
> [26] "100M_11.cqu"  "11017_4.cqu"  "110M_3.cqu"   "11117_6.cqu"  "111M_5.cqu"  
> [31] "9617_2.cqu"   "96M_1.cqu"    "9717_8.cqu"   "97M_7.cqu"    "9817_10.cqu" 
> [36] "98M_9.cqu"    "10017_12.mdp" "100M_11.mdp"  "11017_4.mdp"  "110M_3.mdp"  
> [41] "11117_6.mdp"  "111M_5.mdp"   "9617_2.mdp"   "96M_1.mdp"    "9717_8.mdp"  
> [46] "97M_7.mdp"    "9817_10.mdp"  "98M_9.mdp"   
>
>   
>> treenames=getTreeNames("rma_all.root", "mdp")
>> treenames
>>     
>  [1] "10017_12.mdp" "100M_11.mdp"  "11017_4.mdp"  "110M_3.mdp"   "11117_6.mdp" 
>  [6] "111M_5.mdp"   "9617_2.mdp"   "96M_1.mdp"    "9717_8.mdp"   "97M_7.mdp"   
> [11] "9817_10.mdp"  "98M_9.mdp"
>
>   
>> value<-exprs(Data.rma)
>> treenames<-colnames(value)
>> treenames
>>     
>  [1] "UNIT_ID"             "UnitName"            "X10017_12.mdp_LEVEL"
>  [4] "X100M_11.mdp_LEVEL"  "X11017_4.mdp_LEVEL"  "X110M_3.mdp_LEVEL"  
>  [7] "X11117_6.mdp_LEVEL"  "X111M_5.mdp_LEVEL"   "X9617_2.mdp_LEVEL"  
> [10] "X96M_1.mdp_LEVEL"    "X9717_8.mdp_LEVEL"   "X97M_7.mdp_LEVEL"   
> [13] "X9817_10.mdp_LEVEL"  "X98M_9.mdp_LEVEL" 
>
>   
>> treenames <- treenames[c(4,6)]
>> sub.rma<-Data.rma
>> exprs(sub.rma, treenames) <- value
>> str(sub.rma)
>>     
>
>   ..@ treenames:List of 2
>   .. ..$ : chr "X100M_11.mdp"
>   .. ..$ : chr "X110M_3.mdp"
>
>
> Regards,
>
> Zhibin
>
>   
>> Date: Thu, 3 Jul 2008 21:58:38 +0200
>> From: cstrato at aon.at
>> To: zhbluweb at hotmail.com
>> CC: bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] subset in XPS
>>
>> Dear Zhibin
>>
>> In general, xps offers two ways to get the treenames from an ExprTreeSet:
>>
>> 1. method treeNames applied to the ExprTreeSet:
>>     
>>> treenames <- treeNames(data.rma)
>>> treenames
>>>       
>> 2. function getTreeNames applied to the root file directly:
>>     
>>> treenames <- getTreeNames("Test3RMA.root")
>>> treenames
>>> treenames <- getTreeNames("Test3RMA.root","mdp")
>>> treenames
>>>       
>> Then you can select the treenames of interest by doing:
>>     
>>> treenames <- treenames[c(2,4)]
>>>       
>> Of course, the following also works:
>>     
>>> value <- exprs(data.rma)
>>> treenames <- colnames(value)
>>> treenames
>>> treenames <- treenames[c(4,6)]
>>>       
>> In any case you get the subset:
>>     
>>> sub.rma <- data.rma
>>> exprs(sub.rma, treenames) <- value
>>> str(sub.rma)
>>>       
>> which results in:
>> ..@ treenames:List of 2
>> .. ..$ : chr "TestA2.mdp"
>> .. ..$ : chr "TestB2.mdp"
>>
>> Could you please send me the code you used for subsetting, which
>> resulted in an "X" in front of the treenames?
>>
>> Do the above-mentioned solutions result in the same error?
>>
>> Furthermore, could you give me the following information:
>> - which version of ROOT did you install?
>> - did you install the ROOT binary or compile from source?
>> - did you download/install the xps binary or compile from source?
>>
>> P.S.:
>> I am glad to hear that running R through Terminal on your Mac works fine.
>>
>> Best regards
>> Christian
>>
>>
>> Zhibin Lu wrote:
>>     
>>> Dear Christian,
>>>
>>> When I tried to use str(sub.rma) to check the subset, I found there was an extra 'X' in front of each tree name. For the example you provided, the treenames were:
>>> ..@ treenames:List of 2
>>> .. ..$ : chr "XTestA2.mdp"
>>> .. ..$ : chr "XTestB1.mdp"
>>>
>>> When I applied filters to the sub.rma, I got an error "ERROR: Could not get tree .". After I changed the treenames manually, the error was gone.
>>>
>>> I was running R 2.8/BioC 2.3 and xps 1.1.2 under Ubuntu 8.04.
>>>
>>> Regards,
>>>
>>> Zhibin
>>>
>>>
>>>       