[BioC] problem about set operation and computation after split

bestbird7788 [guest] guest at bioconductor.org
Thu Jun 7 10:36:07 CEST 2012


Hi,
    I have run into some problems in R; please help.
1. How can I intersect several groups held in one object without a loop statement? (I think the object is a list.)
   create data:
   myData <- data.frame(product = c(1,2,3,1,2,3,1,2,2), year=c(2009,2009,2009,2010,2010,2010,2011,2011,2011),value=c(1104,608,606,1504,508,1312,900,1100,800))
   mySplit<- split(myData,myData$year)  
   mySplit
$`2009`
  product year value
1       1 2009  1104
2       2 2009   608
3       3 2009   606

$`2010`
  product year value
4       1 2010  1504
5       2 2010   508
6       3 2010  1312

$`2011`
  product year value
7       1 2011   900
8       2 2011  1100
9       2 2011   800
    I want the intersection of product across all years. I know the basic approach is:
    intersect(intersect(mySplit[[1]]$product, mySplit[[2]]$product),mySplit[[3]]$product)
    This gives the correct answer:
    [1] 1 2
    The code above is not reusable, so it should use a for loop:
    myIntersect<-mySplit[[1]]$product
    # seq_len(length(mySplit) - 1) gives 1, 2; writing 1:length(mySplit)-1
    # parses as (1:length(mySplit)) - 1, i.e. 0, 1, 2
    for (i in seq_len(length(mySplit) - 1)){ 
        myIntersect<-intersect(myIntersect,mySplit[[i+1]]$product)
    }
    This is correct too, but still too complex, so my question is:
    Can I do the same thing with a single intersect-like function (without for/repeat/while)?
    What is the name of that function?
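
    A minimal sketch of one possible approach, assuming base R only: Reduce() folds a two-argument function over a list, so it can apply intersect() across any number of groups. The helper variable name products is introduced here only for illustration.

```r
# Recreate the data from the post
myData <- data.frame(product = c(1,2,3,1,2,3,1,2,2),
                     year    = c(2009,2009,2009,2010,2010,2010,2011,2011,2011),
                     value   = c(1104,608,606,1504,508,1312,900,1100,800))
mySplit <- split(myData, myData$year)

# Pull the product column out of every group (one vector per year)
products <- lapply(mySplit, function(d) d$product)

# Fold intersect() over the list: intersect(intersect(g1, g2), g3), ...
myIntersect <- Reduce(intersect, products)
myIntersect
```

    On the sample data this should reproduce the [1] 1 2 result of the nested intersect() call, and it does not depend on the number of years in the list.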

2. How can I do a relative computation after the split (note: not before the split)?
   create data:
   myData1 <- data.frame(product = c(1,2,3,1,2,3), year=c(2009,2009,2009,2010,2010,2010),value=c(1104,608,606,1504,508,1312),relative=0)
   mySplit1<- split(myData1,myData1$year)  
   mySplit1
$`2009`
  product year value relative
1       1 2009  1104        0
2       2 2009   608        0
3       3 2009   606        0

$`2010`
  product year value relative
4       1 2010  1504        0
5       2 2010   508        0
6       3 2010  1312        0
   I want to compute a relative value within each group; that is, I want a result like the one below:
$`2009`
  product year value relative
1       1 2009  1104        0
2       2 2009   608     -496
3       3 2009   606       -2

$`2010`
  product year value relative
4       1 2010  1504        0
5       2 2010   508     -996
6       3 2010  1312      804
I think a loop might work, but is there no direct method that works on a list?
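
One way this could be sketched in base R is with lapply(), which applies a function to every element of a list and returns a list. The sketch assumes that "relative" means the difference from the previous row's value within the same group, with 0 for the first row; that reading is inferred from the desired output above and is not stated explicitly. The name withRelative is hypothetical.

```r
# Recreate the data from the post
myData1 <- data.frame(product  = c(1,2,3,1,2,3),
                      year     = c(2009,2009,2009,2010,2010,2010),
                      value    = c(1104,608,606,1504,508,1312),
                      relative = 0)
mySplit1 <- split(myData1, myData1$year)

# For each group: first row gets 0, later rows get value minus previous value
# (assumed meaning of "relative"; assumes every group has at least one row)
withRelative <- lapply(mySplit1, function(d) {
  d$relative <- c(0, diff(d$value))
  d
})
withRelative
```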

3. How can I sort after the split? As in the question above, I want each group sorted by value:
$`2009`
  product year value relative
3       3 2009   606        0
2       2 2009   608        0
1       1 2009  1104        0
$`2010`
  product year value relative
5       2 2010   508        0
6       3 2010  1312        0
4       1 2010  1504        0
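
A minimal sketch for the sorting case, again assuming base R and lapply(): order() returns the row permutation that sorts value in ascending order, and indexing each data frame with it reorders the rows. The name sortedSplit is hypothetical.

```r
# Recreate the data from the post
myData1 <- data.frame(product  = c(1,2,3,1,2,3),
                      year     = c(2009,2009,2009,2010,2010,2010),
                      value    = c(1104,608,606,1504,508,1312),
                      relative = 0)
mySplit1 <- split(myData1, myData1$year)

# Reorder the rows of every group by ascending value
sortedSplit <- lapply(mySplit1, function(d) d[order(d$value), ])
sortedSplit
```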

4. How can I filter after the split? As in the question above, I want to keep only the rows whose value is more than 1000:
$`2009`
  product year value relative
1       1 2009  1104        0
$`2010`
  product year value relative
4       1 2010  1504        0
6       3 2010  1312        0
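
The filtering case could be sketched the same way, assuming base R: a logical index on value keeps only the matching rows of each group. The name filteredSplit is hypothetical.

```r
# Recreate the data from the post
myData1 <- data.frame(product  = c(1,2,3,1,2,3),
                      year     = c(2009,2009,2009,2010,2010,2010),
                      value    = c(1104,608,606,1504,508,1312),
                      relative = 0)
mySplit1 <- split(myData1, myData1$year)

# Keep only the rows whose value exceeds 1000 in every group
filteredSplit <- lapply(mySplit1, function(d) d[d$value > 1000, ])
filteredSplit
```

A group in which no row passes the filter would remain in the list as a zero-row data frame.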

 -- output of sessionInfo(): 

R version 2.15.0 Patched (2012-04-26 r59206)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

loaded via a namespace (and not attached):
[1] tools_2.15.0

--
Sent via the guest posting facility at bioconductor.org.


