[BioC] problem about set operation and computation after split
bestbird7788 [guest]
guest at bioconductor.org
Thu Jun 7 10:36:07 CEST 2012
Hi,
I have run into some problems in R; please help me.
1. How can I compute the intersection across several groups in one list, without a loop statement? (I think the split result is a list.)
Create the data:
myData <- data.frame(product = c(1,2,3,1,2,3,1,2,2), year=c(2009,2009,2009,2010,2010,2010,2011,2011,2011),value=c(1104,608,606,1504,508,1312,900,1100,800))
mySplit<- split(myData,myData$year)
mySplit
$`2009`
  product year value
1       1 2009  1104
2       2 2009   608
3       3 2009   606
$`2010`
  product year value
4       1 2010  1504
5       2 2010   508
6       3 2010  1312
$`2011`
  product year value
7       1 2011   900
8       2 2011  1100
9       2 2011   800
I want to get the intersection of product across all years. I know the basic approach is:
intersect(intersect(mySplit[[1]]$product, mySplit[[2]]$product),mySplit[[3]]$product)
This gives the correct answer:
[1] 1 2
The code above is not reusable, so it should use a for loop:
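myIntersect <- mySplit[[1]]$product
# 1:length(mySplit)-1 parses as (1:length(mySplit)) - 1, i.e. it starts at 0,
# so the range is written with seq_len() instead
for (i in seq_len(length(mySplit) - 1)) {
  myIntersect <- intersect(myIntersect, mySplit[[i + 1]]$product)
}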
It's correct too, but still too complex, so my question is:
Can I do the same thing with a single intersect-like function (without for/repeat/while)?
What is that function's name?
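A minimal sketch of one possible answer, assuming base R's Reduce() is acceptable here. It folds intersect() over the per-year product vectors, so it works for any number of groups. The name products is only an illustrative helper, not something from the code above.

```r
# Same data as in question 1
myData <- data.frame(product = c(1,2,3,1,2,3,1,2,2),
                     year = c(2009,2009,2009,2010,2010,2010,2011,2011,2011),
                     value = c(1104,608,606,1504,508,1312,900,1100,800))
mySplit <- split(myData, myData$year)

# Extract the product column of every group (hypothetical helper name)
products <- lapply(mySplit, function(d) d$product)

# Apply intersect() pairwise from left to right across the whole list
myIntersect <- Reduce(intersect, products)
myIntersect
```

Reduce() still iterates internally, but no explicit for/repeat/while appears in the calling code.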
2. How can I do a relative computation after the split (note: not before the split)?
Create the data:
myData1 <- data.frame(product = c(1,2,3,1,2,3), year=c(2009,2009,2009,2010,2010,2010),value=c(1104,608,606,1504,508,1312),relative=0)
mySplit1<- split(myData1,myData1$year)
mySplit1
$`2009`
  product year value relative
1       1 2009  1104        0
2       2 2009   608        0
3       3 2009   606        0
$`2010`
  product year value relative
4       1 2010  1504        0
5       2 2010   508        0
6       3 2010  1312        0
I want to compute a relative value within each group. That is, relative should be the difference from the previous row's value in the same group (0 for the first row), so the result looks like this:
$`2009`
  product year value relative
1       1 2009  1104        0
2       2 2009   608     -496
3       3 2009   606       -2
$`2010`
  product year value relative
4       1 2010  1504        0
5       2 2010   508     -996
6       3 2010  1312      804
I think a loop might work, but is there no direct method that operates on a list?
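A minimal sketch of one way this might be done with lapply(), assuming that "relative" means the difference from the previous row's value inside each group, which is what the desired output above suggests. The name myRelative is only illustrative.

```r
# Same data as in question 2
myData1 <- data.frame(product = c(1,2,3,1,2,3),
                      year = c(2009,2009,2009,2010,2010,2010),
                      value = c(1104,608,606,1504,508,1312),
                      relative = 0)
mySplit1 <- split(myData1, myData1$year)

# For every group, fill relative with row-to-row differences of value;
# the first row of a group has no predecessor, so it gets 0.
# Assumes every group has at least one row.
myRelative <- lapply(mySplit1, function(d) {
  d$relative <- c(0, diff(d$value))
  d
})
myRelative
```

lapply() returns a list with the same names as mySplit1, so the groups stay keyed by year.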
3. How can I sort after the split? As in the question above, I want each group sorted by value:
$`2009`
  product year value relative
3       3 2009   606        0
2       2 2009   608        0
1       1 2009  1104        0
$`2010`
  product year value relative
5       2 2010   508        0
6       3 2010  1312        0
4       1 2010  1504        0
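A minimal sketch of per-group sorting, assuming ascending order by value as in the desired output above. It applies order() to each data frame in the list. The name mySorted is only illustrative.

```r
# Same data as in question 2
myData1 <- data.frame(product = c(1,2,3,1,2,3),
                      year = c(2009,2009,2009,2010,2010,2010),
                      value = c(1104,608,606,1504,508,1312),
                      relative = 0)
mySplit1 <- split(myData1, myData1$year)

# Reorder the rows of every group by increasing value
mySorted <- lapply(mySplit1, function(d) d[order(d$value), ])
mySorted
```

Passing decreasing = TRUE to order() would give the reverse ordering.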
4. How can I filter after the split? As in the question above, I want to keep only the rows whose value is more than 1000:
$`2009`
  product year value relative
1       1 2009  1104        0
$`2010`
  product year value relative
4       1 2010  1504        0
6       3 2010  1312        0
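A minimal sketch of per-group filtering, assuming the goal is to keep rows with value greater than 1000, as in the desired output above. It uses logical row indexing inside lapply(). The name myFiltered is only illustrative.

```r
# Same data as in question 2
myData1 <- data.frame(product = c(1,2,3,1,2,3),
                      year = c(2009,2009,2009,2010,2010,2010),
                      value = c(1104,608,606,1504,508,1312),
                      relative = 0)
mySplit1 <- split(myData1, myData1$year)

# Keep only the rows of every group whose value exceeds 1000
myFiltered <- lapply(mySplit1, function(d) d[d$value > 1000, , drop = FALSE])
myFiltered
```

A group with no matching rows would remain in the list as a zero-row data frame.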
-- output of sessionInfo():
R version 2.15.0 Patched (2012-04-26 r59206)
Platform: x86_64-pc-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices datasets utils methods base
loaded via a namespace (and not attached):
[1] tools_2.15.0
--
Sent via the guest posting facility at bioconductor.org.