[R] Multiple subsetting of a dataframe based on many conditions

arun smartpink111 at yahoo.com
Sun Apr 7 01:51:33 CEST 2013


Hi,

Maybe this helps:
input1<- data.frame(answer=rep(1:4,times=18),p.number=rep(1:18,each=4),session=rep(1:2,each=36),count=rep(1:8,each=9),type=rep(1:3,each=24))
input2<- data.frame(answer=rep(1:4,times=18),p.number=rep(1:18,each=4),session=rep(1:2,each=36),count=rep(1:8,each=9),type=rep(1:3,each=24))
inputNew<- rbind(input1,input2)

# Build one key per row by pasting the values of all columns together
indx1<- apply(inputNew,1,paste0,collapse="")
# Split the rows by key; each list element holds the rows sharing one combination
res<- split(inputNew,indx1)

 res[1:4]
#$`110252`
 #   answer p.number session count type
#37       1       10       2     5    2
#109      1       10       2     5    2

#$`11111`
 #  answer p.number session count type
#1       1        1       1     1    1
#73      1        1       1     1    1

#$`111252`
 #   answer p.number session count type
#41       1       11       2     5    2
#113      1       11       2     5    2

#$`112252`
 #   answer p.number session count type
#45       1       12       2     5    2
#117      1       12       2     5    2
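
Since the question also asks for one text file per combination, here is a rough sketch of one way to get there; it is not part of the code above. It groups with split() on the columns themselves and writes each piece with write.table(). The object names (dat, keys, groups, out_dir), the temporary output directory and the "_"-separated file names are assumptions made for illustration only.

```r
# Example data with the same layout as inputNew above (each row appears twice)
dat0 <- data.frame(answer   = rep(1:4, times = 18),
                   p.number = rep(1:18, each = 4),
                   session  = rep(1:2, each = 36),
                   count    = rep(1:8, each = 9),
                   type     = rep(1:3, each = 24))
dat <- rbind(dat0, dat0)

keys <- c("answer", "p.number", "session", "count", "type")

# split() accepts a list of grouping vectors; drop = TRUE discards
# combinations that never occur, and sep = "_" keeps the group names
# unambiguous (e.g. "1_10_2_5_2")
groups <- split(dat, dat[keys], drop = TRUE, sep = "_")

# Assumed output location: a sub-directory of tempdir() (hypothetical choice)
out_dir <- file.path(tempdir(), "subsets")
dir.create(out_dir, showWarnings = FALSE)

# Write one tab-separated file per combination, named after the group
for (nm in names(groups)) {
  write.table(groups[[nm]],
              file = file.path(out_dir, paste0(nm, ".txt")),
              sep = "\t", col.names = FALSE, row.names = FALSE)
}
```

With real data, `dat` would be the full data frame and `keys` the names of its grouping columns; only the combinations actually present get a file.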
A.K.



Currently I have a dataframe with 18 columns. I would like to subset the data in one of these columns, "present", according to combinations of data in five of the other columns within the data frame and then save each subset into a text file.
The columns I would like to use to subset "present" are:

* answer (1:4) [answer takes the values 1 to 4]
* p.num (1:18)
* session (1:2)
* count (1:8)
* type (1:3)
 
So there are a total of 3456 possible subsetting combinations.
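
The count is the product of the number of values per column, 4 * 18 * 2 * 8 * 3 = 3456. A minimal sketch that enumerates the combinations with expand.grid(), assuming the value ranges listed above (the object name combos is made up for illustration):

```r
# One row per possible combination of the five subsetting columns
combos <- expand.grid(answer = 1:4, p.num = 1:18, session = 1:2,
                      count = 1:8, type = 1:3)
nrow(combos)   # 3456
```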

At present, I have been using the following and manually changing the values in each line and re-running the code.

input.s2g<-subset(input, answer==1)
input.s2g<-subset(input.s2g, p.num == 1)
input.s2g<-subset(input.s2g, session == "S2")
input.s2g<-subset(input.s2g, count==8)
input.s2g<-subset(input.s2g, type==1)

write.table(input.s2g, file = "1_1_S2_8_1", sep = "\t", col.names = FALSE, row.names = FALSE)
 
But this takes hours and is obviously prone to error. There must be an easier way? 

Thanks for the help!


