[R] apply with multiple conditions

pguilha paul.guilhamon at gmail.com
Mon Jul 2 13:25:13 CEST 2012


Hello all,

I have written a for loop that acts on a data frame with close to 3 million
rows and 6 columns, and I would like to pass it to apply() to speed the
process up (I let the loop run for 2 days before stopping it, and it had only
gone through 200,000 rows). However, I am really struggling to find a way to
pass the arguments. Below are the loop and the head of the data frame I am
working on. Any hints would be much appreciated, thank you! (I have searched
for this but could not find any other posts doing quite what I want.)
Paul

# x holds the chromStart (column 2) of the first row of the current bin
x <- as.numeric(all.tf7[1, 2])
for (i in 2:nrow(all.tf7)) {
  same.chrom <- all.tf7[i, 1] == all.tf7[i - 1, 1]
  if (same.chrom && (all.tf7[i, 2] - x) < 115341) {
    # same chromosome and less than 115341 past the bin start: keep the bin
    all.tf7[i, 6] <- all.tf7[i - 1, 6]
  } else {
    # new chromosome, or 115341 or more past the bin start: open a new bin
    all.tf7[i, 6] <- all.tf7[i - 1, 6] + 1
    x <- as.numeric(all.tf7[i, 2])
  }
}

# The aim here is to assign a bin number to each row so that I can then
# split the data frame according to those bins.
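
Because each bin depends on where the previous bin started, the rule is sequential and does not map directly onto apply(). One possible sketch, not a tested solution for the full data set, keeps the same logic but runs it on plain vectors rather than indexing the data frame on every iteration, which is usually where most of the time goes. The function name assign_bins, its argument names, and the small example vectors are made up for illustration; the sketch assumes the rows are already sorted by chrom and then chromStart, and that numbering starts at bin 1.

```r
# Hypothetical helper: same binning rule as the loop above, on plain vectors.
# Assumes rows are sorted by chromosome, then by start position.
assign_bins <- function(chrom, start, width = 115341) {
  n <- length(start)
  bin <- integer(n)
  if (n == 0L) return(bin)
  chrom <- as.character(chrom)   # avoid comparing factor levels in the loop
  start <- as.numeric(start)
  bin[1] <- 1L                   # assumption: numbering starts at 1
  x <- start[1]                  # start position of the current bin
  for (i in seq_len(n)[-1]) {
    if (chrom[i] == chrom[i - 1] && (start[i] - x) < width) {
      bin[i] <- bin[i - 1]       # still inside the current bin
    } else {
      bin[i] <- bin[i - 1] + 1L  # new chromosome or too far: new bin
      x <- start[i]
    }
  }
  bin
}

# Made-up example values, not taken from the real data
chrom <- c("chr1", "chr1", "chr1", "chr1", "chr2")
start <- c(10089, 10132, 130000, 140000, 500)
bins <- assign_bins(chrom, start)
print(bins)
```

On the real data the result would be assigned back in one step, for example all.tf7$bin <- assign_bins(all.tf7$chrom, all.tf7$chromStart), assuming the column names match the header shown in the post.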


chrom  chromStart  chromEnd  name          cumsum  bin
chr1   10089       10309     ZBTB33        10089   1
chr1   10132       10536     TAF7_(SQ-8)   20221   1
chr1   10133       10362     Pol2-4H8      30354   1
chr1   10148       10418     MafF_(M8194)  40502   1
chr1   10382       10578     ZBTB33        50884   1
chr1   16132       16352     CTCF          67016   1
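
Once the bin column is filled in, the splitting step mentioned above can be done with base R's split(). The sketch below re-types the six rows shown in the head of the data frame; the object names head.tf7 and by.bin are hypothetical and only used for illustration.

```r
# The six rows from the head shown above, re-typed by hand for illustration
head.tf7 <- data.frame(
  chrom      = rep("chr1", 6),
  chromStart = c(10089, 10132, 10133, 10148, 10382, 16132),
  chromEnd   = c(10309, 10536, 10362, 10418, 10578, 16352),
  name       = c("ZBTB33", "TAF7_(SQ-8)", "Pol2-4H8",
                 "MafF_(M8194)", "ZBTB33", "CTCF"),
  cumsum     = c(10089, 20221, 30354, 40502, 50884, 67016),
  bin        = rep(1L, 6),
  stringsAsFactors = FALSE
)

# split() returns a list with one data frame per distinct bin value
by.bin <- split(head.tf7, head.tf7$bin)
print(names(by.bin))
print(nrow(by.bin[["1"]]))
```

With only bin 1 present in these six rows, the list has a single element; on the full data there would be one element per bin.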



