[R] function similar to ddply? + calculations based on previous row
Nerak
nerak.t at hotmail.com
Wed Feb 15 17:02:44 CET 2012
Hi all,
I was wondering if there is a function similar to ddply that splits a
data frame, applies a function to each row and returns a data frame. I
know ddply, but it isn't useful in this situation.
I have a data frame with values for each day (rows) for different objects
(columns), covering several years. I want to do calculations using only
the data of one year at a time. With ddply you can pass Year as the second
argument to split the data frame into years, but the function you supply is
applied to that whole piece, so you get only one output value per year (for
example, the sum of all the values in that column belonging to that year). I
want to calculate a new value for each day of that year (which would be
possible with apply if I had data for only one year). I found another way
to do this with a for loop (for (y in Year[1]:Year[length(Year)])) and
test2.dataframe<-test.dataframe[which(year==y)] to select the rows belonging
to that year, on which I then do the calculations. The problem is that for
loops take a lot of time to run and I'm trying to avoid them whenever
possible. (Almost reproducible example script below.)
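One possible way to get a value for every day while still computing within
each year is base R's ave(), sketched here under assumptions. The data frame
`daily` and its columns `value`, `cum_value` and `share` are hypothetical
names used only for illustration; they are not part of the script in this
post.

```r
# Hypothetical daily data: one row per day, a Year column, one value column.
daily <- data.frame(
  Year  = c(1980, 1980, 1980, 1981, 1981, 1981),
  value = c(1, 2, 3, 10, 20, 30)
)

# ave() applies FUN separately within each Year group and returns a vector
# as long as the input, so every day (row) gets its own result.
daily$cum_value <- ave(daily$value, daily$Year, FUN = cumsum)

# Per-row share of the yearly total: the group sum is repeated on each row.
daily$share <- daily$value / ave(daily$value, daily$Year, FUN = sum)

print(daily)
```

With plyr, passing transform as the function to ddply should give a
comparable per-row result, since transform returns one row per input row
rather than one summary per group.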
I'm also wondering whether it's possible to refer to a value in the row
below from another data vector or data frame. The line I mean in the script
below is this one (it is the reason the script doesn't work, because n is
not defined):
test.number$numberb[y-Year[1]+1]<-length(which(test.starty==1 &
test.f[(n+1)]== 1 ))
For a given row n, I want the value of test.starty in that same row to be 1
and the value of test.f in the row below (n+1) to be 1. How can I do this
without looping (which I want to learn to avoid as much as possible)? I
already searched the R-help forum and found:
http://r.789695.n4.nabble.com/How-to-calc-ratios-base-on-current-and-previous-row-td2341407.html
My n+1 refers to the original values, so there should be a solution
without looping, but I don't understand how I should index.
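The "row below" and "row above" can be expressed by shifting a vector with
negative indexing instead of using an explicit n. The sketch below assumes
two plain numeric vectors for a single object within a single year; the
names `starty`, `f`, `f_next` and `f_prev` and their values are made up for
illustration.

```r
# Hypothetical vectors for one object within one year.
starty <- c(0, 1, 0, 0, 1, 0)
f      <- c(0, 0, 1, 0, 0, 0)

n_rows <- length(f)

# Value of f on the row below (n + 1); the last row has no row below, so NA.
f_next <- c(f[-1], NA)

# Value of f on the row above (n - 1); the first row has no row above, so NA.
f_prev <- c(NA, f[-n_rows])

# Count rows n where starty[n] == 1 and f[n + 1] == 1.
count_next <- sum(starty == 1 & f_next == 1, na.rm = TRUE)

# Same idea for the previous row: starty[n] == 1 and f[n - 1] == 1.
count_prev <- sum(starty == 1 & f_prev == 1, na.rm = TRUE)
```

Padding with NA keeps the shifted vector the same length as the original, so
the comparison lines up row by row.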
To illustrate what I mean, here is a loop (from a different script) that
solves this kind of problem:
test[1,]<-ifelse(AAA[1,]>1,1,0)
for (t in 2:10)
{
  # store the result in row t instead of overwriting test on every pass
  test[t,] <- ifelse(AAA[t,]>1 & AAA[t-1,]==0,1,0)
}
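The row-by-row loop above can probably be replaced by comparing the whole
matrix with a copy of itself shifted down by one row. This is only a sketch:
the contents of `AAA` are invented here (the post does not show them), and
`prev` is a hypothetical helper name.

```r
# Hypothetical input: rows are time steps, columns are objects.
AAA <- matrix(c(2, 0, 3, 0, 5,
                0, 4, 4, 0, 0), ncol = 2)

# Matrix holding the previous row's values; row 1 has no predecessor (NA).
prev <- rbind(NA, AAA[-nrow(AAA), , drop = FALSE])

# All rows at once: 1 where the current value exceeds 1 and the previous
# value was 0, otherwise 0 (row 1 is handled separately below).
test <- ifelse(AAA > 1 & prev == 0, 1, 0)

# Row 1 follows the special rule of the loop version (no previous row).
test[1, ] <- ifelse(AAA[1, ] > 1, 1, 0)

print(test)
```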
Below you can see how I did it with the for loop and what I want to create:
Year<-data.frame(Date=c(1980,1980,1980,1980,1981,1981,1981,1981,1982,1982,1982,1982,1983,1983,1983,1983))
test.b<-data.frame(C=c(0,0,0,0,5,2,0,0,0,15,12,10,6,0,0,0),B=c(0,0,0,0,9,6,2,0,0,24,20,16,2,0,0,0),F=c(0,0,0,0,6,5,1,0,0,18,16,12,10,5,1,0))
test.start<-data.frame(C=c(0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0),B=c(0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0),F=c(0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0))
test.2b<-test.b>1
test.number<-data.frame(c(1980:1983))
for (l in seq_len(ncol(test.b)))   # l indexes the object columns (C, B, F)
{
  for (y in 1980:1983)
  {
    test.f<-test.2b[which(Year == y),l]
    test.starty<-test.start[which(Year ==y),l]
    # n is not defined anywhere; this is the line that fails
    test.number$numberb[y-Year$Date[1]+1]<-length(which(test.starty==1 &
    test.f[(n-1)]== 1 ))
  }
  test.number[,l+1]<-cbind(test.number$numberb)
}
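A loop-free version of the script could look roughly like the sketch below.
It rests on one interpretation that the post leaves open: for each object and
year, count the rows where test.start is 1 and the row below (n+1, within
the same year) has test.b greater than 1. The names `year`, `above_one`,
`started`, `next_above`, `same_year_next`, `hit` and `counts` are introduced
only for this illustration.

```r
# Data as posted, with the year kept as a plain vector.
year <- rep(1980:1983, each = 4)
test.b <- data.frame(
  C = c(0, 0, 0, 0, 5, 2, 0, 0, 0, 15, 12, 10, 6, 0, 0, 0),
  B = c(0, 0, 0, 0, 9, 6, 2, 0, 0, 24, 20, 16, 2, 0, 0, 0),
  F = c(0, 0, 0, 0, 6, 5, 1, 0, 0, 18, 16, 12, 10, 5, 1, 0)
)
test.start <- data.frame(
  C = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0),
  B = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0),
  F = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0)
)

above_one <- as.matrix(test.b) > 1       # same as test.2b in the script
started   <- as.matrix(test.start) == 1

# Shift rows up by one so that row n holds the value of row n + 1.
next_above <- rbind(above_one[-1, , drop = FALSE], NA)

# A row only has a "row below" if the next row belongs to the same year;
# the last row never does, which also neutralises the NA padding above.
same_year_next <- c(year[-1] == year[-length(year)], FALSE)

# TRUE where the start flag is set and the next row (same year) exceeds 1.
hit <- started & next_above & same_year_next

# Sum the hits per year for every object column at once.
counts <- rowsum(hit * 1, group = year)
print(counts)
```

The result is a matrix with one row per year and one column per object.
If the previous row (n-1) is what is needed instead, the shift would go in
the other direction, with the NA row placed first.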
If someone knows a way to get rid of the loops, please let me know, because
I want to make this script as fast as possible for larger datasets. I'm
working through the apply family to find solutions, but it's a hard problem.
Many thanks in advance,
Kind regards,
Nerak
--
View this message in context: http://r.789695.n4.nabble.com/function-similar-to-ddply-calculations-based-on-previous-row-tp4390925p4390925.html
Sent from the R help mailing list archive at Nabble.com.