[R] How to write efficient R code

Liaw, Andy andy_liaw at merck.com
Wed Feb 18 00:35:21 CET 2004


I'm guessing what Sebastian wants is to do the differencing by a stratifying
variable such as ID; e.g., the data may look like:

df <- as.data.frame(cbind(ID=rep(1:5, each=3), x=matrix(rnorm(45), 15, 3)))

So using Tom's solution, one would do something like:

mdiff <- function(x) x[-1,] - x[-nrow(x),]
sapply(split(df[,-1], df[,1]), mdiff)

There could well be more efficient ways!
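
One possible alternative, offered only as a sketch: ave() applies a function within each level of a grouping variable and returns the results in the original row order, so the per-ID differences line up with the input rows and each group starts with an NA. The helper name lagdiff, the "d" column prefix and the seeded example data are illustrative assumptions, not part of the original question.

```r
# Illustrative data shaped like the example above: an ID column plus
# three numeric columns (data.frame() names them X1, X2, X3).
set.seed(1)
df <- data.frame(ID = rep(1:5, each = 3), matrix(rnorm(45), 15, 3))

# Lagged difference of one vector, padded with NA so the length is unchanged.
lagdiff <- function(v) c(NA, diff(v))

# ave() applies lagdiff() within each level of df$ID and puts the results
# back in the original row order; sapply() repeats that for every column.
diffs <- sapply(df[, -1], function(col) ave(col, df$ID, FUN = lagdiff))

# Attach the differences as new columns (the "d" prefix is arbitrary).
colnames(diffs) <- paste0("d", colnames(diffs))
out <- cbind(df, diffs)
```

Because ave() restores the original ordering, this should not require the rows to be sorted by ID, although differences within a group still follow the order in which that group's rows appear.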

Andy

> From: Tom Blackwell
> 
> Sebastian  -
> 
> For successive differences within a single column 'x'
> 
> differences <- c(NA, diff(x)),
> 
> same as
> 
> differences <- c(NA, x[-1] - x[-length(x)]).
> 
> See  help("diff"), help("Subscript").  The second version also
> works when  x  is a matrix or a data frame, except now the result
> is a matrix or data frame of the same size.
> 
> x <- data.frame(matrix(rnorm(1e+5), 1e+4))
> dim(x)               # 10000    10
> differences <- rbind(rep(NA, 10), x[-1, ] - x[-dim(x)[1], ])
> dim(differences)     # 10000    10
> 
> However, you write "I need to do this for all the subsets of data
> created by the numbers in one of the columns of the data frame ..."
> and I'm not sure I understand how an 'id' column would create many
> subsets of the data.  So the simple examples above may not answer
> the question you are asking.
> 
> -  tom blackwell  -  u michigan medical school  -  ann arbor  -
> 
> On Tue, 17 Feb 2004, Sebastian Luque wrote:
> 
> > Hi,
> >
> > In fact, I've been trying to get rid of loops in my code for more
> > than a week now, but nothing I try seems to work. It sounds as if
> > you have lots of experience with loops, so I would appreciate any
> > pointers you may have on the following.
> >
> > I want to create a column showing the difference between the ith
> > row and i-1. Of course, the first row won't have any value in it,
> > because there is nothing above it to subtract. This is fairly
> > easy to do with a simple loop, but I need to do this for all the
> > subsets of data created by the numbers in one of the columns of
> > the data frame (say, an id column). I would greatly appreciate
> > any idea you may have on this.
> >
> > Thanks in advance.
> >
> > Best regards,
> > Sebastian
> > --
> >   Sebastian Luque
> >
> > sluque at mun.ca
> >
> >
> 

