[R] Optimize for loop / find last record for each person
Andrew Ziem
ahz001 at gmail.com
Fri Feb 27 23:47:24 CET 2009
On Fri, Feb 27, 2009 at 2:10 PM, William Dunlap <wdunlap at tibco.com> wrote:
> Andrew, it makes it easier to help if you supply a typical
> input and expected output along with your code. I tried
> your code with the following input:
I'll be careful to avoid these mistakes. Also, I should not have used
the name of an existing R function (history) as a variable name, and I
should have mentioned that the data is sorted with the most recent
dates first. Talk about a bad day! :)
I originally omitted this code, which runs before the for loop:
history["order"] <- NA
history[1,"order"] = 1
Here's a sample data set:
# One row per record; rows are sorted by person_id, then date_ descending.
history_ <- data.frame(person_id = c(1, 2, 2),
                       date_ = c("2009-01-01", "2009-02-03", "2009-02-02"),
                       x = c(0.01, 0.05, 0.06))
history_
Jorge's suggestion[1] works for me, and it seems much faster. I
simply adapted it by replacing Jorge's variable x with a sequential
identifier already in the database.
[1] https://stat.ethz.ch/pipermail/r-help/2009-February/189981.html
> The following function, f2, does what I think you are saying
> you want. It sorts the data by person_id, breaking ties with
> date, and then selects the rows where the person_id entry does
My data is already sorted by the SQL database, like this:
ORDER BY person_id, date_ DESC
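With that ordering, the most recent record for each person is simply
the first row of each person_id block, so no loop is needed. A minimal
sketch, assuming the sample data above (the name "latest" is only for
illustration, and this is not necessarily what Jorge's suggestion does):

```r
# Sample data in the same shape as the example in this message, already
# sorted by person_id, then date_ descending (most recent first).
history_ <- data.frame(
  person_id = c(1, 2, 2),
  date_ = c("2009-01-01", "2009-02-03", "2009-02-02"),
  x = c(0.01, 0.05, 0.06)
)

# duplicated() flags every repeat of a person_id after its first
# occurrence, so negating it keeps only the first row per person.
# Because of the sort order, that row is the person's latest record.
latest <- history_[!duplicated(history_$person_id), ]
latest
```

This relies entirely on the rows arriving in ORDER BY person_id, date_ DESC
order; if the order is not guaranteed, the data would need to be sorted
in R first.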
Thanks everyone for responding and expanding my knowledge of R!
Best regards,
Andrew