[R] Trying to get the prior value of a record from a data.frame . . . data.frame

Bert Gunter bgunter@4567 @end|ng |rom gm@||@com
Sat Nov 30 21:21:01 CET 2024


I assume that the responses that John already received to his recent
post met his needs. However, when I read it, I had a slightly
different interpretation. So feel free to ignore the rest of this post
if you like, but here's my interpretation and a simple solution to it.

An example to help explain:

set.seed(453)
df <- data.frame(
   group = sample(letters[1:4], 30, replace = TRUE),
   gender = sample(c("M", "F", "NB"), 30, replace = TRUE),
   value = 1:30
)

df is a data frame with 30 records/rows of 3 columns giving a group
identifier, gender, and value for each record. I kept the values
artificially simple to (I hope) make it easier to understand the
problem and my solution.

The problem: create a new column, prev.value, that gives the value of
the previous record that has the same group and gender as the current
record, if such a record exists, or NA if no earlier record has that
group and gender combination.

I think this is a slightly more complex task than John's original
request, but it is actually straightforward even in base R, and
undoubtedly also with similar functionality in the Tidyverse or other
packages.
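
To make the problem statement concrete, the rule can be spelled out as a deliberately naive loop on a tiny data frame. This is only a sketch to pin down the definition, not the solution proposed in this post: the helper name prev_value_loop and the toy data are made up for illustration.

```r
## Hypothetical helper (name invented for this illustration): for each row,
## find the most recent earlier row with the same group and gender.
prev_value_loop <- function(group, gender, value) {
  out <- value
  out[] <- NA                      # all-NA vector of the same type as value
  for (i in seq_along(value)) {
    before <- seq_len(i - 1)       # indices of the rows above row i
    earlier <- which(group[before] == group[i] & gender[before] == gender[i])
    if (length(earlier) > 0) out[i] <- value[max(earlier)]
  }
  out
}

## Toy data (invented): 5 rows, value is just the row number
toy <- data.frame(
  group  = c("a", "b", "a", "a", "b"),
  gender = c("M", "M", "F", "M", "M"),
  value  = 1:5
)
toy$prev.value <- with(toy, prev_value_loop(group, gender, value))
toy
```

In this toy example, rows 1 to 3 are each the first of their combination and get NA, row 4 picks up the value from row 1 (both a/M), and row 5 picks up the value from row 2 (both b/M).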

Here is my solution (using the native R pipe, "|>", and its "_"
placeholder, which require R >= 4.2):

df$prev.value <- with(df, {
   ## a simple 'hash' to identify the group/gender combinations:
   f <- paste0(group, gender)
   ## "shift" the values within each combination defined by f,
   ## then reassemble them in the original row order according to f:
   f |>
      tapply(value, INDEX = _, FUN = \(x) c(NA, head(x, -1))) |>
      unsplit(f)
})
df
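
As a cross-check, the same per-group shift can also be sketched with base R's ave(), which takes the grouping variables separately and so needs no pasted key. (With other data a paste0() key could in principle collide, e.g. "a" with "bc" versus "ab" with "c"; that cannot happen with the groups and genders used here.) The names via_tapply and via_ave below are made up for this comparison.

```r
## Recreate the example data from this post
set.seed(453)
df <- data.frame(
  group  = sample(letters[1:4], 30, replace = TRUE),
  gender = sample(c("M", "F", "NB"), 30, replace = TRUE),
  value  = 1:30
)

## Shift a vector down by one position, padding with NA at the top
lag1 <- \(x) c(NA, head(x, -1))

## The tapply()/unsplit() approach from the post
via_tapply <- with(df, {
  f <- paste0(group, gender)
  unsplit(tapply(value, INDEX = f, FUN = lag1), f)
})

## Alternative sketch: ave() applies FUN within each group/gender
## combination and returns the results in the original row order
via_ave <- with(df, ave(value, group, gender, FUN = lag1))

## Compare the two results
all.equal(via_ave, via_tapply)
```

Because value is simply the row number in this example, each non-NA entry of prev.value is also the row index of the matching earlier record, which makes the result easy to verify by eye.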


Cheers,
Bert


On Thu, Nov 28, 2024 at 5:25 PM Sorkin, John <jsorkin using som.umaryland.edu> wrote:
>
> I need to write code that will give me the previous value of a record from a data.frame. I have written the following code using the shift function from data.table. It does not work. I hope someone can help me correct the code.
> ###########################
> # Try to understand shift #
> ###########################
> if(!require(data.table)) install.packages("data.table")
> library(data.table)
> # Create data
> x <- data.frame(Id=rep(1:10),num=rep(11:20))
> cat("This is the input data.frame used in the code below","\n")
> x
>
> for (i in 1:10) {
>   cat("x[i,num]",x[i,"num"],"\n")
>   # Get previous value of x[i,"num"]
>   zoop<-shift(x[i,"num"], n=1L, type="lag")
>   cat("Previous value of x[,num]=",zoop,"\n")
>   }
> ###############################
> # END Try to understand shift #
> ###############################
>
> Thank you,
> John
>
>
> John David Sorkin M.D., Ph.D.
> Professor of Medicine, University of Maryland School of Medicine;
> Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center;
> PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center;
> Senior Statistician University of Maryland Center for Vascular Research;
>
> Division of Gerontology and Palliative Care,
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> Cell phone 443-418-5382


