[R] R Processing dataframe by group - equivalent to SAS by group processing with a first. and retain statements

Sorkin, John
Wed Nov 27 17:30:34 CET 2024


I am an old, long-time SAS programmer. I need to produce R code that processes a dataframe in a way equivalent to SAS by-group processing with a by statement, an if first.ID statement, and a retain statement.

I want to take data (olddata) that looks like this:
ID	Day
1	1
1	1
1	2
1	2
1	3
1	3
1	4
1	4
1	5
1	5
2	5
2	5
2	5
2	6
2	6
2	6
3	10
3	10

and make it look like this:
(Within each ID, I am copying the first value of Day into a new variable, FirstDay, and propagating the FirstDay value through all rows that have the same ID.)

ID	Day	FirstDay
1	1	1
1	1	1
1	2	1
1	2	1
1	3	1
1	3	1
1	4	1
1	4	1
1	5	1
1	5	1
2	5	5
2	5	5
2	5	5
2	6	5
2	6	5
2	6	5
3	10	10
3	10	10

SAS code that can do this is:

proc sort data=olddata;
  by ID Day;
run;

data newdata;
  retain FirstDay;
  set olddata;
  by ID;
  if first.ID then FirstDay=Day;
run;

I have NO idea how to do this in R (so I can't post test code), but below I have R code that creates olddata:

ID <- c(rep(1,10),rep(2,6),rep(3,2))
# Named Day (not date) so the column matches the tables above
Day <- c(rep(1,2),rep(2,2),rep(3,2),rep(4,2),rep(5,2),
         rep(5,3),rep(6,3),rep(10,2))
Day
olddata <- data.frame(ID=ID,Day=Day)
olddata

Any suggestions on how to do this would be appreciated. I have worked on this for more than 12 hours, and despite multiple web searches I have gotten nowhere.

Thanks
John




John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; 
PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382
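
For comparison, one possible base R counterpart of the SAS step in the post is sketched here. It assumes the day column is named Day (as in the tables, not date), uses order() in place of proc sort, and uses ave() to take the first Day within each ID and repeat it across that ID's rows. The names newdata and FirstDay simply mirror the SAS code. This is only one of several ways to do it, not a definitive answer.

```r
# Rebuild the example data. The column name Day is an assumption,
# chosen to match the tables in the post.
olddata <- data.frame(
  ID  = c(rep(1, 10), rep(2, 6), rep(3, 2)),
  Day = c(rep(1:5, each = 2), rep(5, 3), rep(6, 3), rep(10, 2))
)

# Counterpart of: proc sort data=olddata; by ID Day;
newdata <- olddata[order(olddata$ID, olddata$Day), ]

# Counterpart of: retain FirstDay; if first.ID then FirstDay=Day;
# ave() splits Day by ID, applies FUN to each group, and recycles the
# single value returned by FUN across every row of that group.
newdata$FirstDay <- ave(newdata$Day, newdata$ID, FUN = function(x) x[1])

newdata
```

Because the rows are sorted by ID and Day first, the first Day within each ID is also the smallest Day for that ID, which is what the SAS data step picks up at first.ID.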




