[R] from list to dataframe

Stephen D. Weigand weigand.stephen at charter.net
Thu May 19 05:13:27 CEST 2005


On May 18, 2005, at 5:39 PM, sms13+ at pitt.edu wrote:

> I was wondering if someone can help me figure out the following:
> I have two patient datasets, ds1 and ds2.  ds1 has fields "patid", 
> "date", and "lab1".  ds2 has "patid", "date", and "lab2".  I want to 
> find all the patids that have at least 2 dated records for each lab.  
> I started by splitting each dataset by patid, to create ds1.list and 
> ds2.list.  Then I did some processing (with sapply) to each list to 
> get the lengths of each patient list item.  Then I kind of lost my way 
> and things got messy as I tried to extract just the patids of those 
> with lengths >= 2, convert them to dataframes (which I didn't have 
> much success with), and then merge the two dataframes to get a vector 
> of the desired patids.  Any help would be much appreciated.
>
> Thanks,
> Steven

Steven,

I might not exactly understand your problem, but for
what it's worth, you could try to identify the patients
in ds1 who appear at least twice and identify the patients
in ds2 who appear at least twice via

ptid1 <- c("A", "A", "B", "C", "D", "D")
keep1 <- names(table(ptid1))[table(ptid1) >= 2]
keep1

or, if the patient id is numeric,

ptid1 <- c(1, 1, 2, 3, 4, 4)
keep1 <- as.numeric(names(table(ptid1))[table(ptid1) >= 2])
keep1
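
The as.numeric() call is needed because names() of a table is always
a character vector, even when the ids were numeric to begin with. A quick
check on the same toy vector (the intermediate name tab is only for
illustration):

```r
ptid1 <- c(1, 1, 2, 3, 4, 4)
tab <- table(ptid1)                        # counts per patient id
class(names(tab))                          # "character"
keep1 <- as.numeric(names(tab)[tab >= 2])  # ids seen at least twice
keep1                                      # 1 4
```

Storing the table once also avoids computing it twice.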

then, with keep2 built the same way from the ids in ds2, subset the
respective data sets via

ds1.keep <- subset(ds1, patid %in% intersect(keep1, keep2))
ds2.keep <- subset(ds2, patid %in% intersect(keep1, keep2))

then use merge() to combine ds1.keep and ds2.keep.
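
Put together, the steps might look like the sketch below. The data are made
up for illustration; only the column names (patid, date, lab1, lab2) come
from the question. Merging on both patid and date with all = TRUE is an
assumption about what the combined table should look like, so adjust the
by and all arguments to suit.

```r
# Hypothetical toy data; column names follow the original question.
ds1 <- data.frame(patid = c("A", "A", "B", "C", "D", "D"),
                  date  = as.Date("2005-01-01") + 0:5,
                  lab1  = c(1.1, 1.2, 2.0, 3.5, 4.1, 4.4))
ds2 <- data.frame(patid = c("A", "A", "B", "B", "D"),
                  date  = as.Date("2005-01-01") + 0:4,
                  lab2  = c(10, 11, 20, 21, 40))

# Patients with at least two records in each data set.
tab1  <- table(ds1$patid)
tab2  <- table(ds2$patid)
keep1 <- names(tab1)[tab1 >= 2]   # "A" "D"
keep2 <- names(tab2)[tab2 >= 2]   # "A" "B"

# Patients meeting the criterion for both labs.
both <- intersect(keep1, keep2)   # "A"

ds1.keep <- subset(ds1, patid %in% both)
ds2.keep <- subset(ds2, patid %in% both)

# Assumed merge keys: patient and date; all = TRUE keeps unmatched dates.
merged <- merge(ds1.keep, ds2.keep, by = c("patid", "date"), all = TRUE)
merged
```

If only the vector of patient ids is wanted, both already holds it and the
subset() and merge() steps can be skipped.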

Good luck!

Stephen



