[R] Subsetting isolating a group of values in a group of variables
arun
smartpink111 at yahoo.com
Sat Sep 7 22:12:33 CEST 2013
Hi,
The expected output is not clear.
dat1<- read.table(text="ID diag1 diag2 diag3 proc1 proc2 proc3
1 k23 i269 j123 u123 u456 u123
2 k69 i80 u456 z456 z123 z456
3 l91 i801 g678 u456 u123 u123
4 i80 i90 h983 z123 z456 z456",sep="",header=TRUE,stringsAsFactors=FALSE)
vec1<- c("i80","i90","l91")
subset(dat1,diag1%in%vec1|diag2%in%vec1|diag3%in% vec1)
# ID diag1 diag2 diag3 proc1 proc2 proc3
#2 2 k69 i80 u456 z456 z123 z456
#3 3 l91 i801 g678 u456 u123 u123
#4 4 i80 i90 h983 z123 z456 z456
##Creating another data frame with codes and diagnosis
dat2<-data.frame(code=unique(unlist(dat1[,2:4])),diag=c(rep("Broken finger",2),rep("Broken toe",2),rep("Broken legs",2),"Broken toe",rep("Broken foot",2),rep("Broken rib",2)),stringsAsFactors=FALSE)
lst1<- lapply(split(dat2,dat2$diag), function(x) {x1<- x$code;x2<- subset(dat1,diag1%in%x1|diag2%in%x1|diag3%in%x1);cbind(x2,Diag=x$diag)})
lst1
#$`Broken finger`
# ID diag1 diag2 diag3 proc1 proc2 proc3 Diag
#1 1 k23 i269 j123 u123 u456 u123 Broken finger
#2 2 k69 i80 u456 z456 z123 z456 Broken finger
#
#$`Broken foot`
# ID diag1 diag2 diag3 proc1 proc2 proc3 Diag
#1 1 k23 i269 j123 u123 u456 u123 Broken foot
#2 2 k69 i80 u456 z456 z123 z456 Broken foot
#
#$`Broken legs`
# ID diag1 diag2 diag3 proc1 proc2 proc3 Diag
#1 1 k23 i269 j123 u123 u456 u123 Broken legs
#3 3 l91 i801 g678 u456 u123 u123 Broken legs
#$`Broken rib`
# ID diag1 diag2 diag3 proc1 proc2 proc3 Diag
#3 3 l91 i801 g678 u456 u123 u123 Broken rib
#4 4 i80 i90 h983 z123 z456 z456 Broken rib
#$`Broken toe`
# ID diag1 diag2 diag3 proc1 proc2 proc3 Diag
#2 2 k69 i80 u456 z456 z123 z456 Broken toe
#3 3 l91 i801 g678 u456 u123 u123 Broken toe
#4 4 i80 i90 h983 z123 z456 z456 Broken toe
A.K.
Hello.
I have date frame structured like this:
ID diag1 diag2 diag3 proc1 proc2 proc3
1 k23 i269 j123 u123 u456 u123
2 k69 i80 u456 z456 z123 z456
3 l91 i801 g678 u456 u123 u123
4 i80 i90 h983 z123 z456 z456
Each observation has a group of diagnostics codes(diag) and procedure codes(proc).
A single diagnosis maybe be described by more than one code eg broken toe maybe coded for by i80,i90,l91 or more.
My aim to subset all rows with any of the codes representing a
single diagnosis. So i would like to use multiple values (i80,i90,l91=
broken toe) applied to specific columns, ie diag1,2 and 3 to isolate
those rows which contain any of the specified codes.
Your help would be greatly appreciated.
More information about the R-help
mailing list