[R] question on more efficient data block-match processing
Mckinstry, Craig
Craig.Mckinstry at cambiahealth.com
Thu Mar 6 20:23:03 CET 2014
I have a medical insurance claims data file divided into blocks by member, with multiple lines per member. I am processing these into a model matrix with one line per member. Member block sizes vary from 1 to 50+ lines. I am matching attributes in the claims data to columns in the model matrix and have been getting by with a for loop, but for large files it takes much too long. Is there a vectorized or apply-based method to do this more efficiently?
input data:
member code
1 A
1 C
1 F
2 B
2 E
3 D
3 A
3 B
3 D
4 G
4 A
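For anyone who wants to reproduce this, the listing above could be entered as a data frame along the following lines. This is only a sketch: the object name dat and the column names member and code are assumptions taken from the header of the listing, not from the real claims file.

```r
# Toy copy of the input listing; 'dat', 'member' and 'code' are assumed names.
dat <- data.frame(
  member = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4),
  code   = c("A", "C", "F", "B", "E", "D", "A", "B", "D", "G", "A"),
  stringsAsFactors = FALSE
)
str(dat)
```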
code.list <- c("A","B","C","D","E")
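mbr.list <- unique(dat$member) #ASSUMED SETUP: ONE ROW PER MEMBER, COLUMN NAMED AS IN THE LISTING
n.mbr <- length(mbr.list)
matrix.mat <- matrix(0, n.mbr, length(code.list), dimnames=list(as.character(mbr.list), code.list))
for(i in seq_len(n.mbr)){
mbr.i <- dat[dat$member==mbr.list[i],] #EXTRACT BLOCK OF MEMBER CLAIMS
hits <- match(mbr.i$code,code.list) #NA FOR CODES NOT IN code.list (E.G. F, G)
matrix.mat[i,unique(hits[!is.na(hits)])] <- 1
}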
output model.matrix
Member A B C D E
1 1 0 1 0 0
2 0 1 0 0 1
3 1 1 0 1 0
4 1 0 0 0 0
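One possible loop-free way to get from the input listing to the matrix above is to compute a row position and a column position for every claim line with match(), and then fill the matrix in a single step with two-column matrix indexing. The following is a sketch under assumptions, not a tested solution for the real file: dat, member, code, mbr.list and model.mat are illustrative names, and the data are just the toy listing from this post.

```r
# Toy copy of the input listing; all object and column names here are assumptions.
dat <- data.frame(
  member = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4),
  code   = c("A", "C", "F", "B", "E", "D", "A", "B", "D", "G", "A"),
  stringsAsFactors = FALSE
)
code.list <- c("A", "B", "C", "D", "E")
mbr.list  <- unique(dat$member)

# Row and column position of every claim line, each computed in one vectorized call.
row.idx <- match(dat$member, mbr.list)
col.idx <- match(dat$code, code.list)   # NA for codes outside code.list (F, G)
keep    <- !is.na(col.idx)

# Start from all zeros, then set every (member, code) cell that occurs to 1.
model.mat <- matrix(0, nrow = length(mbr.list), ncol = length(code.list),
                    dimnames = list(as.character(mbr.list), code.list))
model.mat[cbind(row.idx[keep], col.idx[keep])] <- 1
print(model.mat)

# Cross-check with table(), which counts occurrences and drops the NA codes.
tab <- table(dat$member, factor(dat$code, levels = code.list))
stopifnot(all((tab > 0) == (model.mat == 1)))
```

Repeated codes within a member (D appears twice for member 3) simply set the same cell twice, so no unique() call is needed. The table() line is an alternative that may be enough on its own when the members sort in the desired row order.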
Craig McKinstry
100 Market, 6th floor
Office: 503-225-6878 | Cell: 509-778-2438