[R] replacing then summing by values from another dataframe
Gerrit Eichner
Gerrit.Eichner at math.uni-giessen.de
Tue Aug 11 09:05:47 CEST 2015
Hello, Xianming,
I have changed your (particular) data structure: use matrices, because you
have only numeric scores and effects; use NA instead of -1 as the missing
value (as usual); don't use columns for ids or row/column names (except
for ease of reading the data structures); and increase the score values
in dat1 by 1 to obtain valid column indices for dat2. Finally, loop (!)
row-wise through the matrix dat1 and construct an index matrix (!) to
index dat2 (and sum up the indexed elements). Hope this does what you
want. (See below.)
The same remark regarding elegance/efficiency applies as for Petr's
solution (but w/o an additional package ;-)).
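If the data already sit in data frames like those in the original post, the restructuring described above could be sketched roughly as follows. The names df1, df2, scores and effects are illustrative only and do not appear in the thread; the sketch assumes -1 is the only missing-value marker.

```r
# Data frames as in the original post (df1, df2 are illustrative names).
df1 <- data.frame(pid = paste0("C", 1:5),
                  m1 = c(2, 2, 1, -1, 0),
                  m2 = c(1, 0, 1, -1, 1),
                  m3 = c(0, 1, 1, -1, 0))
df2 <- data.frame(mid = paste0("m", 1:3),
                  "0" = c(-19.5482, -.512, -.492),
                  "1" = c(.007, 3.241, -2.256),
                  "2" = c(1.223, -4.490, 1.779),
                  check.names = FALSE)

# Drop the id columns; keep the ids as row names for readability only.
scores <- as.matrix(df1[, -1])
rownames(scores) <- as.character(df1$pid)
scores[scores == -1] <- NA      # assumption: -1 always means "missing"

effects <- as.matrix(df2[, -1])
rownames(effects) <- as.character(df2$mid)
```

Both results are plain numeric matrices, so scores + 1 can be used directly as column positions in effects.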
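# Scores: one row per individual, one column per measurement (NA = missing).
dat1 <- cbind( c(2, 2, 1, NA, 0),
               c(1, 0, 1, NA, 1),
               c(0, 1, 1, NA, 0))
# dimnames( dat1) <- list( paste0( 'C', 1:5), paste0( "m", 1:3))

# Effects: one row per measurement, one column per level (0, 1, 2).
dat2 <- cbind( c(-19.5482, -.512, -.492),
               c(.007, 3.241, -2.256),
               c(1.223, -4.490, 1.779))
# rownames( dat2) <- paste0 ('m', 1:3)

# For each row of dat1, pick one element per row of dat2 via an index
# matrix (row = measurement, column = score + 1) and sum the picks.
apply( dat1 + 1, 1,
       function( idx, d2)
        sum( d2[ cbind( seq( nrow( d2)), idx)]),
       d2 = dat2
      )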
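If the row-wise loop turns out to be too slow for larger data, the same index-matrix idea can probably be applied to all cells at once. This is only a sketch of a variant, not the code posted in this thread; idx, picked and totals are illustrative names, and it relies on R's rule that an NA in an index matrix yields NA in the result.

```r
dat1 <- cbind(c(2, 2, 1, NA, 0),
              c(1, 0, 1, NA, 1),
              c(0, 1, 1, NA, 0))
dat2 <- cbind(c(-19.5482, -.512, -.492),
              c(.007, 3.241, -2.256),
              c(1.223, -4.490, 1.779))

# One index row per cell of dat1: (measurement number, score + 1).
# col(dat1) gives the column number of each cell, i.e. the measurement.
idx <- cbind(as.vector(col(dat1)), as.vector(dat1 + 1))

# dat2[idx] is returned in the column-major order of dat1, so it can be
# reshaped to the dimensions of dat1 and summed per individual.
picked <- matrix(dat2[idx], nrow = nrow(dat1))
totals <- rowSums(picked)
```

For the first individual this should reproduce the 3.972 from the original question; the all-missing fourth row stays NA unless na.rm = TRUE is passed to rowSums().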
Hth -- Gerrit
On Tue, 11 Aug 2015, Xianming Wei wrote:
> [I might have sent the following request to a wrong email address - 'r-help-request at r-project.org']
>
> Hi,
>
>
>
> I have two data frames, dat1 and dat2.
>
>
>
> dat1 <- data.frame(pid = paste('C', 1:5, sep = ''),
>                    m1 = c(2, 2, 1, -1, 0),
>                    m2 = c(1, 0, 1, -1, 1),
>                    m3 = c(0, 1, 1, -1, 0))
>
> dat2 <- data.frame(mid = paste('m', 1:3, sep = ''),
>                    '0' = c(-19.5482, -.512, -.492),
>                    '1' = c(.007, 3.241, -2.256),
>                    '2' = c(1.223, -4.490, 1.779))
> names(dat2)[-1] <- c('0', '1', '2')
>
>
>
> dat1 contains individuals with scores on three measurements (-1 represents missing), and dat2 contains the effects of the different levels of the three measurements. What I'd like to do is to sum the effects of the three measurements based on the level effects. So for C1 I want to get the values of dat2 for m1 at level 2 = 1.223, m2 at level 1 = 3.241 and m3 at level 0 = -0.492, and sum them up as 3.972.
>
>
>
> I can only think of a loop to do that at the moment. Because the two actual datasets have much higher dimensions, I need help to come up with an efficient / elegant approach.
>
>
>
> Any help is much appreciated.
>
>
>
>
>
> Regards,
> Xianming