[R] Selecting cases from matrices stored in lists

mdvaan mathijsdevaan at gmail.com
Mon Aug 22 15:45:23 CEST 2011


Jean V Adams wrote:
> 
>> [R] Selecting cases from matrices stored in lists
>> mdvaan 
>> to:
>> r-help
>> 08/22/2011 07:24 AM
>> 
>> Hi,
>> 
>> I have two lists (c and h - see below) containing matrices with similar
>> cases but different values. I want to split these matrices into multiple
>> matrices based on the values in h. So, I did the following:
>> 
>> years<-c(1997:1999) 
>> for (t in 1:length(years)) 
>>         { 
>>         year=as.character(years[t]) 
>>         h[[year]]<-sapply(colnames(h[[year]]), function(var)
>> h[[year]][h[[year]][,var]>0, h[[year]][var,]>0]) 
>>         } 
>> 
>> Now that I have created list h (with split matrices), I would like to 
> use
>> these selections to make similar selections in list c. List c needs to 
> get
>> the exact same shape as h, so that `8026`in 1997 (c$`1997`$`8026`) looks
>> like this: 
>> 
>> $`1997`$`8026` 
>>       B 
>> B      8025 8026 8029 
>>   8025   1.0000000 0.7739527 0.9656091 
>>   8026   0.7739527 1.0000000 0.7202771 
>>   8029   0.9656091 0.7202771 1.0000000 
>> 
>> Can anyone help me doing this? I have no idea how I can get it to work.
>> Thank you very much for your help! 
>> 
> 
> Try this:
> 
> c2 <- h
> years <- names(h)
> for (t in seq(years))
>         { 
>         year <- years[t]
>         c2[[year]] <- sapply(colnames(h[[year]]), function(var) 
>                 c[[t]][h[[year]][ ,var] > 0, h[[year]][var, ] > 0]) 
>         }
> 
> By the way, it's great that you included code in your question.
> However, I encountered a couple of errors when running you code (see 
> below).
> 
> Also, it would be better to use a different name for your list "c", 
> because c() is a function in R.
> 
> Jean
> 
>> 
>> library(zoo) 
>> DF1 = data.frame(read.table(textConnection("    B  C  D  E  F  G 
>> 8025  1995  0  4  1  2 
>> 8025  1997  1  1  3  4 
>> 8026  1995  0  7  0  0 
>> 8026  1996  1  2  3  0 
>> 8026  1997  1  2  3  1 
>> 8026  1998  6  0  0  4 
>> 8026  1999  3  7  0  3 
>> 8027  1997  1  2  3  9 
>> 8027  1998  1  2  3  1 
>> 8027  1999  6  0  0  2 
>> 8028  1999  3  7  0  0 
>> 8029  1995  0  2  3  3 
>> 8029  1998  1  2  3  2 
>> 8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE)) 
>> 
>> a <- read.zoo(DF1, split = 1, index = 2, FUN = identity) 
>> sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA 
>> b <- rollapply(a, 3,  sum.na, align = "right", partial = TRUE) 
> 
> Error in FUN(cdata[st, i], ...) : unused argument(s) (partial = TRUE)
> 
> rollapply() has no argument partial.
> 
>> newDF <- lapply(1:nrow(b), function(i) 
>>               prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE, 
>>                 dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1)) 
> 
>> names(newDF) <- time(a) 
> 
> Error in names(newDF) <- time(a) : 
>   'names' attribute [5] must be the same length as the vector [3]
> 
> newDF has only 3 names, but time(a) is of length 5.
> 
>> c<-lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2)))) 
>> 
>> DF2 = data.frame(read.table(textConnection("  A  B  C 
>> 80  8025  1995 
>> 80  8026  1995 
>> 80  8029  1995 
>> 81  8026  1996 
>> 82  8025  1997 
>> 82  8026  1997 
>> 83  8025  1997 
>> 83  8027  1997 
>> 90  8026  1998 
>> 90  8027  1998 
>> 90  8029  1998 
>> 84  8026  1999 
>> 84  8027  1999 
>> 85  8028  1999 
>> 85  8029  1999"),head=TRUE,stringsAsFactors=FALSE)) 
>> 
>> e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2])) 
>> years <- sort(unique(DF2$C)) 
>> f <- as.data.frame(embed(years, 3)) 
>> g<-lapply(split(f, f[, 1]), e) 
>> h<-lapply(g, function (x) ifelse(x>0,1,0)) 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

Sorry, I am using the devel version of zoo which allows you to use the
"partial" argument. The correct code is given below. 

I didn't get your suggestion to work. If I understand what you are trying to
do (multiplying c and h), this is likely to give the wrong results because h
contains values of 0. Since I am ultimately interested in the values of the
split matrices in c (based on the original matrices in c), this will
probable not work. Or am I just not understanding you? 

Thanks!  

# devel version of zoo
install.packages("zoo", repos = "http://r-forge.r-project.org")
library(zoo)
DF1 = data.frame(read.table(textConnection("    B  C  D  E  F  G 
8025  1995  0  4  1  2 
8025  1997  1  1  3  4 
8026  1995  0  7  0  0 
8026  1996  1  2  3  0 
8026  1997  1  2  3  1 
8026  1998  6  0  0  4 
8026  1999  3  7  0  3 
8027  1997  1  2  3  9 
8027  1998  1  2  3  1 
8027  1999  6  0  0  2 
8028  1999  3  7  0  0 
8029  1995  0  2  3  3 
8029  1998  1  2  3  2 
8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE)) 
  
a <- read.zoo(DF1, split = 1, index = 2, FUN = identity) 
sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA 
b <- rollapply(a, 3,  sum.na, align = "right", partial = TRUE) 
newDF <- lapply(1:nrow(b), function(i) 
              prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE, 
                dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1)) 
names(newDF) <- time(a) 
c<-lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2)))) 

DF2 = data.frame(read.table(textConnection("  A  B  C 
80  8025  1995 
80  8026  1995 
80  8029  1995 
81  8026  1996 
82  8025  1997 
82  8026  1997 
83  8025  1997 
83  8027  1997 
90  8026  1998 
90  8027  1998 
90  8029  1998 
84  8026  1999 
84  8027  1999 
85  8028  1999 
85  8029  1999"),head=TRUE,stringsAsFactors=FALSE)) 
  
e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2])) 
years <- sort(unique(DF2$C)) 
f <- as.data.frame(embed(years, 3)) 
g<-lapply(split(f, f[, 1]), e) 
h<-lapply(g, function (x) ifelse(x>0,1,0))

years<-c(1997:1999) 
for (t in 1:length(years)) 
        { 
        year=as.character(years[t]) 
        h[[year]]<-sapply(colnames(h[[year]]), function(var)
h[[year]][h[[year]][,var]>0, h[[year]][var,]>0]) 
        } 




--
View this message in context: http://r.789695.n4.nabble.com/Selecting-cases-from-matrices-stored-in-lists-tp3759597p3760177.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list