[R] Selecting cases from matrices stored in lists
mdvaan
mathijsdevaan at gmail.com
Mon Aug 22 15:45:23 CEST 2011
Jean V Adams wrote:
>
>> [R] Selecting cases from matrices stored in lists
>> mdvaan
>> to:
>> r-help
>> 08/22/2011 07:24 AM
>>
>> Hi,
>>
>> I have two lists (c and h - see below) containing matrices with similar
>> cases but different values. I want to split these matrices into multiple
>> matrices based on the values in h. So, I did the following:
>>
>> years<-c(1997:1999)
>> for (t in 1:length(years))
>> {
>> year=as.character(years[t])
>> h[[year]]<-sapply(colnames(h[[year]]), function(var)
>> h[[year]][h[[year]][,var]>0, h[[year]][var,]>0])
>> }
>>
>> Now that I have created list h (with split matrices), I would like to use
>> these selections to make similar selections in list c. List c needs to get
>> the exact same shape as h, so that `8026` in 1997 (c$`1997`$`8026`) looks
>> like this:
>>
>> $`1997`$`8026`
>>       B
>> B           8025      8026      8029
>>   8025 1.0000000 0.7739527 0.9656091
>>   8026 0.7739527 1.0000000 0.7202771
>>   8029 0.9656091 0.7202771 1.0000000
>>
>> Can anyone help me do this? I have no idea how to get it to work.
>> Thank you very much for your help!
>>
>
> Try this:
>
> c2 <- h
> years <- names(h)
> for (t in seq(years))
> {
> year <- years[t]
> c2[[year]] <- sapply(colnames(h[[year]]), function(var)
> c[[t]][h[[year]][ ,var] > 0, h[[year]][var, ] > 0])
> }
>
> By the way, it's great that you included code in your question.
> However, I encountered a couple of errors when running your code (see
> below).
>
> Also, it would be better to use a different name for your list "c",
> because c() is a function in R.
>
> Jean
>
>>
>> library(zoo)
>> DF1 = data.frame(read.table(textConnection(" B C D E F G
>> 8025 1995 0 4 1 2
>> 8025 1997 1 1 3 4
>> 8026 1995 0 7 0 0
>> 8026 1996 1 2 3 0
>> 8026 1997 1 2 3 1
>> 8026 1998 6 0 0 4
>> 8026 1999 3 7 0 3
>> 8027 1997 1 2 3 9
>> 8027 1998 1 2 3 1
>> 8027 1999 6 0 0 2
>> 8028 1999 3 7 0 0
>> 8029 1995 0 2 3 3
>> 8029 1998 1 2 3 2
>> 8029 1999 6 0 0 1"),head=TRUE,stringsAsFactors=FALSE))
>>
>> a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
>> sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
>> b <- rollapply(a, 3, sum.na, align = "right", partial = TRUE)
>
> Error in FUN(cdata[st, i], ...) : unused argument(s) (partial = TRUE)
>
> rollapply() has no argument partial.
>
>> newDF <- lapply(1:nrow(b), function(i)
>> prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE,
>> dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1))
>
>> names(newDF) <- time(a)
>
> Error in names(newDF) <- time(a) :
> 'names' attribute [5] must be the same length as the vector [3]
>
> newDF has only 3 names, but time(a) is of length 5.
>
>> c<-lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2))))
>>
>> DF2 = data.frame(read.table(textConnection(" A B C
>> 80 8025 1995
>> 80 8026 1995
>> 80 8029 1995
>> 81 8026 1996
>> 82 8025 1997
>> 82 8026 1997
>> 83 8025 1997
>> 83 8027 1997
>> 90 8026 1998
>> 90 8027 1998
>> 90 8029 1998
>> 84 8026 1999
>> 84 8027 1999
>> 85 8028 1999
>> 85 8029 1999"),head=TRUE,stringsAsFactors=FALSE))
>>
>> e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
>> years <- sort(unique(DF2$C))
>> f <- as.data.frame(embed(years, 3))
>> g<-lapply(split(f, f[, 1]), e)
>> h<-lapply(g, function (x) ifelse(x>0,1,0))
Sorry, I am using the devel version of zoo, in which rollapply() accepts the
"partial" argument. The correct code is given below.
I couldn't get your suggestion to work. If I understand what you are trying to
do (multiplying c and h), it is likely to give the wrong results because h
contains zeros. Since I am ultimately interested in the values of the
split matrices in c (taken from the original matrices in c), this will
probably not work. Or am I just not understanding you?
Thanks!
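For reference, a minimal sketch of what an expression of the form m[ind > 0, ind > 0] does in R, using a toy matrix with invented values (m, ind and sel are hypothetical names, not objects from this thread). Under these assumptions the comparison ind > 0 only produces a logical selector, so the zeros decide which rows and columns are kept and nothing is multiplied:

```r
# Toy 3 x 3 similarity matrix (values invented for illustration).
ids <- c("8025", "8026", "8027")
m <- matrix(c(1.0, 0.8, 0.5,
              0.8, 1.0, 0.4,
              0.5, 0.4, 1.0),
            nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

# One column of a 0/1 indicator matrix: 8027 is flagged 0.
ind <- c(`8025` = 1, `8026` = 1, `8027` = 0)

# Logical indexing keeps the rows/columns where ind > 0 and drops the rest.
# The retained values are copied unchanged; nothing is multiplied by 0.
sel <- m[ind > 0, ind > 0]
print(sel)
```

The selected block keeps the original similarities for 8025 and 8026, and the 8027 row and column are simply absent.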
# devel version of zoo
install.packages("zoo", repos = "http://r-forge.r-project.org")
library(zoo)
DF1 = data.frame(read.table(textConnection(" B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1998 6 0 0 4
8026 1999 3 7 0 3
8027 1997 1 2 3 9
8027 1998 1 2 3 1
8027 1999 6 0 0 2
8028 1999 3 7 0 0
8029 1995 0 2 3 3
8029 1998 1 2 3 2
8029 1999 6 0 0 1"),head=TRUE,stringsAsFactors=FALSE))
a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
b <- rollapply(a, 3, sum.na, align = "right", partial = TRUE)
newDF <- lapply(1:nrow(b), function(i)
  prop.table(na.omit(matrix(b[i,], ncol = 4, byrow = TRUE,
    dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1))
names(newDF) <- time(a)
c<-lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2))))
DF2 = data.frame(read.table(textConnection(" A B C
80 8025 1995
80 8026 1995
80 8029 1995
81 8026 1996
82 8025 1997
82 8026 1997
83 8025 1997
83 8027 1997
90 8026 1998
90 8027 1998
90 8029 1998
84 8026 1999
84 8027 1999
85 8028 1999
85 8029 1999"),head=TRUE,stringsAsFactors=FALSE))
e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
years <- sort(unique(DF2$C))
f <- as.data.frame(embed(years, 3))
g<-lapply(split(f, f[, 1]), e)
h<-lapply(g, function (x) ifelse(x>0,1,0))
years <- 1997:1999
for (t in seq_along(years))
{
  year <- as.character(years[t])
  h[[year]] <- sapply(colnames(h[[year]]), function(var)
    h[[year]][h[[year]][, var] > 0, h[[year]][var, ] > 0])
}
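One way the same selections might be carried over to c, sketched under stated assumptions rather than as a tested solution for the full data. The loop above replaces each h[[year]] with a list, so the sketch assumes the 0/1 indicator matrices were saved in a separate object beforehand. It also looks matrices up by year name instead of by position, because c covers 1995 to 1999 while h covers only 1997 to 1999. The names h_orig, sims and split_by are hypothetical, the indicator matrix is the 1997 one implied by DF2, and the 8027 similarities are invented; only the 8025/8026/8029 values come from the example output in this thread. lapply() is used instead of sapply() so that the result stays a list even when every selection has the same shape.

```r
ids <- c("8025", "8026", "8027", "8029")

# Stand-in for the 1997 indicator matrix, saved BEFORE h is overwritten.
h_orig <- list(`1997` = matrix(
  c(1, 1, 1, 1,
    1, 1, 0, 1,
    1, 0, 1, 0,
    1, 1, 0, 1),
  nrow = 4, byrow = TRUE, dimnames = list(ids, ids)))

# Stand-in for c[["1997"]]; the 8027 row/column is invented for illustration.
sims <- list(`1997` = matrix(
  c(1.0000000, 0.7739527, 0.5, 0.9656091,
    0.7739527, 1.0000000, 0.4, 0.7202771,
    0.5,       0.4,       1.0, 0.3,
    0.9656091, 0.7202771, 0.3, 1.0000000),
  nrow = 4, byrow = TRUE, dimnames = list(ids, ids)))

# For each case id, pick the rows/columns flagged in `indicator` and pull
# the corresponding block out of `values`, matching by dimnames.
split_by <- function(values, indicator) {
  case_ids <- colnames(indicator)
  out <- lapply(case_ids, function(id) {
    rows <- rownames(indicator)[indicator[, id] > 0]
    cols <- colnames(indicator)[indicator[id, ] > 0]
    values[rows, cols, drop = FALSE]
  })
  setNames(out, case_ids)
}

# Apply per year, looking up both lists by year name.
yrs <- names(h_orig)
c2 <- setNames(lapply(yrs, function(y) split_by(sims[[y]], h_orig[[y]])), yrs)
print(c2[["1997"]][["8026"]])
```

With the real data, h_orig would be a copy of h taken right after the ifelse() step and sims would be the list c.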
--
View this message in context: http://r.789695.n4.nabble.com/Selecting-cases-from-matrices-stored-in-lists-tp3759597p3760177.html
Sent from the R help mailing list archive at Nabble.com.