[R] Looping through values in a data frame that are >zero

Dimitri Liakhovitski dimitri.liakhovitski at gmail.com
Mon May 23 22:31:05 CEST 2011


Thank you very much, everyone! Extremely helpful!

I really liked these two solutions:

# reshape2 is really cool, and it gives me the value I need!
library(reshape2)
solution1<-subset(melt(x, id = c('y', 'z')), value > 0)
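
As a rough sketch of how a long-format table like solution1 could drive the
update described in the original question (quoted below): the code rebuilds
the table in base R so that it needs no packages, then loops over its rows.
The names vars, pos, hits and before are made up for illustration, and the
sketch assumes large.df holds three rows for each (y, z) pair, 75 in total,
as the description of rows 37:39 implies.

```r
# Rebuild x from the original question (quoted below).
y <- c("a", "b", "c", "d", "e")
z <- c("m", "n", "o", "p", "r")
x <- data.frame(y, z, a = c(0, 0, 1, 0, 0), b = c(2, 0, 0, 0, 0),
                c = c(0, 0, 0, 4, 0), stringsAsFactors = FALSE)

# Long-format table of the positive cells, built in base R; it carries
# the same information as solution1 (y, z, variable, value).
vars <- c("a", "b", "c")
m    <- as.matrix(x[vars])
pos  <- which(m > 0, arr.ind = TRUE)
hits <- data.frame(y = x$y[pos[, "row"]], z = x$z[pos[, "row"]],
                   variable = vars[pos[, "col"]], value = m[pos],
                   stringsAsFactors = FALSE)

# ASSUMPTION: three rows per (y, z) pair, i.e. 75 rows in total.
large.df <- expand.grid(y = y, z = z, rep = 1:3,
                        stringsAsFactors = FALSE)[c("y", "z")]
set.seed(123); large.df$a <- sample(0:5, 75, replace = TRUE)
set.seed(234); large.df$b <- sample(0:5, 75, replace = TRUE)
set.seed(345); large.df$c <- sample(0:5, 75, replace = TRUE)
large.df <- large.df[order(large.df$y, large.df$z), ]
row.names(large.df) <- NULL
before <- large.df  # copy kept only to compare afterwards

# For each positive cell of x, spread its value evenly over the
# matching rows of large.df (steps 1 to 4 of the original question).
for (i in seq_len(nrow(hits))) {
  rows <- large.df$y == hits$y[i] & large.df$z == hits$z[i]
  v    <- hits$variable[i]
  large.df[rows, v] <- large.df[rows, v] + hits$value[i] / sum(rows)
}
```

Dividing by sum(rows) rather than a fixed 3 keeps the loop valid when the
number of matching rows differs from pair to pair.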

# This one is also very good; it just needs one additional loop to
# include the variable names:
mylist<-lapply(x[,c("a","b","c")], function(zz)x[zz>0, c("y","z")])
for(i in seq_along(mylist)){
	mylist[[i]]["varname"]<-names(mylist)[i]
}
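
A possible way to fold that extra loop into the same step is Map(), which
walks over the columns and their names together. This is only a sketch, not
part of either suggested solution; the names vars, pieces and combined are
made up for illustration.

```r
# Rebuild x from the original question (quoted below).
y <- c("a", "b", "c", "d", "e")
z <- c("m", "n", "o", "p", "r")
x <- data.frame(y, z, a = c(0, 0, 1, 0, 0), b = c(2, 0, 0, 0, 0),
                c = c(0, 0, 0, 4, 0), stringsAsFactors = FALSE)

vars <- c("a", "b", "c")

# One data frame per variable: the (y, z) pairs whose value is positive,
# tagged with the variable name. rep() keeps the assignment valid even
# when a variable has no positive cells at all.
pieces <- Map(function(col, name) {
  d <- x[col > 0, c("y", "z")]
  d$varname <- rep(name, nrow(d))
  d
}, x[vars], vars)

# Stack the pieces into a single data frame that can be looped over
# row by row.
combined <- do.call(rbind, pieces)
row.names(combined) <- NULL
```

Unlike the melt() result, combined does not carry the cell value itself, so
it would still have to be looked up in x when needed.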

Dimitri


On Sat, May 21, 2011 at 5:28 PM, Dennis Murphy <djmuser at gmail.com> wrote:
> Hi:
>
> Does this work for the first problem?
>
> library(reshape2)
> subset(melt(x, id = c('y', 'z')), value > 0)
>   y z variable value
> 3  c o        a     1
> 6  a m        b     2
> 14 d p        c     4
>
> The second problem is so convoluted I don't even know where to start...
>
> HTH,
> Dennis
>
>
> On Sat, May 21, 2011 at 6:12 AM, Dimitri Liakhovitski
> <dimitri.liakhovitski at gmail.com> wrote:
>> Hello!
>>
>> I've tried for a while - but can't figure it out. I have data frame x:
>>
>> y=c("a","b","c","d","e")
>> z=c("m","n","o","p","r")
>> a=c(0,0,1,0,0)
>> b=c(2,0,0,0,0)
>> c=c(0,0,0,4,0)
>> x<-data.frame(y,z,a,b,c,stringsAsFactors=F)
>> str(x)
>> Some of the values in columns a, b, and c are >0.
>>
>> I need to write a loop through all the cells in columns a,b,c that
>> are >0 (only through them).
>> For each of those cells, I need to know:
>> 1. Name of the column it is in
>> 2. The entry of column y that is in the same row
>> 3. The entry of column z that is in the same row
>> It'd be good to save this info in a data frame somehow - so that I
>> could loop through rows of this data frame.
>>
>>
>> To explain what I need it for eventually: I have a different data
>> frame "large.df" that has the same columns (variables) - but with many
>> more entries than "x". Something like:
>> large.df<-expand.grid(y,z,1:3)[,1:2] # 3 rows per (y,z) pair = 75 rows
>> names(large.df)<-c("y","z")
>> set.seed(123)
>> large.df$a<-sample(0:5,75,replace=T)
>> set.seed(234)
>> large.df$b<-sample(0:5,75,replace=T)
>> set.seed(345)
>> large.df$c<-sample(0:5,75,replace=T)
>> large.df$y<-as.character(large.df$y)
>> large.df$z<-as.character(large.df$z)
>> large.df<-large.df[order(large.df$y,large.df$z),]
>> row.names(large.df)<-1:nrow(large.df)
>> (large.df);str(large.df)
>>
>> 1. Find the first cell in x that is > 0 (in this case, it's x[3,"a"]).
>> 2. Find all the corresponding cells in the large.df - in this case, it's:
>> large.df[large.df$y %in% "c" & large.df$z %in% "o","a"]
>> and those 3 values can be found in rows 37:39 of large.df, in column "a".
>> 3. Take those 3 values and add to them the corresponding value in x
>> (in this case = 1) divided by their length (in this case = 3).
>> 4. Do the same for the other cells in x that are >0.
>>
>> The final result will be (sorry for lengthy code):
>>
>> large.df[large.df$y %in% "c" & large.df$z %in%
>> "o","a"]<-large.df[large.df$y %in% "c" & large.df$z %in%
>> "o","a"]+x[3,"a"]/3
>> large.df[large.df$y %in% "a" & large.df$z %in%
>> "m","b"]<-large.df[large.df$y %in% "a" & large.df$z %in%
>> "m","b"]+x[1,"b"]/3
>> large.df[large.df$y %in% "d" & large.df$z %in%
>> "p","c"]<-large.df[large.df$y %in% "d" & large.df$z %in%
>> "p","c"]+x[4,"c"]/3
>> (large.df)
>>
>> (It just happens that at the end I divide by 3; in general the divisor
>> is length(large.df[large.df$y %in% "c" & large.df$z %in% "o","a"]), etc.)
>>
>>
>> Thanks a lot for your suggestions!
>>
>>
>> --
>> Dimitri Liakhovitski
>> Ninah Consulting
>> www.ninah.com



-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com


