[R] Subset and 0 replace?

Vin Cheng newrnewbie at hotmail.com
Thu May 21 18:50:20 CEST 2015


Thanks William/Duncan!
 
Duncan - Yes - I am using the doBy package.
 
Running the line below on the sample data gives weights only for V5, V44, and V2.  Ideally I would like 0's for V8 and V10 in the output.
 
So it would look like:
e<-structure(matrix(c("V5", "0.008714910", "V8", "0", "V10", "0", "V44", "0.004357455", "V2", "0.008714910"),nrow = 2))
 
 
This is as far as I've gotten by subsetting and summing:
a<-t(summaryBy(Wgt.sum~as.numeric(.id),data=subset(ldply(c,function(x) summaryBy(Wgt ~ SPCLORatingValue, data=x, FUN=c(sum))),SPCLORatingValue>16),FUN=c(sum),order=FALSE))
 
All help/guidance is much appreciated!  Thanks, Vince
 
Sample data example:
c<-structure(list(V5 = structure(list(WgtBand = c(2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2), Wgt = c(0.00435745520833333, 0.00435745520833333, 
0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
0.00435745520833333, 0.00435745520833333, 0.00435745520833333
), SPCLORatingValue = c(11L, 15L, 14L, 15L, 14L, 14L, 16L, 19L, 
13L, 17L, 11L)), .Names = c("WgtBand", "Wgt", "SPCLORatingValue"
), row.names = 12:22, class = "data.frame"), V8 = structure(list(
    WgtBand = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Wgt = c(0.00435745520833333, 
    0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
    0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
    0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
    0.00435745520833333), SPCLORatingValue = c(14L, 15L, 15L, 
    12L, 15L, 12L, 13L, 15L, 14L, 15L, 14L)), .Names = c("WgtBand", 
"Wgt", "SPCLORatingValue"), row.names = 12:22, class = "data.frame"), 
    V10 = structure(list(WgtBand = c(2, 2, 2, 2, 2, 2, 2, 2, 
    2, 2, 2), Wgt = c(0.00435745520833333, 0.00435745520833333, 
    0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
    0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
    0.00435745520833333, 0.00435745520833333, 0.00435745520833333
    ), SPCLORatingValue = c(15L, 13L, 14L, 14L, 13L, 13L, 13L, 
    15L, 15L, 13L, 14L)), .Names = c("WgtBand", "Wgt", "SPCLORatingValue"
    ), row.names = 12:22, class = "data.frame"), V44 = structure(list(
        WgtBand = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Wgt = c(0.00435745520833333, 
        0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
        0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
        0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
        0.00435745520833333), SPCLORatingValue = c(13L, 14L, 
        16L, 15L, 14L, 14L, 18L, 13L, 16L, 15L, 11L)), .Names = c("WgtBand", 
    "Wgt", "SPCLORatingValue"), row.names = 12:22, class = "data.frame"), 
    V2 = structure(list(WgtBand = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 
    2, 2), Wgt = c(0.00435745520833333, 0.00435745520833333, 
    0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
    0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
    0.00435745520833333, 0.00435745520833333, 0.00435745520833333
    ), SPCLORatingValue = c(13L, 14L, 15L, 15L, 15L, 14L, 12L, 
    16L, 17L, 15L, 19L)), .Names = c("WgtBand", "Wgt", "SPCLORatingValue"
    ), row.names = 12:22, class = "data.frame")), .Names = c("V5", 
"V8", "V10", "V44", "V2"))
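
A minimal sketch of one way to get a value for every list element, zeros included. It uses base R only rather than the doBy/plyr pipeline above, and the toy list `frames` with its values is made up for illustration (it is not the sample data). The idea: the subsetting step drops list elements with no qualifying rows, so start from a zero-filled vector named after the list and fill in the sums by name.

```r
# Hypothetical toy data: a named list of data frames with the same columns
# as in the post (values invented for illustration).
frames <- list(
  V5  = data.frame(Wgt = c(1, 2, 4), SPCLORatingValue = c(11L, 19L, 17L)),
  V8  = data.frame(Wgt = c(8, 16),   SPCLORatingValue = c(14L, 15L)),
  V44 = data.frame(Wgt = c(32, 64),  SPCLORatingValue = c(18L, 13L))
)

# Stack the data frames, tagging each row with the name of its list element.
stacked <- do.call(rbind, Map(function(df, id) {
  data.frame(id = id, df, stringsAsFactors = FALSE)
}, frames, names(frames)))

# Subsetting removes every row of V8, so V8 vanishes from the grouped sums.
kept    <- stacked[stacked$SPCLORatingValue > 16, ]
partial <- tapply(kept$Wgt, kept$id, sum)

# Start from zeros, one per list element, and fill in by name.
result <- setNames(numeric(length(frames)), names(frames))
result[names(partial)] <- as.vector(partial)

print(result)
```

Because the fill-in is done by name rather than by position, a list element that drops out of the subset keeps its 0 and the later elements stay in their own slots.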
 
From: wdunlap at tibco.com
Date: Wed, 20 May 2015 22:12:01 -0700
Subject: Re: [R] Subset and 0 replace?
To: newrnewbie at hotmail.com
CC: r-help at r-project.org

Can you show a small self-contained example of your data and expected results?  I tried to make one and your expression returned a single number in a 1 by 1 matrix.

library(doBy)
library(plyr)  # ldply() comes from plyr
Generation <- list(
   data.frame(Wgt=c(1,2,4), SPCLORatingValue=c(10,11,12)),
   data.frame(Wgt=c(8,16), SPCLORatingValue=c(15,17)),
   data.frame(Wgt=c(32,64), SPCLORatingValue=c(19,20)))
t(summaryBy(Wgt.sum~as.numeric(.id),data=subset(ldply(Generation,function(x) summaryBy(Wgt ~ SPCLORatingValue, data=x, FUN=c(sum))),SPCLORatingValue>16),FUN=c(sum),order=FALSE))
#              1
#Wgt.sum.sum 112
str(.Last.value)
# num [1, 1] 112
# - attr(*, "dimnames")=List of 2
#  ..$ : chr "Wgt.sum.sum"
#  ..$ : chr "1"

Two ways of dealing with the problem you verbally described are
(a) determine which elements of the input you can process (e.g., which
have some values>16) and use subscripting on both the left and right
side of the assignment operator to put the results in the right place.  E.g.,
    x <- c(-1, 1, 2)
    ok <- x>0
    x[ok] <- log(x[ok])
(b) make your function handle any case so you don't have to do any
subsetting on either side.  In your case it may be easy since
sum(zeroLongNumericVector) is 0. In other cases you may want to use ifelse,
as in
   x <- c(-1, 1, 2)
   x <- ifelse(x>0, log(x), x)
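
Approach (b) might look roughly like the sketch below when applied to a list of data frames. It is base R only; the list `Generation2`, its values, and the helper `wgt_above()` are hypothetical names made up for illustration, not part of the thread's code. It relies on the fact noted above that sum() of a zero-length numeric vector is 0.

```r
# Hypothetical toy list standing in for the 100 data frames.
Generation2 <- list(
  V5  = data.frame(Wgt = c(1, 2, 4), SPCLORatingValue = c(11L, 19L, 17L)),
  V8  = data.frame(Wgt = c(8, 16),   SPCLORatingValue = c(14L, 15L)),
  V44 = data.frame(Wgt = c(32, 64),  SPCLORatingValue = c(18L, 13L))
)

# Sum the weights of the rows whose rating exceeds the cutoff.
# When no row qualifies, the subscript selects nothing and sum() returns 0,
# so no if/else is needed.
wgt_above <- function(df, cutoff = 16) {
  sum(df$Wgt[df$SPCLORatingValue > cutoff])
}

# vapply() returns exactly one number per list element, in list order.
totals <- vapply(Generation2, wgt_above, numeric(1))

print(totals)
```

Since the subsetting happens inside each data frame rather than on the stacked result, the output always has the same length as the list, which is what an assignment into a fixed-width row needs.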

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, May 20, 2015 at 4:13 PM, Vin Cheng <newrnewbie at hotmail.com> wrote:
Hi,

I'm trying to group rows in a dataframe with SPCLORatingValue factor >16 and sum the Wgt's that correspond to this condition.  There are 100 dataframes in a list.

Some of the dataframes won't have any rows that meet the condition SPCLORatingValue>16 and therefore have no corresponding weight.

My problem is that I need to have a corresponding value for each dataframe in the list - so 100 values.

If dataframe 44 doesn't have any SPCLORatingValue>16, then I end up getting a vector that's 99 long vs. 100, with value 45 going into 44's slot and so on.

Is there either an if/else statement or an argument I can place into subset to put a 0 for the data frames that don't have SPCLORatingValue>16?

GenEval[18,1:100] <- t(summaryBy(Wgt.sum~as.numeric(.id),data=subset(ldply(Generation,function(x) summaryBy(Wgt ~ SPCLORatingValue, data=x, FUN=c(sum))),SPCLORatingValue>16),FUN=c(sum),order=FALSE))

Any help or guidance would be greatly appreciated!

Many Thanks,

Vince

        [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.