[R] Subset and 0 replace?

Vin Cheng newrnewbie at hotmail.com
Fri May 22 01:26:57 CEST 2015


This is perfect!  Thanks William!!!

Vince

> On May 21, 2015, at 3:36 PM, William Dunlap <wdunlap at tibco.com> wrote:
> 
> I renamed your 'c' to be 'toyData' and your 'e' to be 'desiredResult'.  Do you
> want the following, which uses only base R code?
> 
> > vapply(toyData,
>               FUN=function(V)with(V, sum(Wgt[SPCLORatingValue>16])),
>               FUN.VALUE=0)
>          V5          V8         V10         V44          V2
> 0.008714910 0.000000000 0.000000000 0.004357455 0.008714910
> 
> It is what is in your desired result, but in a more useful format (e.g., numbers
> instead of character strings for the sums).
> 
> > desiredResult
>      [,1]          [,2] [,3]  [,4]          [,5]
> [1,] "V5"          "V8" "V10" "V44"         "V2"
> [2,] "0.008714910" "0"  "0"   "0.004357455" "0.008714910"
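
A minimal sketch of the same vapply idiom on a smaller list, in case it helps to see why no special handling is needed for data frames without any rating above 16. The list `toy`, the helper `sum_above`, and its `cutoff` argument are hypothetical names used only for illustration; they do not appear in the thread.

```r
# Hypothetical toy list: data frame B has no rating above 16.
toy <- list(
  A = data.frame(Wgt = c(1, 2, 4), SPCLORatingValue = c(10L, 17L, 19L)),
  B = data.frame(Wgt = c(8, 16),   SPCLORatingValue = c(12L, 15L)),
  C = data.frame(Wgt = c(32, 64),  SPCLORatingValue = c(16L, 20L))
)

# Sum the weights whose rating exceeds the cutoff.
# When no row qualifies, the subset is numeric(0) and sum() returns 0.
sum_above <- function(df, cutoff = 16) {
  sum(df$Wgt[df$SPCLORatingValue > cutoff])
}

# FUN.VALUE = numeric(1) makes vapply require exactly one number per element,
# so the result always has one entry per data frame, in list order.
res <- vapply(toy, sum_above, FUN.VALUE = numeric(1))
print(res)
```

Because the result is a named numeric vector with one entry per list element, it can be assigned straight into a fixed-width slot without any positions shifting.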
> 
> 
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
> 
>> On Thu, May 21, 2015 at 9:50 AM, Vin Cheng <newrnewbie at hotmail.com> wrote:
>> Thanks William/Duncan!
>>  
>> Duncan - Yes - I am using the doBy package.
>>  
>> Running this line on the sample data below gives weights for V5, V44, and V2.  Ideally I would like 0's for V8 and V10 in the output.
>>  
>> So it would look like:
>> e<-structure(matrix(c("V5", "0.008714910", "V8", "0", "V10", "0", "V44", "0.004357455", "V2", "0.008714910"),nrow = 2))
>>  
>>  
>> This is as far as I've gotten by subsetting and summing:
>> a<-t(summaryBy(Wgt.sum~as.numeric(.id),data=subset(ldply(c,function(x) summaryBy(Wgt ~ SPCLORatingValue, data=x, FUN=c(sum))),SPCLORatingValue>16),FUN=c(sum),order=FALSE))
>>  
>> All help/guidance is much appreciated!  Thanks Vince!
>>  
>> Sample data example:
>> c<-structure(list(V5 = structure(list(WgtBand = c(2, 2, 2, 2, 2, 
>> 2, 2, 2, 2, 2, 2), Wgt = c(0.00435745520833333, 0.00435745520833333, 
>> 0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
>> 0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
>> 0.00435745520833333, 0.00435745520833333, 0.00435745520833333
>> ), SPCLORatingValue = c(11L, 15L, 14L, 15L, 14L, 14L, 16L, 19L, 
>> 13L, 17L, 11L)), .Names = c("WgtBand", "Wgt", "SPCLORatingValue"
>> ), row.names = 12:22, class = "data.frame"), V8 = structure(list(
>>     WgtBand = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Wgt = c(0.00435745520833333, 
>>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
>>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
>>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
>>     0.00435745520833333), SPCLORatingValue = c(14L, 15L, 15L, 
>>     12L, 15L, 12L, 13L, 15L, 14L, 15L, 14L)), .Names = c("WgtBand", 
>> "Wgt", "SPCLORatingValue"), row.names = 12:22, class = "data.frame"), 
>>     V10 = structure(list(WgtBand = c(2, 2, 2, 2, 2, 2, 2, 2, 
>>     2, 2, 2), Wgt = c(0.00435745520833333, 0.00435745520833333, 
>>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
>>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
>>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333
>>     ), SPCLORatingValue = c(15L, 13L, 14L, 14L, 13L, 13L, 13L, 
>>     15L, 15L, 13L, 14L)), .Names = c("WgtBand", "Wgt", "SPCLORatingValue"
>>     ), row.names = 12:22, class = "data.frame"), V44 = structure(list(
>>         WgtBand = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Wgt = c(0.00435745520833333, 
>>         0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
>>         0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
>>         0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
>>         0.00435745520833333), SPCLORatingValue = c(13L, 14L, 
>>         16L, 15L, 14L, 14L, 18L, 13L, 16L, 15L, 11L)), .Names = c("WgtBand", 
>>     "Wgt", "SPCLORatingValue"), row.names = 12:22, class = "data.frame"), 
>>     V2 = structure(list(WgtBand = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 
>>     2, 2), Wgt = c(0.00435745520833333, 0.00435745520833333, 
>>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
>>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333, 
>>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333
>>     ), SPCLORatingValue = c(13L, 14L, 15L, 15L, 15L, 14L, 12L, 
>>     16L, 17L, 15L, 19L)), .Names = c("WgtBand", "Wgt", "SPCLORatingValue"
>>     ), row.names = 12:22, class = "data.frame")), .Names = c("V5", 
>> "V8", "V10", "V44", "V2"))
>>  
>> From: wdunlap at tibco.com
>> Date: Wed, 20 May 2015 22:12:01 -0700
>> Subject: Re: [R] Subset and 0 replace?
>> To: newrnewbie at hotmail.com
>> CC: r-help at r-project.org
>> 
>> 
>> Can you show a small self-contained example of your data and expected results?
>> I tried to make one and your expression returned a single number in a 1 by 1 matrix.
>> 
>> library(doBy)
>> library(plyr)  # provides ldply()
>> Generation<-list(
>>    data.frame(Wgt=c(1,2,4), SPCLORatingValue=c(10,11,12)),
>>    data.frame(Wgt=c(8,16), SPCLORatingValue=c(15,17)),
>>    data.frame(Wgt=c(32,64), SPCLORatingValue=c(19,20)))
>>  t(summaryBy(Wgt.sum~as.numeric(.id),data=subset(ldply(Generation,function(x) summaryBy(Wgt ~ SPCLORatingValue, data=x, FUN=c(sum))),SPCLORatingValue>16),FUN=c(sum),order=FALSE))
>> #              1
>> #Wgt.sum.sum 112
>> str(.Last.value)
>> # num [1, 1] 112
>> # - attr(*, "dimnames")=List of 2
>> #  ..$ : chr "Wgt.sum.sum"
>> #  ..$ : chr "1"
>> 
>> Two ways of dealing with the problem you verbally described are
>> (a) determine which elements of the input you can process (e.g., which
>> have some values>16) and use subscripting on both the left and right
>> side of the assignment operator to put the results in the right place.  E.g.,
>>     x <- c(-1, 1, 2)
>>     ok <- x>0
>>     x[ok] <- log(x[ok])
>> (b) make your function handle any case so you don't have to do any
>> subsetting on either side.  In your case it may be easy since sum(zeroLongNumericVector) is 0. In other cases you may want to use ifelse,
>> as in
>>    x <- c(-1, 1, 2)
>>    x <- ifelse(x>0, log(x), x)
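
As a rough sketch, the two approaches (a) and (b) above might look like this when applied to a list of data frames. The list `frames` and the result names `res_a` and `res_b` are hypothetical and chosen only for illustration.

```r
# Hypothetical toy list; the middle data frame has no rating above 16.
frames <- list(
  data.frame(Wgt = c(1, 2),   SPCLORatingValue = c(17L, 11L)),
  data.frame(Wgt = c(4, 8),   SPCLORatingValue = c(12L, 15L)),
  data.frame(Wgt = c(16, 32), SPCLORatingValue = c(19L, 20L))
)

# (a) Work out which elements can be processed, then subscript on both
#     sides of the assignment so each result lands in its own slot.
ok <- vapply(frames, function(d) any(d$SPCLORatingValue > 16), logical(1))
res_a <- numeric(length(frames))   # starts as all zeros
res_a[ok] <- vapply(frames[ok],
                    function(d) sum(d$Wgt[d$SPCLORatingValue > 16]),
                    numeric(1))

# (b) Use a function that copes with every case: sum(numeric(0)) is 0,
#     so no subsetting of the list is needed.
res_b <- vapply(frames,
                function(d) sum(d$Wgt[d$SPCLORatingValue > 16]),
                numeric(1))

print(res_a)
print(res_b)
```

Both give one value per data frame. One caveat about the ifelse form shown above: ifelse evaluates log(x) on the whole vector, so log(-1) still triggers a "NaNs produced" warning even though that element is not used in the result; the subscripting form in (a) avoids this.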
>> 
>> 
>> 
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>> 
>> On Wed, May 20, 2015 at 4:13 PM, Vin Cheng <newrnewbie at hotmail.com> wrote:
>> Hi,
>> 
>> I'm trying to group the rows in a data frame that have SPCLORatingValue > 16 and sum the Wgt values for those rows.  There are 100 data frames in a list.
>> 
>> Some of the dataframes won't have any rows that have this condition SPCLORatingValue>16 and therefore no corresponding weight.
>> 
>> My problem is that I need to have a corresponding value for each dataframe in the list - so 100 values.
>> 
>> If data frame 44 doesn't have any SPCLORatingValue>16, then I end up with a vector that is 99 long instead of 100, which puts value 45 into 44's slot and so on.
>> 
>> Is there either an if/else statement or argument I can place into subset to put a 0 for the data frames that don't have SPCLORatingValue>16?
>> 
>> GenEval[18,1:100] <- t(summaryBy(Wgt.sum~as.numeric(.id),data=subset(ldply(Generation,function(x) summaryBy(Wgt ~ SPCLORatingValue, data=x, FUN=c(sum))),SPCLORatingValue>16),FUN=c(sum),order=FALSE))
>> 
>> Any help or guidance would be greatly appreciated!
>> Many Thanks,
>> Vince
>> 
>> 
>> 
>>         [[alternative HTML version deleted]]
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
