[R] Subset and 0 replace?

William Dunlap wdunlap at tibco.com
Fri May 22 00:36:05 CEST 2015


I renamed your 'c' to be 'toyData' and your 'e' to be 'desiredResult'.  Do you
want the following, which uses only base R code?

> vapply(toyData,
              FUN=function(V)with(V, sum(Wgt[SPCLORatingValue>16])),
              FUN.VALUE=0)
         V5          V8         V10         V44          V2
0.008714910 0.000000000 0.000000000 0.004357455 0.008714910

It contains what is in your desired result, but in a more useful format (e.g.,
numbers instead of character strings for the sums).

> desiredResult
     [,1]          [,2] [,3]  [,4]          [,5]
[1,] "V5"          "V8" "V10" "V44"         "V2"
[2,] "0.008714910" "0"  "0"   "0.004357455" "0.008714910"
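
As a minimal, self-contained sketch of why this works: summing an empty
selection gives 0, so vapply() always returns one number per data frame and
nothing shifts into the wrong slot. The names 'toyList' and 'result' below are
hypothetical stand-ins (not your data or your GenEval object), chosen only for
illustration.

```r
# Hypothetical toy list standing in for the 100 data frames.
toyList <- list(
  A = data.frame(Wgt = c(1, 2, 4), SPCLORatingValue = c(10L, 17L, 19L)),
  B = data.frame(Wgt = c(8, 16),   SPCLORatingValue = c(12L, 15L)),  # no rating > 16
  C = data.frame(Wgt = c(32, 64),  SPCLORatingValue = c(16L, 20L))
)

# The sum of a zero-length numeric vector is 0, so B needs no special case.
sum(numeric(0))

# One number per list element, in list order, with the names preserved.
sums <- vapply(toyList,
               FUN = function(V) sum(V$Wgt[V$SPCLORatingValue > 16]),
               FUN.VALUE = numeric(1))
sums

# The result always has length(toyList) elements, so it can be assigned
# into a fixed-width row. 'result' is a hypothetical stand-in for GenEval.
result <- matrix(NA_real_, nrow = 2, ncol = length(toyList),
                 dimnames = list(NULL, names(toyList)))
result[1, ] <- sums
```

If you really do need the two-row character layout, something like
rbind(names(sums), format(sums)) should get close, though keeping the numbers
as numbers is usually easier to work with.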


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, May 21, 2015 at 9:50 AM, Vin Cheng <newrnewbie at hotmail.com> wrote:

> Thanks William/Duncan!
>
> Duncan - Yes - I am using the doBy package.
>
> Running this line on the sample data below gives weights for V5, V44, and
> V2.  Ideally I would like 0's for V8 and V10 in the output.
>
> So it would look like:
> e<-structure(matrix(c("V5", "0.008714910", "V8", "0", "V10", "0", "V44",
> "0.004357455", "V2", "0.008714910"),nrow = 2))
>
>
> This is as far as I've gotten by subsetting and summing:
> a<-t(summaryBy(Wgt.sum~as.numeric(.id),data=subset(ldply(c,function(x)
> summaryBy(Wgt ~ SPCLORatingValue, data=x,
> FUN=c(sum))),SPCLORatingValue>16),FUN=c(sum),order=FALSE))
>
> All help/guidance is much appreciated!  Thanks Vince!
>
> Sample data example:
> c<-structure(list(V5 = structure(list(WgtBand = c(2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2), Wgt = c(0.00435745520833333, 0.00435745520833333,
> 0.00435745520833333, 0.00435745520833333, 0.00435745520833333,
> 0.00435745520833333, 0.00435745520833333, 0.00435745520833333,
> 0.00435745520833333, 0.00435745520833333, 0.00435745520833333
> ), SPCLORatingValue = c(11L, 15L, 14L, 15L, 14L, 14L, 16L, 19L,
> 13L, 17L, 11L)), .Names = c("WgtBand", "Wgt", "SPCLORatingValue"
> ), row.names = 12:22, class = "data.frame"), V8 = structure(list(
>     WgtBand = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Wgt =
> c(0.00435745520833333,
>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333,
>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333,
>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333,
>     0.00435745520833333), SPCLORatingValue = c(14L, 15L, 15L,
>     12L, 15L, 12L, 13L, 15L, 14L, 15L, 14L)), .Names = c("WgtBand",
> "Wgt", "SPCLORatingValue"), row.names = 12:22, class = "data.frame"),
>     V10 = structure(list(WgtBand = c(2, 2, 2, 2, 2, 2, 2, 2,
>     2, 2, 2), Wgt = c(0.00435745520833333, 0.00435745520833333,
>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333,
>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333,
>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333
>     ), SPCLORatingValue = c(15L, 13L, 14L, 14L, 13L, 13L, 13L,
>     15L, 15L, 13L, 14L)), .Names = c("WgtBand", "Wgt", "SPCLORatingValue"
>     ), row.names = 12:22, class = "data.frame"), V44 = structure(list(
>         WgtBand = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Wgt =
> c(0.00435745520833333,
>         0.00435745520833333, 0.00435745520833333, 0.00435745520833333,
>         0.00435745520833333, 0.00435745520833333, 0.00435745520833333,
>         0.00435745520833333, 0.00435745520833333, 0.00435745520833333,
>         0.00435745520833333), SPCLORatingValue = c(13L, 14L,
>         16L, 15L, 14L, 14L, 18L, 13L, 16L, 15L, 11L)), .Names =
> c("WgtBand",
>     "Wgt", "SPCLORatingValue"), row.names = 12:22, class = "data.frame"),
>     V2 = structure(list(WgtBand = c(2, 2, 2, 2, 2, 2, 2, 2, 2,
>     2, 2), Wgt = c(0.00435745520833333, 0.00435745520833333,
>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333,
>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333,
>     0.00435745520833333, 0.00435745520833333, 0.00435745520833333
>     ), SPCLORatingValue = c(13L, 14L, 15L, 15L, 15L, 14L, 12L,
>     16L, 17L, 15L, 19L)), .Names = c("WgtBand", "Wgt", "SPCLORatingValue"
>     ), row.names = 12:22, class = "data.frame")), .Names = c("V5",
> "V8", "V10", "V44", "V2"))
>
> ------------------------------
> From: wdunlap at tibco.com
> Date: Wed, 20 May 2015 22:12:01 -0700
> Subject: Re: [R] Subset and 0 replace?
> To: newrnewbie at hotmail.com
> CC: r-help at r-project.org
>
>
> Can you show a small self-contained example of your data and expected
> results?
> I tried to make one and your expression returned a single number in a 1 by
> 1 matrix.
>
> library(doBy)
> library(plyr)  # provides ldply()
> Generation<-list(
>    data.frame(Wgt=c(1,2,4), SPCLORatingValue=c(10,11,12)),
>    data.frame(Wgt=c(8,16), SPCLORatingValue=c(15,17)),
>    data.frame(Wgt=c(32,64), SPCLORatingValue=c(19,20)))
>  t(summaryBy(Wgt.sum~as.numeric(.id),data=subset(ldply(Generation,function(x)
> summaryBy(Wgt ~ SPCLORatingValue, data=x,
> FUN=c(sum))),SPCLORatingValue>16),FUN=c(sum),order=FALSE))
> #              1
> #Wgt.sum.sum 112
> str(.Last.value)
> # num [1, 1] 112
> # - attr(*, "dimnames")=List of 2
> #  ..$ : chr "Wgt.sum.sum"
> #  ..$ : chr "1"
>
> Two ways of dealing with the problem you verbally described are
> (a) determine which elements of the input you can process (e.g., which
> have some values>16) and use subscripting on both the left and right
> side of the assignment operator to put the results in the right place.
> E.g.,
>     x <- c(-1, 1, 2)
>     ok <- x>0
>     x[ok] <- log(x[ok])
> (b) make your function handle any case so you don't have to do any
> subsetting on either side.  In your case it may be easy since
> the sum of a zero-length numeric vector is 0. In other cases you may want to use ifelse,
> as in
>    x <- c(-1, 1, 2)
>    x <- ifelse(x>0, log(x), x)
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Wed, May 20, 2015 at 4:13 PM, Vin Cheng <newrnewbie at hotmail.com> wrote:
>
> Hi,
>
> I'm trying to group rows in a dataframe with SPCLORatingValue factor >16
> and sum the Wgt's that correspond to this condition.  There are 100
> dataframes in a list.
>
> Some of the dataframes won't have any rows that have this condition
> SPCLORatingValue>16 and therefore no corresponding weight.
>
> My problem is that I need to have a corresponding value for each dataframe
> in the list - so 100 values.
>
> If dataframe 44 doesn't have any SPCLORatingValue>16, then I end up
> getting a vector that's 99 long vs. 100, which puts value 45 into 44's slot
> and so on.
>
> Is there either an if/else statement or argument I can place into subset
> to put a 0 for the data frames that don't have SPCLORatingValue>16?
>
> GenEval[18,1:100] <-
> t(summaryBy(Wgt.sum~as.numeric(.id),data=subset(ldply(Generation,function(x)
> summaryBy(Wgt ~ SPCLORatingValue, data=x,
> FUN=c(sum))),SPCLORatingValue>16),FUN=c(sum),order=FALSE))
>
> Any help or guidance would be greatly appreciated!
> Many Thanks,
> Vince
>