[R] List mappings and variable creation

Sundar Dorai-Raj sdorairaj at gmail.com
Mon Oct 12 16:02:34 CEST 2009


Hi, Michael,

Seems like all you need is aggregate and rbind:

x <- aggregate(saw.aggr.data["value"],
               saw.aggr.data[c("conversion.type", "filteredID", "metric")],
               sum)
x$bucketID <- "combined"
y <- rbind(saw.aggr.data, x)
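
To try this end to end, here is a minimal sketch that rebuilds the six sample rows from your message and runs the same two steps. The lowercase column names (metric, value) are an assumption taken from the code in your post rather than from the table headers, so adjust them if your data frame differs.

```r
# Sample rows from the question. Lowercase column names are an assumption
# based on the code in the original post, not the table headers.
saw.aggr.data <- data.frame(
  conversion.type  = "counter",
  filteredID       = "true",
  bucketID         = c("control", "control", "control", "control",
                       "exposed", "exposed"),
  metric           = c("a", "b", "c", "d", "a", "e"),
  value            = c(1, 1, 2, 3, 4, 1),
  stringsAsFactors = FALSE
)

# Sum value within each (conversion.type, filteredID, metric) group.
# bucketID is left out of the grouping, so control and exposed rows are
# added together and a metric missing from one bucket needs no zero row.
x <- aggregate(saw.aggr.data["value"],
               saw.aggr.data[c("conversion.type", "filteredID", "metric")],
               sum)
x$bucketID <- "combined"

# rbind on data frames matches columns by name, so the different column
# order in x is not a problem.
y <- rbind(saw.aggr.data, x)
print(y)
```

The result should hold the six original rows followed by one "combined" row per metric, without any dummy zero rows being created first.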

Is this what you need?
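
If what you are after is the name-to-value mapping itself rather than extra rows, a named vector can play the role of an associative list. The sketch below is only an illustration: it assumes the same lowercase column names as your code, and it groups by metric alone, so it only fits data with a single conversion.type / filteredID combination.

```r
# Same sample data as in the question (lowercase column names assumed).
saw.aggr.data <- data.frame(
  conversion.type  = "counter",
  filteredID       = "true",
  bucketID         = c("control", "control", "control", "control",
                       "exposed", "exposed"),
  metric           = c("a", "b", "c", "d", "a", "e"),
  value            = c(1, 1, 2, 3, 4, 1),
  stringsAsFactors = FALSE
)

# split() gives a named list: metric name -> all values for that metric.
by.metric <- split(saw.aggr.data$value, saw.aggr.data$metric)

# sapply() collapses each list element to its sum, keeping the names,
# so the result is a named numeric vector keyed by metric.
combined <- sapply(by.metric, sum)

# Look up totals by metric name instead of by row position.
print(combined[["a"]])
print(combined[c("b", "e")])
```

Because the lookup is by name, there is no need to sort the data or to rely on control and exposed rows lining up by index.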

--sundar

On Mon, Oct 12, 2009 at 6:09 AM, Michael Pearmain <mpearmain at google.com> wrote:
> Hi All,
>
> I have a question about associative list mappings in R, and whether they are
> possible.
>
> I have data in the form shown below, and want to make a new 'bucket' variable
> called combined, which is the sum of the control and the exposed metric
> values.
> This combined variable is a many-to-many matching, as values only appear in
> the file if they have a value > 0.
>
> conversion.type   filteredID   bucketID   Metric   Value
> counter           true         control    a        1
> counter           true         control    b        1
> counter           true         control    c        2
> counter           true         control    d        3
> counter           true         exposed    a        4
> counter           true         exposed    e        1
>
> ASIDE:
>
> At the minute I read the data into my file and then create all the
> 'missing' row values
> (in this case,
> counter           true         control    e        0
> counter           true         exposed    b        0
> counter           true         exposed    c        0
> counter           true         exposed    d        0)
>
>
> and then run a sort on the data, count the number of times control
> appears, and use this as an index matcher.
>
> saw.aggr.data <- saw.aggr.data[order(saw.aggr.data$bucketID,
>                                      saw.aggr.data$metric), ]
> no.of.metrics <- length(saw.aggr.data$bucketID[grep("control",
>                                                     saw.aggr.data$bucketID)])
>
> for (i in (1:no.of.metrics)) {
>   assign(paste("combined", as.character(saw.aggr.data$metric[i])),
>          (saw.aggr.data$value[i] + saw.aggr.data$value[i + no.of.metrics]))
> }
>
> This does what I want it to, but it is very fragile and could be open to large
> errors (error handling is currently via grepping that the name of metric[i]
> == the name of metric[i + no.of.metrics]).
>
> Is there a more powerful way of doing this using some kind of list mapping?
> I've looked at the older threads in this area and it looks like something
> that should be possible, but I can't figure out how to do it.
> Ideally I'd like a final dataset / list of the following form.
>
> conversion.type   filteredID   bucketID   Metric   Value
> counter           true         control    a        1
> counter           true         control    b        1
> counter           true         control    c        2
> counter           true         control    d        3
> counter           true         exposed    a        4
> counter           true         exposed    e        1
> counter           true         combined   a        5
> counter           true         combined   b        1
> counter           true         combined   c        2
> counter           true         combined   d        3
> counter           true         combined   e        1
>
> So I don't have to create the dummy variables.
>
> Does this make sense?
>
> Many thanks in advance
>
> Mike
>
>
>
> --
> Michael Pearmain
> "I abhor averages.  I like the individual case.  A man may have six meals
> one day and none the next, making an average of three meals per day, but
> that is not a good way to live.  ~Louis D. Brandeis"
>
> If you received this communication by mistake, please don't forward it to
> anyone else (it may contain confidential or privileged information), please
> erase all copies of it, including all attachments, and please let the sender
> know it went to the wrong person. Thanks.