[R] Reduce woes

Stefan Kruger stefan.kruger at gmail.com
Fri Jul 29 10:37:32 CEST 2016


Jeremiah -

neat - that's one step closer, but one small thing I still don't understand:

> data <- list(one = c(1, 1), three = c(3), two = c(2, 2))
> r = Reduce(function(acc, item) { append(acc, setNames(length(item),
names(item))) }, data, list())
> str(r)
List of 3
 $ : int 2
 $ : int 1
 $ : int 2

I wanted the names to remain, but it seems like the "data" parameter loses
its names when consumed by the Reduce()? If I print "item" inside the
reducing function, it's not got the names. I'm probably missing some
central tenet of R here.
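
A possible explanation, offered as a sketch rather than a definitive answer: Reduce() passes the function each element of the list (the equivalent of data[[i]]). The names belong to the list, not to its elements, so names(item) is NULL inside the function. One way around this, assuming it is acceptable to reduce over names(data) and look each value up by name, is:

```r
data <- list(one = c(1, 1), three = c(3), two = c(2, 2))

# Reduce over the names instead of the values, so each step knows its key.
r <- Reduce(function(acc, name) {
  acc[[name]] <- length(data[[name]])  # add one named entry to the list
  acc                                  # hand the accumulator to the next step
}, names(data), list())

str(r)
```

Here the accumulator starts as an empty list and gains one named element per step, so the names survive.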

As to your comment about this being lapply() implemented with Reduce() - as I
understand lapply() (or map() in other functional languages), it's limited
to returning a list/vector of the same length as the original. Consider
this contrived example:

> r = Reduce(function(acc, item) { if (length(item) > 1) {append(acc,
setNames(length(item), names(item)))} }, data, list())
> str(r)
 int 2
> r
[1] 2

I don't think you could achieve that with lapply()?
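
A note on the result above, as far as I can tell: an if without an else returns NULL when the condition is false, so the accumulator is reset to NULL at "three", and only the final 2 survives. A minimal sketch that always returns the accumulator, shown next to a Filter()-then-lapply() version for comparison (the variable names kept and kept2 are just illustrative):

```r
data <- list(one = c(1, 1), three = c(3), two = c(2, 2))

# Reduce version: return the accumulator on every step, including the
# steps where the item is skipped.
kept <- Reduce(function(acc, name) {
  n <- length(data[[name]])
  if (n > 1) acc[[name]] <- n
  acc
}, names(data), list())

# Filter-then-map version: drop the short vectors first, then take lengths.
kept2 <- lapply(Filter(function(v) length(v) > 1, data), length)

str(kept)
str(kept2)
```

Both versions give a shorter list than the input, so lapply() alone cannot do it, but lapply() combined with Filter() can.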

Thanks

Stefan


On 28 July 2016 at 20:19, jeremiah rounds <roundsjeremiah at gmail.com> wrote:

> Basically using Reduce as an lapply in that example, but I think that was
> caused by how people started talking about things in the first place =) But
> the point is the accumulator can be anything as far as I can tell.
>
> On Thu, Jul 28, 2016 at 12:14 PM, jeremiah rounds <
> roundsjeremiah at gmail.com> wrote:
>
>> Re:
>> "What I'm trying to
>> work out is how to have the accumulator in Reduce not be the same type as
>> the elements of the vector/list being reduced - ideally it could be an S3
>> instance, list, vector, or data frame."
>>
>> Pretty sure the accumulator does not have to be the same type as the
>> elements.  See the code that follows.  I would never solve this task in
>> this way though, so no comment on the use of Reduce for what you
>> described.  (Note the accumulation of "functions" in a list is just a
>> demo of possibilities.)  You could accumulate in an environment too and
>> potentially gain a lot of copy efficiency.
>>
>>
>> lookup = list()
>> lookup[[as.character(1)]] = function() print("1")
>> lookup[[as.character(2)]] = function() print("2")
>> lookup[[as.character(3)]] = function() print("3")
>>
>> data = list(c(1,2), c(1,4), c(3,3), c(2,30))
>>
>>
>> r = Reduce(function(acc, item) {
>> append(acc, list(lookup[[as.character(min(item))]]))
>> }, data,list())
>> r
>> for(f in r) f()
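
The environment idea mentioned above could look roughly like this. It is a sketch under assumptions: the data list is the named one from earlier in the thread, and acc_env and result are made-up names.

```r
data <- list(one = c(1, 1), three = c(3), two = c(2, 2))

# The accumulator is an environment. Environments are modified in place,
# so nothing is copied from one step to the next.
acc_env <- Reduce(function(acc, name) {
  assign(name, length(data[[name]]), envir = acc)
  acc
}, names(data), new.env())

# Convert back to a named list; ls() returns the names in sorted order.
result <- mget(ls(acc_env), envir = acc_env)
str(result)
```

Whether this is actually faster than appending to a list would depend on the size of the data; for three elements it makes no practical difference.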
>>
>>
>> On Thu, Jul 28, 2016 at 5:09 AM, Stefan Kruger <stefan.kruger at gmail.com>
>> wrote:
>>
>>> Ulrik - many thanks for your reply.
>>>
>>> I'm aware of many simple solutions such as the one you suggest, both
>>> iterative and functional style - but I'm trying to learn how to bend
>>> Reduce() for the purpose of using it in more complex processing tasks.
>>> What I'm trying to work out is how to have the accumulator in Reduce not
>>> be the same type as the elements of the vector/list being reduced -
>>> ideally it could be an S3 instance, list, vector, or data frame.
>>>
>>> Here's a more realistic example (in Elixir, sorry)
>>>
>>> Given two lists:
>>>
>>> 1. data: maps an id string to a vector of revision strings
>>> 2. dict: maps known id/revision pairs as a string to true (or 1)
>>>
>>> find the items in data not already in dict, returned as a named list.
>>>
>>> ```elixir
>>> data = %{
>>>     "id1" => ["rev1.1", "rev1.2"],
>>>     "id2" => ["rev2.1"],
>>>     "id3" => ["rev3.1", "rev3.2", "rev3.3"]
>>> }
>>>
>>> dict = %{
>>>     "id1/rev1.1" => 1,
>>>     "id1/rev1.2" => 1,
>>>     "id3/rev3.1" => 1
>>> }
>>>
>>> # Find the items in data not already in dict. Return as a grouped map
>>>
>>> Map.keys(data)
>>>     |> Enum.flat_map(fn id -> Enum.map(data[id], fn rev -> {id, rev} end)
>>> end)
>>>     |> Enum.filter(fn {id, rev} -> !Dict.has_key?(dict, "#{id}/#{rev}")
>>> end)
>>>     |> Enum.reduce(%{}, fn ({k, v}, d) -> Map.update(d, k, [v], &[v|&1])
>>> end)
>>> ```
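
For comparison, a rough R counterpart of the Elixir pipeline above, using Reduce() with a list accumulator. This is a sketch under assumptions: dict is represented as a plain character vector of known "id/rev" keys instead of a map, and missing_revs is a made-up name.

```r
data <- list(
  id1 = c("rev1.1", "rev1.2"),
  id2 = c("rev2.1"),
  id3 = c("rev3.1", "rev3.2", "rev3.3")
)

# Assumption: known id/revision pairs stored as a character vector of keys.
dict <- c("id1/rev1.1", "id1/rev1.2", "id3/rev3.1")

# Accumulate a named list of the revisions that are not yet in dict.
missing_revs <- Reduce(function(acc, id) {
  revs <- data[[id]]
  new_revs <- revs[!(paste(id, revs, sep = "/") %in% dict)]
  if (length(new_revs) > 0) acc[[id]] <- new_revs
  acc
}, names(data), list())

str(missing_revs)
```

The revisions within each id keep their original order here, whereas the Elixir version prepends and so reverses them.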
>>>
>>>
>>>
>>>
>>> On 28 July 2016 at 12:03, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
>>>
>>> > Hi Stefan,
>>> >
>>> > in that case, lapply(data, length) should do the trick.
>>> >
>>> > Best wishes,
>>> > Ulrik
>>> >
>>> > On Thu, 28 Jul 2016 at 12:57 Stefan Kruger <stefan.kruger at gmail.com>
>>> > wrote:
>>> >
>>> >> David - many thanks for your response.
>>> >>
>>> >> What I tried to do was to turn
>>> >>
>>> >> data <- list(one = c(1, 1), three = c(3), two = c(2, 2))
>>> >>
>>> >> into
>>> >>
>>> >> result <- list(one = 2, three = 1, two = 2)
>>> >>
>>> >> that is creating a new list which has the same names as the first, but
>>> >> where the values are the vector lengths.
>>> >>
>>> >> I know there are many other (and better) trivial ways of achieving
>>> >> this - my aim is less the task itself, and more figuring out if this
>>> >> can be done using Reduce() in the fashion I showed in the other
>>> >> examples I gave. It's a building block of doing map-filter-reduce type
>>> >> pipelines that I'd like to understand how to do in R.
>>> >>
>>> >> Fumbling in the dark, I tried:
>>> >>
>>> >> Reduce(function(acc, item) { setNames(c(acc, length(data[item])), item) },
>>> >> names(data), accumulate=TRUE)
>>> >>
>>> >> but setNames sets all the names, not adding one - and acc is still a
>>> >> vector, not a list.
>>> >>
>>> >> It looks like lambda.tools::fold() and possibly purrr::reduce() aim at
>>> >> doing what I'd like to do - but I've not been able to figure out quite
>>> >> how.
>>> >>
>>> >> Thanks
>>> >>
>>> >> Stefan
>>> >>
>>> >>
>>> >>
>>> >> On 27 July 2016 at 20:35, David Winsemius <dwinsemius at comcast.net>
>>> wrote:
>>> >>
>>> >> >
>>> >> > > On Jul 27, 2016, at 8:20 AM, Stefan Kruger <
>>> stefan.kruger at gmail.com>
>>> >> > wrote:
>>> >> > >
>>> >> > > Hi -
>>> >> > >
>>> >> > > I'm new to R.
>>> >> > >
>>> >> > > In other functional languages I'm familiar with you can often seed
>>> >> > > a call to reduce() with a custom accumulator. Here's an example in
>>> >> > > Elixir:
>>> >> > >
>>> >> > > map = %{"one" => [1, 1], "three" => [3], "two" => [2, 2]}
>>> >> > > map |> Enum.reduce(%{}, fn ({k,v}, acc) -> Map.put(acc, k,
>>> >> > > Enum.count(v)) end)
>>> >> > > # %{"one" => 2, "three" => 1, "two" => 2}
>>> >> > >
>>> >> > > In R-terms that's reducing a list of vectors to become a new list
>>> >> > > mapping the names to the vector lengths.
>>> >> > >
>>> >> > > Even in JavaScript, you can do similar things:
>>> >> > >
>>> >> > > list = { one: [1, 1], three: [3], two: [2, 2] };
>>> >> > > var result = Object.keys(list).reduceRight(function (acc, item) {
>>> >> > >  acc[item] = list[item].length;
>>> >> > >  return acc;
>>> >> > > }, {});
>>> >> > > // result == { two: 2, three: 1, one: 2 }
>>> >> > >
>>> >> > > In R, from what I can gather, Reduce() is restricted such that any
>>> >> > > init value you feed it is required to be of the same type as the
>>> >> > > elements of the vector you're reducing -- so I can't build up. So
>>> >> > > whilst I can do, say
>>> >> > >
>>> >> > >> Reduce(function(acc, item) { acc + item }, c(1,2,3,4,5), 96)
>>> >> > > [1] 111
>>> >> > >
>>> >> > > I can't use Reduce to build up a list, vector or data frame?
>>> >> > >
>>> >> > > What am I missing?
>>> >> > >
>>> >> > > Many thanks for any pointers,
>>> >> >
>>> >> > This builds a list:
>>> >> >
>>> >> > > Reduce(function(acc, item) { c(acc , item) }, c(1,2,3,4,5), 96,
>>> >> > accumulate=TRUE)
>>> >> > [[1]]
>>> >> > [1] 96
>>> >> >
>>> >> > [[2]]
>>> >> > [1] 96  1
>>> >> >
>>> >> > [[3]]
>>> >> > [1] 96  1  2
>>> >> >
>>> >> > [[4]]
>>> >> > [1] 96  1  2  3
>>> >> >
>>> >> > [[5]]
>>> >> > [1] 96  1  2  3  4
>>> >> >
>>> >> > [[6]]
>>> >> > [1] 96  1  2  3  4  5
>>> >> >
>>> >> > But you are not saying what you want. The other examples were doing
>>> >> > something with names but you provided no names for the R example.
>>> >> >
>>> >> > This would return a list of named vectors:
>>> >> >
>>> >> > > Reduce(function(acc, item) { setNames( c(acc,item), 1:(item+1)) },
>>> >> > c(1,2,3,4,5), 96, accumulate=TRUE)
>>> >> > [[1]]
>>> >> > [1] 96
>>> >> >
>>> >> > [[2]]
>>> >> >  1  2
>>> >> > 96  1
>>> >> >
>>> >> > [[3]]
>>> >> >  1  2  3
>>> >> > 96  1  2
>>> >> >
>>> >> > [[4]]
>>> >> >  1  2  3  4
>>> >> > 96  1  2  3
>>> >> >
>>> >> > [[5]]
>>> >> >  1  2  3  4  5
>>> >> > 96  1  2  3  4
>>> >> >
>>> >> > [[6]]
>>> >> >  1  2  3  4  5  6
>>> >> > 96  1  2  3  4  5
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > > Stefan
>>> >> > >
>>> >> > >
>>> >> > >
>>> >> > > --
>>> >> > > Stefan Kruger <stefan.kruger at gmail.com>
>>> >> > >
>>> >> > >       [[alternative HTML version deleted]]
>>> >> > >
>>> >> > > ______________________________________________
>>> >> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> >> > > https://stat.ethz.ch/mailman/listinfo/r-help
>>> >> > > PLEASE do read the posting guide
>>> >> > http://www.R-project.org/posting-guide.html
>>> >> > > and provide commented, minimal, self-contained, reproducible code.
>>> >> >
>>> >> > David Winsemius
>>> >> > Alameda, CA, USA
>>> >> >
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >
>>>
>>>
>>>
>>
>>
>


-- 
Stefan Kruger <stefan.kruger at gmail.com>
