[R] a question on list manipulation

Brian Diggs diggsb at ohsu.edu
Mon Aug 8 22:51:15 CEST 2011


On 8/6/2011 9:21 AM, zhenjiang xu wrote:
> Unfortunately the list names of my real data are irregular with mixed
> digit and letters at the end. This is good idea though. It inspired me
> to give another solution based on that:
>
>> x<- list(A=c("d", "e", "f"), B=c("d", "e"), C=c("d","g"))
>> tmp<- unlist(x, use.names=FALSE)
>> a = unlist(lapply(x, length))
>> tmp2 = rep(names(a), a)
>> x.new = split(tmp2, tmp)
>
> And I tested it on my data. It took over an hour using for loops while
> finishing in a second with the vectorization. Thanks all of you.
> Hooray~

Coming at this late, and after you already have a solution, but here is 
one using plyr:

library("plyr")

x <- list(A=c("d", "e", "f"), B=c("d", "e"), C=c("d"))

tmp <- ldply(x, function(x) {data.frame(v=unlist(x))})
dlply(tmp, .(v), function(x) {x[[".id"]]})

Or it could be combined into a single expression:

dlply(ldply(x, function(x) {data.frame(v=unlist(x))}),
      .(v), function(x) {x[[".id"]]})

The results carry a few extra attributes that you don't necessarily need, 
but they don't hurt anything.  I don't know how these compare timing-wise 
with the other solutions.
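
For a rough sense of timing, something like the following sketch could be 
used. It compares only a brute-force loop against the base R split() 
approach from earlier in the thread, not the plyr versions. The synthetic 
data, its size, and the helper names invert_loop and invert_split are all 
made up for illustration.

```r
set.seed(1)

# Hypothetical test data: 2000 named groups, each holding 5 of 500 keys.
keys <- paste0("k", 1:500)
big <- replicate(2000, sample(keys, 5), simplify = FALSE)
names(big) <- paste0("g", seq_along(big))

# Brute force: visit every (name, element) pair and grow the result.
invert_loop <- function(x) {
  out <- list()
  for (nm in names(x)) {
    for (v in x[[nm]]) {
      out[[v]] <- c(out[[v]], nm)
    }
  }
  out
}

# Vectorized: repeat each name once per element, then split by element.
invert_split <- function(x) {
  split(rep(names(x), lengths(x)), unlist(x, use.names = FALSE))
}

t_loop  <- system.time(res_loop  <- invert_loop(big))
t_split <- system.time(res_split <- invert_split(big))

print(rbind(loop = t_loop, split = t_split))
```

The two results hold the same content, but the loop orders its names by 
first appearance while split() sorts them, so reorder one before 
comparing them.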

The basic logic is to turn the list into a data.frame with an ".id" column 
(the names of the original list) and a "v" column (the entries of the 
original list), then re-aggregate by "v", listing the ".id" values for each.
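
The same long-format logic can be sketched in base R without plyr. This 
is only an illustration of the idea, not what ldply/dlply do internally; 
the variable name "long" is made up here, and lengths() assumes 
R >= 3.2.0.

```r
x <- list(A = c("d", "e", "f"), B = c("d", "e"), C = c("d"))

# Step 1: long format, one row per (name, entry) pair.
long <- data.frame(
  .id = rep(names(x), lengths(x)),   # original list names, repeated
  v   = unlist(x, use.names = FALSE), # entries of the original list
  stringsAsFactors = FALSE
)

# Step 2: re-aggregate by entry, collecting the names for each one.
x.new <- split(long$.id, long$v)
x.new
```

For the example list this gives "d" mapped to A, B, C; "e" to A, B; and 
"f" to A, without the extra attributes.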

> On Fri, Aug 5, 2011 at 3:31 PM, Greg Snow<Greg.Snow at imail.org>  wrote:
>> Here is one approach; whether it is better than the basic loop or not is up to you:
>>
>>> x<- list(A=c("d", "e", "f"), B=c("d", "e"), C=c("d"))
>>>
>>> tmp<- unlist(x)
>>> tmp2<- sub( '[0-9]+$', '', names(tmp) )
>>>
>>> x.new<- split( tmp2, tmp )
>>> x.new
>> $d
>> [1] "A" "B" "C"
>>
>> $e
>> [1] "A" "B"
>>
>> $f
>> [1] "A"
>>
>>
>> Of course this version will have some problems if the names of your list elements end with digits that you don't want stripped off (but you can work around that by preprocessing the list names).
>>
>> --
>> Gregory (Greg) L. Snow Ph.D.
>> Statistical Data Center
>> Intermountain Healthcare
>> greg.snow at imail.org
>> 801.408.8111
>>
>>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>>> On Behalf Of zhenjiang xu
>>> Sent: Friday, August 05, 2011 11:04 AM
>>> To: Duncan Murdoch
>>> Cc: r-help
>>> Subject: Re: [R] a question on list manipulation
>>>
>>> Exactly! Sorry for causing the misunderstanding. The uppercase/lowercase is
>>> only a toy example (and a bad one; yours is better than mine). My
>>> question is a more general one: a list is basically a one-to-many
>>> matching, from the names of a list to the elements belonging to each
>>> name. I'd like to reverse the matching, from all the elements to the
>>> names of the list.
>>>
>>> On Fri, Aug 5, 2011 at 12:53 PM, Duncan Murdoch
>>> <murdoch.duncan at gmail.com>  wrote:
>>>> On 05/08/2011 12:05 PM, zhenjiang xu wrote:
>>>>>
>>>>> Hi R users,
>>>>>
>>>>> I have a list:
>>>>>>   x
>>>>> $A
>>>>> [1] "a"  "b"  "c"
>>>>> $B
>>>>> [1] "b"  "c"
>>>>> $C
>>>>> [1] "c"
>>>>>
>>>>> I want to convert it to a lowercase-to-uppercase list like this:
>>>>>>   y
>>>>> $a
>>>>> [1] "A"
>>>>> $b
>>>>> [1] "A"  "B"
>>>>> $c
>>>>> [1] "A"  "B"  "C"
>>>>>
>>>>> In a word, I want to reverse the list names and the elements under
>>>>> each list name. Is there any quick way to do that? Thanks
>>>>
>>>> I interpreted this question differently from the others, and your
>>>> example is ambiguous as to which is the right interpretation.  I
>>>> thought you wanted to swap names and elements, so
>>>>
>>>>> x<- list(A=c("d", "e", "f"), B=c("d", "e"), C=c("d"))
>>>>> x
>>>> $A
>>>> [1] "d" "e" "f"
>>>>
>>>> $B
>>>> [1] "d" "e"
>>>>
>>>> $C
>>>> [1] "d"
>>>>
>>>> would become
>>>>
>>>>> list(d=c("A", "B", "C"), e=c("A", "B"), f="A")
>>>> $d
>>>> [1] "A" "B" "C"
>>>>
>>>> $e
>>>> [1] "A" "B"
>>>>
>>>> $f
>>>> [1] "A"
>>>>
>>>> I don't know a slick way to do this; I'd just do it by brute force,
>>>> looping over the names of x.
>>>>
>>>> Duncan Murdoch
>>>>
>>>
>>>
>>>
>>> --
>>> Best,
>>> Zhenjiang
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>


-- 
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University


