[R] Subsetting a list of lists using lapply

William Dunlap wdunlap at tibco.com
Fri Feb 20 21:08:31 CET 2015


The elNamed(x, name) function (from the methods package) can simplify this
code a bit.  The following gives the same result as David W's get_shas()
for the sample dataset provided:

   get_shas2 <- function (input) {
      lapply(input, function(el) elNamed(elNamed(el, "content")[[1]], "sha")[1])
   }
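
For anyone trying this on the full dataset, here is a minimal sketch of a
slightly more defensive variant. It assumes elNamed() is available in your
version of the methods package. The toy list and the names get_shas_safe()
and has_sha are made up for illustration and are not part of the original
data or of David's code; the toy only mimics the three shapes Aron
described (a list, a character vector, and an empty list()).

```r
library(methods)  # elNamed() lives in the methods package

# Hypothetical miniature of the input structure; the sha values are made up.
toy <- list(
  list(content = list(list(sha = "aaa111",
                           commit = list(tree = list(sha = "tree01")),
                           parents = list(list(sha = "par001"))))),
  list(content = list("just a character vector")),
  list(content = list())
)

# Variant of get_shas2() that checks for an empty 'content' before indexing,
# because list()[[1]] would stop with "subscript out of bounds".
get_shas_safe <- function(input) {
  lapply(input, function(el) {
    content <- elNamed(el, "content")
    if (length(content) == 0L) return(NULL)
    # elNamed() returns NULL when the name is absent, and NULL[1] is NULL
    elNamed(content[[1]], "sha")[1]
  })
}

shas <- get_shas_safe(toy)

# Logical index of the elements that actually yielded a top-level "sha"
has_sha <- !vapply(shas, is.null, logical(1))
```

Because elNamed() matches names exactly and looks only one level down, the
nested commit$tree$sha and parents[[1]]$sha entries are never picked up;
has_sha shows which elements of the input contributed a value.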

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Feb 20, 2015 at 10:56 AM, David Winsemius <dwinsemius at comcast.net>
wrote:

>
> On Feb 20, 2015, at 6:13 AM, Aron Lindberg wrote:
>
> > Hmm…Chuck’s solution may actually be problematic because there are
> > several entries that are called “sha” at the deepest level but that
> > should not be included, such as:
> >
> > input[[67]]$content[[1]]$commit$tree$sha
> >
> >
> > and
> >
> > input[[67]]$content[[1]]$parents[[1]]$sha
> >
> > Only the “sha” entries that fit the following subsetting pattern
> > should be included:
> >
> >
> > input[[i]]$content[[1]]$sha[1]
> >
> >
> > It’s getting thornier!
> >
> > To be fair to Rolf’s solution (which probably can be updated to solve
> the problem), I’ve posted the complete dput here:
> >
> >
> https://gist.githubusercontent.com/aronlindberg/92700c04c88ff112e4f7/raw/0f3cd8468f4dc82267be3cec72d53a7a04f5c449/dput.R
>
> I didn't try on the larger example, but this works on the smaller one:
>
>  get_shas <- function(input){
>         x <- lapply(input, "[[", "content")
>         y <- lapply(x, "[[", 1)
>         # keep "sha" where that name is present; other elements yield NULL
>         lapply(y, function(yy) if ("sha" %in% names(yy)) yy[["sha"]])
>         }
>       sha_lists <- get_shas(input)
>
> It does deliver an entry for every element of the input object, which is
> either the value of "sha" or NULL. I think that is not a bad thing because
> it lets you figure out where the values are coming from.
>
> >
> > --
> > Aron Lindberg
> > Doctoral Candidate, Information Systems
> > Weatherhead School of Management
> > Case Western Reserve University
> > aronlindberg.github.io
> >
> > On Fri, Feb 20, 2015 at 8:25 AM, Aron Lindberg <aron.lindberg at case.edu>
> > wrote:
> >
> >> Thanks Chuck and Rolf.
> >> While Rolf’s code also works on the dput that I actually gave you (a
> >> smaller subset of the full dataset), it failed to work on the larger
> >> dataset, because there are further exceptions:
> >> input[[i]]$content[[1]] is sometimes a list, sometimes a character
> >> vector, and sometimes input[[i]]$content simply returns list().
> >> Chuck’s solution however bypasses this and works on the full dataset
> >> (which was 8 MB, which is why I didn’t upload it as a gist).
> >> Best,
> >> Aron
> >> --
> >> Aron Lindberg
> >> Doctoral Candidate, Information Systems
> >> Weatherhead School of Management
> >> Case Western Reserve University
> >> aronlindberg.github.io
> >> On Fri, Feb 20, 2015 at 12:44 AM, Charles Berry <ccberry at ucsd.edu>
> wrote:
> >>> Aron Lindberg <aron.lindberg <at> case.edu> writes:
> >>>>
> >>>> Hi Everyone,
> >>>>
> >>>> I'm working on a thorny subsetting problem involving a list of lists.
> >>>> I've put a dput of the data here:
> >>>>
> >>>> https://gist.githubusercontent.com/aronlindberg/b916dee897d051ac5be5/raw/a78cbf873a7e865c3173f943ff6309ea688c653b/dput
> >>>>
> >>> IIUC, you want the value of every list element that is named "sha" and
> >>> that name will only apply to atomic objects.
> >>> If so, this should do it.
> >>>> input <- dget("/tmp/dpt")
> >>>> shas <- unlist( input, use.names=FALSE )[ grepl( "sha", names(unlist(input)))]
> >>>> input[[67]]$content[[1]]$sha
> >>> [1] "58cf43ecdc1beb7e1043e9de612ecc817b090f15"
> >>>> which(input[[67]]$content[[1]]$sha == shas )
> >>> [1] 194
> >>> HTH,
> >>> Chuck
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA