[R] Subsetting a list of lists using lapply
David Winsemius
dwinsemius at comcast.net
Fri Feb 20 19:56:01 CET 2015
On Feb 20, 2015, at 6:13 AM, Aron Lindberg wrote:
> Hmm… Chuck’s solution may actually be problematic, because several entries that are called “sha” at the deepest level should not be included, such as:
>
> input[[67]]$content[[1]]$commit$tree$sha
>
>
> and
>
> input[[67]]$content[[1]]$parents[[1]]$sha
>
> Only the “sha” entries that fit the following subsetting pattern should be included:
>
>
> input[[i]]$content[[1]]$sha[1]
>
>
> It’s getting thornier!
>
> To be fair to Rolf’s solution (which probably can be updated to solve the problem), I’ve posted the complete dput here:
>
> https://gist.githubusercontent.com/aronlindberg/92700c04c88ff112e4f7/raw/0f3cd8468f4dc82267be3cec72d53a7a04f5c449/dput.R
I didn't try on the larger example, but this works on the smaller one:
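get_shas <- function(input){
  # Pull the "content" element out of each top-level entry
  x <- lapply(input, "[[", "content")
  # Take the first element of each "content" list
  y <- lapply(x, "[[", 1)
  # Keep the "sha" directly under content[[1]] where one exists, otherwise NA
  lapply(y, function(yy) if ("sha" %in% names(yy)) yy[["sha"]] else NA)
}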
sha_lists <- get_shas(input)
It delivers one entry for every top-level element of the input object, which is either the value of "sha" or NA. I think that is not a bad thing, because it lets you figure out where the values are coming from.
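
For the larger dataset, where input[[i]]$content[[1]] is sometimes a list, sometimes a character vector, and input[[i]]$content is sometimes list(), a more defensive variant might look like the sketch below. It is untested on the full data. The helper name get_sha_safe and the toy values are hypothetical, and it assumes that a character-vector content[[1]] should count as "no sha".

```r
# Hypothetical helper (not from the thread): return content[[1]]$sha[1], or NA
get_sha_safe <- function(entry) {
  content <- entry[["content"]]
  # content can be list() (or absent) in the full data
  if (length(content) == 0) return(NA_character_)
  first <- content[[1]]
  # Assumption: a non-list content[[1]] (e.g. a character vector) has no sha
  if (!is.list(first) || is.null(first[["sha"]])) return(NA_character_)
  as.character(first[["sha"]])[1]
}

# Made-up entries covering the three cases described in the thread
toy <- list(
  list(content = list(list(sha = "aaa",
                           commit = list(tree = list(sha = "bbb"))))),
  list(content = list("just a string")),
  list(content = list())
)

shas <- vapply(toy, get_sha_safe, character(1))
```

Using vapply() with character(1) returns a plain character vector with one slot per input element, so the position of each NA still tells you which entry had no top-level "sha".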
>
> --
> Aron Lindberg
> Doctoral Candidate, Information Systems
> Weatherhead School of Management
> Case Western Reserve University
> aronlindberg.github.io
>
> On Fri, Feb 20, 2015 at 8:25 AM, Aron Lindberg <aron.lindberg at case.edu>
> wrote:
>
>> Thanks Chuck and Rolf.
>> While Rolf’s code also works on the dput that I actually gave you (a smaller subset of the full dataset), it failed to work on the larger dataset, because there are further exceptions:
>> input[[i]]$content[[1]] is sometimes a list, sometimes a character vector, and sometimes input[[i]]$content simply returns list().
>> Chuck’s solution, however, bypasses this and works on the full dataset (which was 8 MB, which is why I didn’t upload it as a gist).
>> Best,
>> Aron
>> On Fri, Feb 20, 2015 at 12:44 AM, Charles Berry <ccberry at ucsd.edu> wrote:
>>> Aron Lindberg <aron.lindberg <at> case.edu> writes:
>>>>
>>>> Hi Everyone,
>>>>
>>>> I'm working on a thorny subsetting problem involving a list of lists. I've put a dput of the data here:
>>>>
>>>> https://gist.githubusercontent.com/aronlindberg/b916dee897d051ac5be5/raw/a78cbf873a7e865c3173f943ff6309ea688c653b/dput
>>>>
>>> IIUC, you want the value of every list element that is named "sha" and
>>> that name will only apply to atomic objects.
>>> If so, this should do it.
>>>> input <- dget("/tmp/dpt")
>>>> shas <- unlist( input, use.names=FALSE )[ grepl( "sha", names(unlist(input)))]
>>>> input[[67]]$content[[1]]$sha
>>> [1] "58cf43ecdc1beb7e1043e9de612ecc817b090f15"
>>>> which(input[[67]]$content[[1]]$sha == shas )
>>> [1] 194
>>> HTH,
>>> Chuck
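
The over-matching that Aron describes can be reproduced on a toy entry. This is a minimal sketch with made-up values (not taken from the gist). It relies on unlist() building names by joining the named levels with dots, and it assumes each "sha" is a single string.

```r
# Toy entry mimicking the structure described in the thread (values are made up)
toy <- list(
  list(content = list(list(
    sha     = "aaa",
    commit  = list(tree = list(sha = "bbb")),
    parents = list(list(sha = "ccc"))
  )))
)

flat <- unlist(toy)

# Substring match: picks up every flattened name that contains "sha"
loose <- unname(flat[grepl("sha", names(flat))])

# Exact match on the flattened name: keeps only content[[1]]$sha
strict <- unname(flat[names(flat) == "content.sha"])
```

Here loose holds all three values, while strict holds only the one directly under content[[1]]. If a "sha" element had length greater than one, unlist() would append sequence numbers to the name, so the exact match would need adjusting.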
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
David Winsemius
Alameda, CA, USA