[R] problem extracting data from a set of list vectors
MacQueen, Don
macqueen1 at llnl.gov
Thu Apr 19 20:37:51 CEST 2012
This looks like a correct correction.
Thanks
-Don
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
On 4/19/12 10:14 AM, "jim holtman" <jholtman at gmail.com> wrote:
>I think that instead of:
>
>obj = all.comps[[i]];
>
>you should have
>
>obj <- get(i)
>
>Test out your programs step by step manually. Use the 'all.comps'
>object and see what happens with the various indexing modes. This is
>"debugging 101".
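To make the indexing modes concrete, here is a minimal sketch using a toy stand-in for one of the res.* lists. The object contents are made up for illustration; only the names mirror the thread. Because ls() returns a plain character vector of names, indexing that vector by a name does not reach the data, whereas get() looks the name up and returns the object itself.

```r
# Toy stand-in for one result list (hypothetical contents)
res.Callus.Explant <- list(
  counts   = matrix(1:4, nrow = 2, dimnames = list(c("geneA", "geneB"), NULL)),
  p.values = c(0.01, 0.50)
)

# A character vector of object names, as ls(pattern = "^res") would return
all.comps <- "res.Callus.Explant"
i <- all.comps[1]                  # i is the name, not the list

# [[ ]] with a character subscript looks for an element *named* i;
# all.comps has no names, so this raises an error
by.double <- tryCatch(all.comps[[i]], error = function(e) "error")

# [ ] with a character subscript returns NA for an unmatched name
by.single <- all.comps[i]

# get() retrieves the object that the name refers to
obj <- get(i)

is.list(obj)
rownames(obj$counts)
```

Running each of the three forms by hand on a single name is usually enough to see which one returns the list.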
>
>On Thu, Apr 19, 2012 at 1:01 PM, Vining, Kelly
><Kelly.Vining at oregonstate.edu> wrote:
>> Thanks for the help, Don. Lots of good suggestions there.
>> Unfortunately, I'm still not able to access the data object. Still
>> looking for a solution. Here's the error I'm getting when I try your
>> suggestion:
>>
>> [1] "res.Callus.Explant" "res.Callus.Regen" "res.Explant.Regen"
>>> all.comps <- ls(pattern="^res")
>>> for(i in all.comps){
>> + obj = all.comps[[i]];
>> + gene.ids = rownames(obj$counts);
>> + x = data.frame(gene.ids = gene.ids, obj$e1, obj$e2, obj$log.fc,
>> + obj$p.value, obj$q.value);
>> + x = subset(x, x$obj.p.value<0.05 | x$obj.q.value<=0.1);
>> + cat("output object name is: ",paste("Diffgenes",i,sep="."),"\n");
>> + cat("output object data is: \n");
>> + print(tmp);
>> + cat("\n");
>> + }
>> Error in all.comps[[i]] : subscript out of bounds
>>
>>
>> In response to another helpful suggestion, here's the structure of this
>> data list:
>>
>>
>>> str(res.Callus.Explant)
>> List of 18
>> $ name : chr "two group comparison"
>> $ group1 : chr "Callus"
>> $ group2 : chr "Explant"
>> $ alternative : chr "two.sided"
>> $ rows : int [1:39009] 1 2 3 4 5 6 7 8 9 10 ...
>> $ counts : num [1:39009, 1:6] 0 121 237 6 7 116 6 2 860 0 ...
>> ..- attr(*, "dimnames")=List of 2
>> .. ..$ : chr [1:39009] "POPTR_0018s00200" "POPTR_0008s00200" "POPTR_0004s00200" "POPTR_0019s00200" ...
>> .. ..$ : chr [1:6] "Callus_BiolRep1" "Callus_BiolRep2" "Callus_BiolRep3" "Explant_BiolRep1" ...
>> $ eff.lib.sizes: Named num [1:6] 3120288 2788297 2425164 3653109 3810261 ...
>> ..- attr(*, "names")= chr [1:6] "V3" "V4" "V5" "V6" ...
>> $ dispersion : num [1:39009, 1:6] NA 0.0743 0.0434 0.6423 0.3554 ...
>> ..- attr(*, "dimnames")=List of 2
>> .. ..$ : chr [1:39009] "POPTR_0018s00200" "POPTR_0008s00200" "POPTR_0004s00200" "POPTR_0019s00200" ...
>> .. ..$ : chr [1:6] "Callus_BiolRep1" "Callus_BiolRep2" "Callus_BiolRep3" "Explant_BiolRep1" ...
>> $ x : num [1:6, 1:2] 1 1 1 1 1 1 1 1 1 0 ...
>> ..- attr(*, "dimnames")=List of 2
>> .. ..$ : chr [1:6] "Callus" "Callus" "Callus" "Explant" ...
>> .. ..$ : chr [1:2] "Intercept" "Callus-Explant"
>> $ beta0 : num [1:2] NA 0
>> $ beta.hat : num [1:39009, 1:2] NA -10.13 -9.65 -13 -12.2 ...
>> $ beta.tilde : num [1:39009, 1:2] NA -10.26 -9.74 -13.11 -12.33 ...
>> $ e : num [1:39009] NA 35.08 58.82 2.03 4.43 ...
>> $ e1 : num [1:39009] NA 30.23 53.77 1.78 3.89 ...
>> $ e2 : num [1:39009] NA 39.83 64.46 2.27 5.01 ...
>> $ log.fc : num [1:39009] NA 0.398 0.262 0.353 0.366 ...
>> $ p.values : num [1:39009] NA 0.246 0.33 0.748 0.645 ...
>> $ q.values : num [1:39009] NA 1 1 1 1 1 1 1 1 1 ...
>>
>> ________________________________________
>> From: MacQueen, Don [macqueen1 at llnl.gov]
>> Sent: Wednesday, April 18, 2012 2:42 PM
>> To: Vining, Kelly; r-help at r-project.org
>> Subject: Re: [R] problem extracting data from a set of list vectors
>>
>> Try this (NOT tested) or something similar:
>>
>> all.comps <- ls(pattern="^res")
>>
>> for(i in all.comps) {
>> obj <- all.comps[[i]]
>> gene.ids <- rownames(obj$counts)
>> x <- data.frame(gene.ids = gene.ids, obj$counts,
>> obj$e1, obj$e2,
>> obj$log.fc, obj$p.value,
>> obj$q.value)
>> x <- subset(x, obj.p.value<0.05 | obj.q.value<=0.1)
>> assign( paste('DiffGenes',i,sep='.') , x, '.GlobalEnv')
>> }
>>
>> Before you try this, make sure you have a copy of everything, or can
>> reconstruct it. The assign() function is dangerous. With it you can
>> overwrite other data if you are not careful.
>>
>> You might test first; instead of using assign() as above, do
>> cat('output object name is: ', paste('DiffGenes',i,sep='.'),'\n')
>> cat('output object data is:\n')
>> print(x)
>> cat('\n')
>>
>>
>>
>> To explain a little:
>> i is the name of the data structure, not the data structure itself
>> you extract the data structure from all.comps using [[i]]
>>
>> The assign() function takes the output object (x in this case)
>> and writes it to the "global environment" using a name that is
>> constructed using paste().
>>
>> The global environment is the first place in your search path;
>> see search().
>>
>> Note the simplification of the subset() statement.
>>
>> You don't need semi-colons at the end of each line.
>>
>> When you construct x, you might find it helpful to name the rest of the
>> columns, not just the first one, instead of letting data.frame()
>> construct the names.
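As a rough illustration of naming the columns explicitly, the sketch below builds the data frame from a small made-up object. Here obj and its values are hypothetical; only the element names follow the str() output earlier in the thread. With explicit names, the subset() condition no longer depends on the names that data.frame() derives from the argument expressions, and rows whose p- and q-values are both NA are dropped, because subset() treats an NA condition as FALSE.

```r
# Hypothetical miniature of one result list
obj <- list(
  counts   = matrix(c(0, 121, 237), ncol = 1,
                    dimnames = list(c("g1", "g2", "g3"), "Callus_BiolRep1")),
  e1       = c(NA, 30.2, 53.8),
  e2       = c(NA, 39.8, 64.5),
  log.fc   = c(NA, 0.398, 0.262),
  p.values = c(NA, 0.01, 0.33),
  q.values = c(NA, 0.08, 1)
)

# Name every column rather than relying on constructed names
x <- data.frame(
  gene.ids = rownames(obj$counts),
  e1       = obj$e1,
  e2       = obj$e2,
  log.fc   = obj$log.fc,
  p.value  = obj$p.values,
  q.value  = obj$q.values,
  stringsAsFactors = FALSE
)

# subset() evaluates the condition inside x; NA conditions count as FALSE
sig <- subset(x, p.value < 0.05 | q.value <= 0.1)
sig$gene.ids
```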
>>
>> I re-wrapped the lines in the hopes that my email software will not
>> re-wrap them for me.
>>
>> --
>> Don MacQueen
>>
>> Lawrence Livermore National Laboratory
>> 7000 East Ave., L-627
>> Livermore, CA 94550
>> 925-423-1062
>>
>>
>>
>>
>>
>> On 4/18/12 1:13 PM, "Vining, Kelly" <Kelly.Vining at oregonstate.edu> wrote:
>>
>>>Dear useRs,
>>>
>>>A colleague has sent me several batches of output I need to process, and
>>>I'm struggling with the format to the point that I don't even know how to
>>>extract a test set to upload here. My apologies, but I think that my
>>>issue is straightforward enough (for some of you, not for me!) that you
>>>can help in the absence of a test set. Here is the scenario:
>>>
>>># Data sets are lists:
>>>> ls()
>>>[1] "res.Callus.Explant" "res.Callus.Regen" "res.Explant.Regen"
>>>> is.list(res.Callus.Explant)
>>>[1] TRUE
>>>
>>># The elements of each list look like this:
>>>> names(res.Callus.Explant)
>>> [1] "name" "group1" "group2" "alternative" "rows" "counts"
>>> [7] "eff.lib.sizes" "dispersion" "x" "beta0" "beta.hat" "beta.tilde"
>>>[13] "e" "e1" "e2" "log.fc" "p.values" "q.values"
>>>
>>>I want to 1) extract specific fields from this data structure into a data
>>>frame, 2) subset from this data frame into a new data frame based on
>>>selection criteria. What I've done is this:
>>>
>>>all.comps <- ls(pattern="^res")
>>>for(i in all.comps){
>>>obj = i;
>>>gene.ids = rownames(obj$counts);
>>>x = data.frame(gene.ids = gene.ids, obj$counts, obj$e1, obj$e2,
>>>obj$log.fc,
>>>obj$p.value, obj$q.value);
>>>DiffGenes.i = subset(x, x$obj.p.value<0.05 | x$obj.q.value<=0.1)
>>>}
>>>
>>>Obviously, this doesn't work because pattern searching in the first line
>>>is not feeding the entire data structure into the all.comps variable. But
>>>how can I accomplish feeding the whole data structure for each one of
>>>these lists into the loop? Should I be able to use sapply here? If so,
>>>how? Also, I suspect that "DiffGenes.i" is not going to give me the data
>>>frame I want, which in the example I'm showing would be
>>>"DiffGenes.res.Callus.Explant." How should I name output data frames from
>>>a loop like this (if a loop is even the best way to do this)?
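One possible answer to the sapply question, sketched under assumptions: mget() fetches all the objects named by ls() into one named list, and lapply() then returns one filtered data frame per comparison, so neither assign() nor constructed variable names are needed. The two toy lists and the helpers make.toy and extract.sig are made up for illustration; only the element names follow the thread.

```r
# Two made-up result lists standing in for the real res.* objects
make.toy <- function(p) {
  list(counts   = matrix(seq_along(p), ncol = 1,
                         dimnames = list(paste0("gene", seq_along(p)), "rep1")),
       log.fc   = rep(0.5, length(p)),
       p.values = p,
       q.values = rep(1, length(p)))
}
res.Callus.Explant <- make.toy(c(0.01, 0.40, 0.03))
res.Callus.Regen   <- make.toy(c(0.20, 0.04, 0.90))

# Hypothetical helper: one data frame of significant genes per result list
extract.sig <- function(obj) {
  x <- data.frame(gene.ids = rownames(obj$counts),
                  log.fc   = obj$log.fc,
                  p.value  = obj$p.values,
                  q.value  = obj$q.values,
                  stringsAsFactors = FALSE)
  subset(x, p.value < 0.05 | q.value <= 0.1)
}

# mget() turns the vector of names into a named list of the objects
all.comps <- c("res.Callus.Explant", "res.Callus.Regen")  # or ls(pattern = "^res")
res.list  <- mget(all.comps)

# lapply() keeps the names, so each result is looked up by comparison
DiffGenes <- lapply(res.list, extract.sig)
names(DiffGenes)
DiffGenes[["res.Callus.Explant"]]$gene.ids
```

Each result is then available as DiffGenes[["res.Callus.Explant"]] and so on, and the whole list can be passed to further lapply() calls.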
>>>
>>>Any help with this will be greatly appreciated.
>>>
>>>--Kelly V.
>>>
>>>______________________________________________
>>>R-help at r-project.org mailing list
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide
>>>http://www.R-project.org/posting-guide.html
>>>and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
>--
>Jim Holtman
>Data Munger Guru
>
>What is the problem that you are trying to solve?
>Tell me what you want to do, not how you want to do it.