[R] Filling Lists or Arrays of variable dimensions

William Dunlap wdunlap at tibco.com
Thu Dec 20 19:38:25 CET 2012


Also note that a column of a data.frame can be a list of complicated things.
E.g.,


> d <- expand.grid(am=c(0, 1), gear=c(3,4,5))
> d$results <- I(lapply(seq_len(nrow(d)), function(i)try(lm(mpg~wt, subset=gear==d$gear[i] & am==d$am[i], data=mtcars))))
> d[ d$am==1 & d$gear==5, "results" ]
[[1]]

Call:
lm(formula = mpg ~ wt, data = mtcars, subset = gear == d$gear[i] & 
    am == d$am[i])

Coefficients:
(Intercept)           wt  
     42.563       -8.046  

The standard printout of the data.frame doesn't look nice, but you can still get at the
information.
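
As a minimal sketch of getting at that information (the names ok and slope
are introduced here for illustration only), one can flag the rows where lm()
succeeded and pull one number out of each stored fit. Combinations with no
matching cars in mtcars make lm() fail, which is why the fits are wrapped in
try():

```r
# Rebuild the example: one lm fit (or a try-error) per am/gear combination.
d <- expand.grid(am = c(0, 1), gear = c(3, 4, 5))
d$results <- I(lapply(seq_len(nrow(d)), function(i)
  try(lm(mpg ~ wt, subset = gear == d$gear[i] & am == d$am[i], data = mtcars),
      silent = TRUE)))

# Rows whose fit succeeded (empty subsets produce "try-error" objects).
ok <- !vapply(d$results, inherits, logical(1), what = "try-error")

# Extract the wt slope from each successful fit; failed rows stay NA.
d$slope <- NA_real_
d$slope[ok] <- vapply(d$results[ok], function(fit) coef(fit)[["wt"]], numeric(1))

# The simple columns print cleanly even though the list column does not.
print(d[, c("am", "gear", "slope")])
```

The list column stays in the data.frame, so the full model objects remain
available via d$results[[i]] alongside the extracted summary.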

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
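
On the main question in the message quoted below (building the nested list
from whatever parameter vectors are supplied, without writing a new loop each
time), one possible approach is a small recursive helper. This is only a
sketch: build_nested is a hypothetical name, it assumes the parameter values
are character strings usable as list names, and the paste() call stands in
for doSomething() as in the earlier reply.

```r
# Hypothetical helper: builds one list level per parameter, recursively.
# params:  named list of character vectors (one entry per parameter)
# fun:     called once per combination with a named character vector
# current: the combination accumulated so far (used internally)
build_nested <- function(params, fun, current = character(0)) {
  if (length(params) == 0) {
    return(fun(current))            # leaf: compute the result
  }
  values <- params[[1]]
  level <- lapply(values, function(v) {
    build_nested(params[-1], fun, c(current, setNames(v, names(params)[1])))
  })
  setNames(level, values)           # name this level by the parameter values
}

params <- list(height = c("high", "low"), width = c("slim", "wide"))
l <- build_nested(params, function(p) paste(p, collapse = "/"))

# Every path now exists, so a row of the grid can be used as an index.
grid <- expand.grid(params, stringsAsFactors = FALSE)
key <- unlist(grid[1, ], use.names = FALSE)   # c("high", "slim")
l[[key]] <- 1
```

Adding a third parameter only means adding another entry to params; the
helper itself does not need to change.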


> -----Original Message-----
> From: Jessica Streicher [mailto:j.streicher at micromata.de]
> Sent: Thursday, December 20, 2012 10:01 AM
> To: William Dunlap
> Cc: Chris Campbell; R help
> Subject: Re: [R] Filling Lists or Arrays of variable dimensions
> 
> Really must have been unclear at some point, sorry.
> 
> William, that's interesting, but not really helping the main problem, which is: how to do
> 
> > l[[ as.character(grid[1, ]) ]] <- 1
> 
> without having initialized the list in the loop before.
> 
> Well, or how to initialize it without having to do the loop thing, because the loop stuff
> can only be done for a specific set of parameter vectors. But those change, and i don't
> want to have to write another loop construct every time for the new version.
> 
> I want to say: hey, i have these vectors here with these values (my parameters), could
> you build me that nested list structure (tree - whatever) from it? And the function will
> give me that structure whatever i give it without me needing to intervene in form of
> changing the code.
> 
> -------------- Clarification -----------------
> 
> First: i am not computing statistics over the parameters. I'm computing stuff from other
> data, and the computation is affected by the parameters.
> 
> I am computing classifiers for different sets of parameters for those classifiers. So the
> result of doSomething() isn't a simple value. It's usually a list of 6 lists (doing cross
> validation), which in turn have the classifier object, some statistics of the classifier (e.g.
> what was misclassified), and the subsets of data used in them.
> That doesn't really fit in a data.frame, hence the use of lists. I want the nested lists
> because it helps me find stuff in the object browser faster, and because all my other code
> is already geared towards it. If i had the time i might still go for a flat structure that
> everyone keeps telling me to use (got a few mails off the list),
> but i really haven't the time.
> 
> If there's no good way i'll just keep things as they are now.
> 
> 
> On 20.12.2012, at 18:37, William Dunlap wrote:
> 
> > Arranging data as a list of lists of lists of lists [...] of scalar values generally
> > will lead to slow and hard-to-read R code, mainly because R is designed to
> > work on long vectors of simple data.  If you were to start over, consider constructing
> > a data.frame with one column for each attribute.  Then tools like aggregate and
> > the plyr functions would be useful.
> >
> > However, your immediate problem may be solved by creating your 'grid' object
> > as a data.frame of character, not factor, columns because as.character works
> > differently on lists of scalar factors and lists of scalar characters.
> > Usually as.<mode>(x), when
> > x is a list of length-1 items, gives the same result as as.<mode>(unlist(x)), but not when
> > x is a list of length-1 factors:
> >
> >> height<-c("high", "low")
> >> width<-c("slim", "wide")
> >> gridF <- expand.grid(height, width, stringsAsFactors=FALSE)
> >> gridT <- expand.grid(height, width, stringsAsFactors=TRUE)
> >> as.character(gridF[1,])
> >  [1] "high" "slim"
> >> as.character(gridT[1,])
> >  [1] "1" "1"
> >> as.character(unlist(gridT[1,])) # another workaround
> >  [1] "high" "slim"
> >
> > Your example was not self-contained so I changed the call to doSomething() to
> > paste(h,w,sep="/"):
> >
> >  height<-c("high", "low")
> >  width<-c("slim", "wide")
> >
> >  l <- list()
> >  for(h in height){
> >          l[[h]] <- list()
> >          for(w in width){
> >                  l[[h]][[w]] <- paste(h, w, sep="/") # doSomething()
> >          }
> >  }
> >
> >  grid <- expand.grid(height, width, stringsAsFactors=FALSE)
> >  as.character(grid[1,])
> >  # [1] "high" "slim", not the [1] "1" "1" you get with stringsAsFactors=TRUE
> >  l[[ as.character(grid[1, ]) ]]
> >  # [1] "high/slim"
> >  l[[ as.character(grid[1, ]) ]] <- 1
> >  l[[ as.character(grid[1, ]) ]]
> >  # [1] 1
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> >> Of Jessica Streicher
> >> Sent: Thursday, December 20, 2012 8:43 AM
> >> To: Chris Campbell
> >> Cc: R help
> >> Subject: Re: [R] Filling Lists or Arrays of variable dimensions
> >>
> >> Aggregate is highly confusing (and i would have appreciated if you used my example
> >> instead, i don't get it to do anything sensible on my stuff).
> >>
> >> And this seems not what i asked for anyway. This may be a named list but not named
> >> and structured as i want it at all.
> >>
> >> happy Christmas too
> >>
> >> On 20.12.2012, at 15:48, Chris Campbell wrote:
> >>
> >>> Dear Jessica
> >>>
> >>> Aggregate is a function that allows you to perform loops across multiple variables.
> >>>
> >>> tempData <- data.frame(height = rnorm(20, 100, 10),
> >>>   width = rnorm(20, 50, 5),
> >>>   par1 = rnorm(20))
> >>>
> >>> tempData$htfac <- cut(tempData$height, c(0, 100, 200))
> >>> tempData$wdfac <- cut(tempData$width, c(0, 50, 100))
> >>>
> >>> doSomething <- function(x) { mean(x) }
> >>>
> >>> out <- aggregate(tempData["par1"], tempData[c("htfac", "wdfac")], doSomething)
> >>>
> >>> # out is a data frame; this is a named list.
> >>> # use as.list to remove the data.frame class
> >>>
> >>>> as.list(out)
> >>>
> >>> $htfac
> >>> [1] (0,100]   (100,200] (0,100]   (100,200]
> >>> Levels: (0,100] (100,200]
> >>>
> >>> $wdfac
> >>> [1] (0,50]   (0,50]   (50,100] (50,100]
> >>> Levels: (0,50] (50,100]
> >>>
> >>> $par1
> >>> [1] -1.0449563 -0.3782483 -0.9319105  0.8837459
> >>>
> >>>
> >>> I believe you are seeing an error similar to this one:
> >>>
> >>>> out[[1:3]] <- 1
> >>> Error in `[[<-`(`*tmp*`, i, value = value) :
> >>> recursive indexing failed at level 2
> >>>
> >>> This is because double square brackets for lists can only set a single list element at
> >>> once; grid[1, ] is longer.
> >>>
> >>>
> >>> Happy Christmas
> >>>
> >>> Chris
> >>>
> >>>
> >>> Chris Campbell
> >>> Tel. +44 (0) 1249 705 450 | Mobile. +44 (0) 7929 628 349
> >>> mailto:ccampbell at mango-solutions.com | http://www.mango-solutions.com
> >>> Mango Solutions
> >>> 2 Methuen Park
> >>> Chippenham
> >>> Wiltshire
> >>> SN14 OGB
> >>> UK
> >>>
> >>> -----Original Message-----
> >>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> >>> Of Jessica Streicher
> >>> Sent: 20 December 2012 12:46
> >>> To: R help
> >>> Subject: [R] Filling Lists or Arrays of variable dimensions
> >>>
> >>> Following problem:
> >>>
> >>> Say you have a bunch of parameters and want to produce results for all
> >>> combinations of those:
> >>>
> >>> height<-c("high","low")
> >>> width<-c("slim","wide")
> >>>
> >>> then what i used to do was something like this:
> >>>
> >>> l<-list()
> >>> for(h in height){
> >>> 	l[[h]]<-list()
> >>> 	for(w in width){
> >>> 		l[[h]][[w]] <- doSomething()
> >>> 	}
> >>> }
> >>>
> >>> Now those parameters aren't always the same. Their number can change and the
> >>> number of entries can change, and i'd like to have one code that can handle all
> >>> configurations.
> >>>
> >>> Now i thought i could use expand.grid() to get all configurations, and then iterate
> >>> over the rows, but the problem then is that i cannot set the values in the list like above.
> >>>
> >>> grid<-expand.grid(height,width)
> >>> l[[as.character(grid[1,])]] <-1
> >>> Error in `[[<-`(`*tmp*`, as.character(grid[1, ]), value = 1) :
> >>> no such index at level 1
> >>>
> >>> This will only work if the "path" for that is already existent, and i'm not sure how to
> >>> build that in this scenario.
> >>>
> >>> I then went on and built an array instead of lists of lists, but that doesn't help either
> >>> because i can't access the array with what i have in the grid's row - or at least i don't
> >>> know how.
> >>>
> >>> Any ideas?
> >>>
> >>> I'd prefer to keep the named lists since all other code is built towards this.
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.