[R] Automatic formula creation

peter dalgaard pdalgd at gmail.com
Fri Aug 9 16:00:05 CEST 2013


On Aug 9, 2013, at 13:26 , Rui Barradas wrote:

> Hello,
> 
> Maybe the following gives you some idea on how to vary the terms.
> 
> idx <- 1:5  # or any other indexes
> ftext <- paste(terms[idx], collapse = ' * ')


You're not the first to use this sort of technique; it also happens in various parts of R's own internals. But handling R expressions via their textual representation is really not a good principle (see fortune("rethink")), and it _does_ give rise to problems.

I much prefer techniques like this:

> nm <- lapply(letters[1:6], as.name)
> Reduce(function(a,b) bquote(.(a)*.(b)), nm)
a * b * c * d * e * f
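
To make the "problems" concrete, here is a minimal sketch of one way the text route can break while the symbol route keeps working. The variable names (`vars`, `z`, `"x 1"`) are made up for illustration and are not from the original post; the assumption is simply that some column name is not a syntactic R name.

```r
# Hypothetical predictor names; "x 1" is deliberately not a syntactic R name.
vars <- c("x 1", "y")

# Text route: the pasted string "z ~ x 1 * y" is not valid R, so parsing fails.
txt <- paste("z ~", paste(vars, collapse = " * "))
res_text <- try(as.formula(txt), silent = TRUE)
text_failed <- inherits(res_text, "try-error")

# Language route: as.name() makes a symbol from any string, no parsing needed.
nm <- lapply(vars, as.name)
trm <- Reduce(function(a, b) bquote(.(a) * .(b)), nm)
f_lang <- formula(bquote(z ~ .(trm)))

text_failed       # TRUE
all.vars(f_lang)  # "z" "x 1" "y"
```

The text version could be rescued by adding backticks by hand, but then you are writing your own quoting rules, which is exactly the kind of thing that tends to go wrong.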


Similarly, use

> trm <- Reduce(function(a,b) bquote(.(a)*.(b)), nm)
> formula(bquote(I(1 - Pass149) ~  .(trm) - 1))
I(1 - Pass149) ~ a * b * c * d * e * f - 1
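
For the original problem, the same idea might be wrapped up roughly as follows. This is only a sketch under some assumptions: the helper name `make_formula` is hypothetical, only four of the 28 band names are used, and `nbands` is set to 2 to keep the example small. Each name is turned into a `log10()` call before the terms are combined, and `combn()` applies the helper to every combination.

```r
# A small subset of the band names from the original post, for illustration.
bands <- c("kHz4_ave", "kHz5_ave", "kHz6_3_ave", "kHz8_ave")
nbands <- 2  # any number of terms works; nothing below depends on it

# Hypothetical helper: build the formula for one combination of band names.
make_formula <- function(vars) {
  # Wrap each name in a log10() call, working on language objects throughout.
  trms <- lapply(vars, function(v) bquote(log10(.(as.name(v)))))
  # Fold the terms into a single product: t1 * t2 * ...
  rhs <- Reduce(function(a, b) bquote(.(a) * .(b)), trms)
  formula(bquote(I(1 - Pass149) ~ .(rhs) - 1))
}

# One formula per combination, returned as a list.
forms <- combn(bands, nbands, FUN = make_formula, simplify = FALSE)

length(forms)  # 6, i.e. choose(4, 2)
forms[[1]]
```

Each element of `forms` could then be passed to lm() in place of `forml` in the loop from the original post, and changing `nbands` requires no edits to the formula-building code.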


> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
> 
> Em 09-08-2013 11:40, Alex van der Spek escreveu:
>> Say I want to compare all 5 term models from a choice of 28 different
>> predictors and one known. All possible combinations of 5 out of 28 is
>> easy to form by combn(). With some string manipulation it is also easy
>> to make a text representation of a formula which is easy to convert by
>> as.formula() for use in lm().
>> 
>> The primitive part however is pasting together the terms which I do
>> explicitly for 5 terms, like so:
>> 
>> 
>>     ftext <- paste(terms[1], terms[2], terms[3], terms[4], terms[5],
>> sep = ' * ')
>> 
>> 
>> Works but is not great as I now need to edit this formula when the
>> number of terms changes. There ought to be a better way but I can't find
>> it.
>> 
>> Any help much appreciated! The full block of relevant code follows:
>> Alex van der Spek
>> 
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 
>> 
>> #Try all 5 band models
>> nbands <- 5
>> freqs <- c('4', '5', '6_3', '8', '10', '12_7', '16', '20', '25', '32',
>> '40', '51', '64', '81', '102', '128',
>>            '161', '203', '256', '323', '406', '512', '645', '813',
>> '1024', '1290', '1625', '2048')
>> bands <- paste(rep('kHz', 28), freqs, rep('_ave', 28), sep = '')
>> nc <- choose(28, nbands)
>> combs <- t(combn(bands, nbands))
>> 
>> models <- vector("list", nc)
>> for (ic in 1:nc) {
>>     terms <- c()
>>     for (jc in 1:nbands) {
>>         t <- paste('log10(', combs[ic, jc], ')', sep = '')
>>         terms <- append(terms, t)
>>     }
>> 
>>     ftext <- paste(terms[1], terms[2], terms[3], terms[4], terms[5],
>> sep = ' * ')
>> 
>>     ftext <- paste('I(1 - Pass149) ~ ', ftext, ' - 1', sep = '')
>>     forml <- as.formula(ftext)
>> 
>>     plus100.lm <- lm(forml, data = sd, subset = Use == 'Cal')
>>     plus100.sm <- step(plus100.lm, trace = 0)
>> }
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


