[R] remove all terms with interaction factor in formula
William Dunlap
wdunlap at tibco.com
Thu Sep 13 20:53:01 CEST 2012
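Your grep-based method would not work for, e.g., "a:d": the terms "a:b:d", "a:c:d",
and "a:b:c:d" all involve both a and d but do not contain the literal substring "a:d".
Instead, you could look at the "factors" attribute of a terms object and drop those
columns that have non-zero entries for every variable in the interaction of interest. E.g.,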
> fm <- attr(terms(~a*b*c*d), "factors")
> fm
  a b c d a:b a:c b:c a:d b:d c:d a:b:c a:b:d a:c:d b:c:d a:b:c:d
a 1 0 0 0   1   1   0   1   0   0     1     1     1     0       1
b 0 1 0 0   1   0   1   0   1   0     1     1     0     1       1
c 0 0 1 0   0   1   1   0   0   1     1     0     1     1       1
d 0 0 0 1   0   0   0   1   1   1     0     1     1     1       1
> colnames(fm)[fm["b",]==0 | fm["c",]==0]
[1] "a" "b" "c" "d" "a:b" "a:c" "a:d" "b:d" "c:d" "a:b:d" "a:c:d"
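
To turn the surviving labels back into something lm() accepts, the selection above can be wrapped in a small function. This is only a sketch: the helper name drop_interaction and its arguments are made up for illustration, and it assumes every variable in `vars` appears in the formula and that at least one term survives (reformulate() needs a non-empty label vector).

```r
# Hypothetical helper (not from the original post): drop every term that
# involves ALL of the variables in `vars`, then rebuild a formula.
drop_interaction <- function(formula, vars) {
  tt <- terms(formula)
  fm <- attr(tt, "factors")
  # A term contains the interaction when each variable in `vars`
  # has a non-zero entry in that term's column of the factors matrix.
  has_all <- colSums(fm[vars, , drop = FALSE] != 0) == length(vars)
  keep <- colnames(fm)[!has_all]
  # Carry over the left-hand side (if any) and the intercept setting.
  response <- if (length(formula) == 3L) formula[[2]] else NULL
  reformulate(keep, response = response,
              intercept = attr(tt, "intercept") == 1,
              env = environment(formula))
}

# Example data in the shape used by the original poster.
my_df <- data.frame(iv = runif(100), a = runif(100), b = runif(100),
                    c = runif(100), d = runif(100))

# Drop all terms containing the b:c interaction, then fit.
f_bc <- drop_interaction(iv ~ a*b*c*d, c("b", "c"))
fit  <- lm(f_bc, data = my_df)

# The "a:d" case, where substring matching on the labels misses
# a:b:d, a:c:d and a:b:c:d.
f_ad <- drop_interaction(iv ~ a*b*c*d, c("a", "d"))
attr(terms(f_ad), "term.labels")
```

Working from the factors matrix rather than the label strings means the result does not depend on how the variables happen to be ordered or separated inside each label.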
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of David Winsemius
> Sent: Thursday, September 13, 2012 11:28 AM
> To: Bert Gunter
> Cc: Alexander Shenkin; r-help at r-project.org
> Subject: Re: [R] remove all terms with interaction factor in formula
>
>
> On Sep 13, 2012, at 11:00 AM, Bert Gunter wrote:
>
> > ~ a*b*d + a*c*d
>
> That seemed pretty clear and obvious, but I started wondering how to tell the machine to
> do it. Here is another idea:
>
> > grep("b:c", attr(terms(~a*b*c*d), "term.labels" ) ,invert=TRUE, value=TRUE)
> [1] "a" "b" "c" "d" "a:b" "a:c" "a:d" "b:d" "c:d" "a:b:d" "a:c:d"
>
> (Although I realize it's no longer a formula and might need to be reassembled with `paste`
> and `as.formula`.)
>
> --
> David.
>
> > -- Bert
> > On Thu, Sep 13, 2012 at 10:49 AM, Alexander Shenkin <ashenkin at ufl.edu> wrote:
> >> Hi Folks,
> >>
> >> I'm trying to find a way to remove all terms in a formula that contain a
> >> particular interaction.
> >>
> >> For example, in the formula below, I'd like to remove all terms that
> >> contain the b:c interaction.
> >>
> >>> attributes(terms( ~ a*b*c*d))$term.labels
> >> [1] "a" "b" "c" "d" "a:b" "a:c"
> >> [7] "b:c" "a:d" "b:d" "c:d" "a:b:c" "a:b:d"
> >> [13] "a:c:d" "b:c:d" "a:b:c:d"
> >>
> >> My eventual use is to fit models with the reduced formulas.
> >>
> >> For example:
> >>> my_df = data.frame( iv = runif(100), a=runif(100), b=runif(100),
> >> c=runif(100), d=runif(100))
> >>> lm(iv ~ a*b*c*d, data=my_df)
> >>
> >> I can remove particular terms with update(), but I don't see a way to
> >> remove all terms that contain a given combination of factors.
> >>
> >> Any help would be greatly appreciated. Thanks!
> >>
> >> Allie
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
> > --
> >
> > Bert Gunter
> > Genentech Nonclinical Biostatistics
> >
> > Internal Contact Info:
> > Phone: 467-7374
> > Website:
> > http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> >
>
> David Winsemius, MD
> Alameda, CA, USA
>