[Rd] improved pairs.formula?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue Mar 29 10:09:57 CEST 2005
On Tue, 29 Mar 2005, Berwin A Turlach wrote:
> Dear Brian,
>
>>>>>> "BDR" == Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:
>
> BDR> On Tue, 29 Mar 2005, Berwin A Turlach wrote:
> >> Dear all,
> >>
> >> I would like to suggest changing the pairs.formula command such
> >> that a command like
> >> [...]
>
> BDR> Perhaps you could explain what precisely 'the job' is
> Sorry for not making it clear enough; it must be that English is my
> second language. What I meant is that, with the above definition of
> pairs.formula, a command like
> pairs(GNP ~ . - Year - GNP.deflator, longley)
> would only produce a pairwise scatterplot of GNP and all other
> variables in the dataframe except for Year and GNP.deflator. And,
> incidentally, so would
> pairs( ~ . - Year - GNP.deflator, longley)
Still not a description, just two examples. Since later on you took a
description to be an example, I see the confusion.
> BDR> and why you chose such an unusual piece of code to do it?
> Mmh, I don't think of it as being so unusual; most of it was gleaned
> from other R functions. Well, I realise that R programming paradigms
> change over the years, so I must have gotten them from quite old
> routines.
I guess you got it from an S not R function.
> BDR> (E.g. what is the prescription for the ordering of terms, and
> BDR> why do you think the rownames of the factors and the
> BDR> variables might be in different orders? They are set the
> BDR> same in the C code.)
> In case a user, not having read the help pages, foolishly specifies a
> more complicated formula. It seemed to me that this was the only
> construct to figure out which variables actually appear in the terms
> of the formula.
Really? Please check what I wrote: `variables' and `rownames of the
factors' are always the same, apart from the response. Please show an
example where you got something different.
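One way to check this claim is to print the relevant attributes of the terms object side by side. This is a minimal sketch, assuming a version of R in which terms() accepts a data argument to expand `.'; the object names (tt, vars, fac_rows, kept) are illustrative and do not come from the proposed pairs.formula code.

```r
# Sketch: inspect the terms object for the formula under discussion.
# 'longley' ships with base R (package 'datasets').
tt <- terms(GNP ~ . - Year - GNP.deflator, data = longley)

# Names recorded in the "variables" attribute (drop the leading `list`).
vars <- vapply(as.list(attr(tt, "variables"))[-1], deparse, character(1))

# Row names of the "factors" matrix.
fac_rows <- rownames(attr(tt, "factors"))

# Labels of the terms that remain after the '-' operators are applied.
kept <- attr(tt, "term.labels")

print(vars)      # variables mentioned anywhere in the formula
print(fac_rows)  # compare against 'vars'
print(kept)      # terms that would remain in the model
```

Comparing vars with fac_rows shows whether the two can ever differ in order, while kept lists only the terms left once Year and GNP.deflator have been subtracted.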
> BDR> BTW, the help page specifically warns against a formula of
> BDR> the type you specified.
> Does it? I searched the help page (R 2.0.1) for `.' and couldn't find
> anything about not using it in a formula. But then, again, perhaps I
> missed something with English being my second language......
>
> BDR> Why do you want to allow a response?
> Why not, the current version allows it too. At least the help page in
> my version of R states:
>
> (A response will be interpreted as another variable, but not
> treated specially, so it is confusing to use one.)
>
> Actually, I noticed that the response variable will always be in the
> top row, so it is treated specially. Try it out and compare:
>
>> pairs(longley)
>> pairs(GNP ~ ., longley)
Yes, but those are different pairs() methods. It is the same as
pairs(~ GNP + ., longley), so it is not treated specially.
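The equivalence can be checked without drawing anything by comparing the column order of the two model frames. This is a sketch under the assumption that pairs.formula plots the model frame built from its formula; the object names are illustrative.

```r
# Sketch: compare the variable order implied by the two formulas.
# Assumption: the formula method of pairs() plots the columns of the
# model frame in the order in which model.frame() returns them.
mf_response <- model.frame(GNP ~ ., data = longley)
mf_rhs_only <- model.frame(~ GNP + ., data = longley)

print(names(mf_response))
print(names(mf_rhs_only))

# GNP comes first in both cases simply because it is the first
# variable mentioned in the formula.
print(names(mf_response)[1])
print(names(mf_rhs_only)[1])
```

If the two name vectors agree, the top row in the plot reflects the order of appearance in the formula rather than any special handling of the response.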
> BDR> Currently only '+' is documented to work.
> The help page for pairs in my version of R states:
>
> formula: a formula, such as '~ x + y + z'. Each term will give a
> separate variable in the pairs plot, so terms should be
> numeric vectors. (A response will be interpreted as another
> variable, but not treated specially, so it is confusing to
> use one.)
>
> I don't interpret this as "only '+' is documented to work". For me
> this is just an example of an allowed formula, but not an exhaustive
> listing of the formulas that are allowed. But then, at the risk of
> repeating myself, English is only my second language....
There are several places where only the allowed form of formulae is
specified in this way. You are not allowed interactions, for example, and
it refers to `each term'. `.' is not documented to work (and used not
to).
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595