[Rd] improved pairs.formula?

Berwin A Turlach berwin at maths.uwa.edu.au
Tue Mar 29 10:00:23 CEST 2005


Dear Brian,

>>>>> "BDR" == Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:

    BDR> On Tue, 29 Mar 2005, Berwin A Turlach wrote:
    >> Dear all,
    >> 
    >> I would like to suggest changing the pairs.formula command such
    >> that a command like
    >> [...]

    BDR> Perhaps you could explain what precisely 'the job' is
Sorry for not making it clear enough; it must be that English is my
second language.  What I meant is that, with the above definition of
pairs.formula, a command like
        pairs(GNP ~ . - Year - GNP.deflator, longley)
would produce a pairwise scatterplot of only GNP and the other
variables in the data frame, excluding Year and GNP.deflator.  And,
incidentally, so would 
        pairs( ~ . - Year - GNP.deflator, longley)
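
As a minimal sketch of which terms these two formulas resolve to, one
can inspect the terms object directly.  This assumes only the longley
data frame that ships with R; the names tt_resp and tt_plain are
illustrative and are not taken from the proposed pairs.formula.

```r
# longley ships with R (package datasets).
# tt_resp / tt_plain are illustrative names, not part of any proposal.
tt_resp  <- terms(GNP ~ . - Year - GNP.deflator, data = longley)
tt_plain <- terms(    ~ . - Year - GNP.deflator, data = longley)

# Terms that survive once '.' is expanded against the data frame
# and the '-' terms are removed.
labels_resp  <- attr(tt_resp,  "term.labels")
labels_plain <- attr(tt_plain, "term.labels")

print(labels_resp)    # right-hand side only; the response is tracked separately
print(labels_plain)   # no response, so GNP shows up as an ordinary term
```

With a response, '.' expands to every column except the response, so
GNP is not among the term labels; without one, GNP is simply one more
term.  Either way the same five variables are involved.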

    BDR> and why you chose such an unusual piece of code to do it?
Mmh, I don't think of it as being so unusual; most of it was gleaned
from other R functions.  Well, I realise that R programming paradigms
change over the years, so I must have gotten them from quite old
routines.

    BDR> (E.g. what is the prescription for the ordering of terms, and
    BDR> why do you think the rownames of the factors and the
    BDR> variables might be in different orders?  They are set the
    BDR> same in the C code.)
In case a user foolishly specifies a more complicated formula without
having read the help pages.  It seemed to me that this was the only
construct for figuring out which variables actually appear in the
terms of the formula.
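
A rough sketch of that idea, for illustration only: it is not the code
from the original proposal (which is elided above), and the names fac,
used, resp and keep are made up here.  It assumes the "factors"
attribute of a terms object, which has variables in rows and terms in
columns.

```r
tt  <- terms(GNP ~ . - Year - GNP.deflator, data = longley)
fac <- attr(tt, "factors")   # variables in rows, terms in columns

# A variable is "used" if at least one surviving term has a nonzero
# entry in its row.
used <- rownames(fac)[rowSums(fac) > 0]

# The response has no term column of its own, so look it up separately
# (attr(tt, "response") is 0 when the formula has no response).
resp <- if (attr(tt, "response") > 0) rownames(fac)[attr(tt, "response")] else NULL

keep <- c(resp, used)
print(keep)

# Plot only when run interactively, so the sketch has no side effects.
if (interactive()) pairs(longley[, keep])
```

Matching on the rows of the factor matrix rather than on the term
labels is what lets this cope with interaction terms such as a:b,
where the label is not itself a variable name.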

    BDR> BTW, the help page specifically warns against a formula of
    BDR> the type you specified.
Does it?  I searched the help page (R 2.0.1) for `.' and couldn't find
anything about not using it in a formula.  But then again, perhaps I
missed something, with English being my second language...

    BDR> Why do you want to allow a response?
Why not?  The current version allows it too.  At least the help page
in my version of R states:

        (A response will be interpreted as another variable, but not
        treated specially, so it is confusing to use one.)

Actually, I noticed that the response variable will always be in the
top row, so it is treated specially.  Try it out and compare:

> pairs(longley)
> pairs(GNP ~ ., longley)
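
A small sketch of why the response ends up in the top row, under the
assumption that the formula method builds its data via model.frame()
(which puts the response in the first column); the name mf is
illustrative.

```r
# Column order of the raw data frame versus the model frame built
# from a formula with a response.
mf <- model.frame(GNP ~ ., data = longley)

print(names(longley))  # original column order of the data frame
print(names(mf))       # model frame: response column comes first
```

Since pairs() lays out the panels in column order, whichever variable
comes first in the model frame occupies the top row.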


    BDR> Currently only '+' is documented to work.
The help page for pairs in my version of R states:

 formula: a formula, such as '~ x + y + z'.  Each term will give a
          separate variable in the pairs plot, so terms should be
          numeric vectors.  (A response will be interpreted as another
          variable, but not treated specially, so it is confusing to
          use one.)

I don't interpret this as "only '+' is documented to work".  For me
this is just an example of an allowed formula, not an exhaustive
listing of the formulas that are allowed.  But then, at the risk of
repeating myself, English is only my second language....

Best wishes,

        Berwin

========================== Full address ============================
Berwin A Turlach                      Tel.: +61 (8) 6488 3338 (secr)   
School of Mathematics and Statistics        +61 (8) 6488 3383 (self)      
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway                   
Crawley WA 6009                e-mail: berwin at maths.uwa.edu.au
Australia                        http://www.maths.uwa.edu.au/~berwin


