[Rd] Documentation examples for lm and glm

Heinz Tuechler tuechler @ending from gmx@@t
Mon Dec 17 16:19:26 CET 2018


Dear All,

do you think that use of a data argument is best practice in the example 
below?

regards,

Heinz

### trivial example
plotwithline <- function(x, y) {
     plot(x, y)
     abline(lm(y~x)) ## data argument?
}

set.seed(25)
df0 <- data.frame(x=rnorm(20), y=rnorm(20))

plotwithline(df0[['x']], df0[['y']])
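
For comparison, here is a minimal sketch of the same helper written around the data argument. The name plotwithline_df and the returned fit are illustrative assumptions, not part of the example above. The function takes the data frame itself, so plot() and lm() both look up x and y in it rather than in the calling environment.

```r
### hypothetical variant: pass the data frame, not the bare vectors
plotwithline_df <- function(data) {
     plot(y ~ x, data = data)         ## formula method of plot()
     fit <- lm(y ~ x, data = data)    ## x and y are looked up in 'data'
     abline(fit)
     invisible(fit)                   ## hand the fit back for later use
}

set.seed(25)
df0 <- data.frame(x = rnorm(20), y = rnorm(20))

fit0 <- plotwithline_df(df0)
```

The trade-off is that this version hard-codes the column names x and y, whereas the vector version accepts any two numeric vectors.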



Fox, John wrote on 17.12.2018 15:21:
> Dear Martin,
>
> I think that everyone agrees that it’s generally preferable to use the data argument to lm() and I have nothing significant to add to the substance of the discussion, but I think that it’s a mistake not to add to the current examples, for the following reasons:
>
> (1) Relegating examples using the data argument to “see also” doesn’t suggest that using the argument is a best practice. Most users won’t bother to click the links.
>
> (2) In my opinion, a new initial example using the data argument would more clearly suggest that this is normally the best option.
>
> (3) I think that it would also be desirable to add a remark to the explanation of the data argument, something like, “Although the argument is optional, it's generally preferable to specify it explicitly.” And similarly on the help page for glm().
>
> My two (or three) cents.
>
> John
>
>   -------------------------------------------------
>   John Fox, Professor Emeritus
>   McMaster University
>   Hamilton, Ontario, Canada
>   Web: http://socserv.mcmaster.ca/jfox
>
>> On Dec 17, 2018, at 3:05 AM, Martin Maechler <maechler using stat.math.ethz.ch> wrote:
>>
>>>>>>> David Hugh-Jones
>>>>>>>    on Sat, 15 Dec 2018 08:47:28 +0100 writes:
>>
>>> I would argue examples should encourage good
>>> practice. Beginners ought to learn to keep data in data
>>> frames and not to overuse attach().
>>
>> Note there's no attach() there in any of these examples!
>>
>>> otherwise at their own risk, but they have less need of
>>> explicit examples.
>>
>> The glm examples are nice insofar as they show both uses.
>>
>> I agree the lm() example(s) are  "didactically misleading" by
>> not using data frames at all.
>>
>> I disagree that only data frame examples should be shown.
>> If  lm()  is one of the first R functions a beginneR must use --
>> because they are in a basic stats class, say --  it may be
>> *better* didactically to focus on lm()  in the very first
>> example, and use data frames in a next one ...
>> .... and instead of next one, we have the pretty clear comment
>>
>>  ### less simple examples in "See Also" above
>>
>> I'm not convinced (but you can try more) we should change those
>> examples or add more there.
>>
>> Martin
>>
>>> On Fri, 14 Dec 2018 at 14:51, S Ellison
>>> <S.Ellison using lgcgroup.com> wrote:
>>
>>>> FWIW, before all the examples are changed to data frame
>>>> variants, I think there's fairly good reason to have at
>>>> least _one_ example that does _not_ place variables in a
>>>> data frame.
>>>>
>>>> The data argument in lm() is optional. And there is more
>>>> than one way to manage data in a project. I personally
>>>> don't much like lots of stray variables lurking about,
>>>> but if those are the only variables out there and we can
>>>> be sure they aren't affected by other code, it's hardly
>>>> essential to create a data frame to hold something you
>>>> already have.  Also, attach() is still part of R, for
>>>> those folk who have a data frame but want to reference
>>>> the contents across a wider range of functions without
>>>> using with() a lot. lm() can reasonably omit the data
>>>> argument there, too.
>>>>
>>>> So while there are good reasons to use data frames, there
>>>> are also good reasons to provide examples that don't.
>>>>
>>>> Steve Ellison
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: R-devel [mailto:r-devel-bounces using r-project.org] On Behalf Of Ben Bolker
>>>>> Sent: 13 December 2018 20:36
>>>>> To: r-devel using r-project.org
>>>>> Subject: Re: [Rd] Documentation examples for lm and glm
>>>>>
>>>>>
>>>>> Agree.  Or just create the data frame with those variables in it
>>>>> directly ...
>>>>>
>>>>> On 2018-12-13 3:26 p.m., Thomas Yee wrote:
>>>>>> Hello,
>>>>>>
>>>>>> something that has been on my mind for a decade or two has
>>>>>> been the examples for lm() and glm(). They encourage poor style
>>>>>> because of mismanagement of data frames. Also, having the
>>>>>> variables in a data frame means that predict()
>>>>>> is more likely to work properly.
>>>>>>
>>>>>> For lm(), the variables should be put into a data frame.
>>>>>> As 2 vectors are assigned first in the general workspace they
>>>>>> should be deleted afterwards.
>>>>>>
>>>>>> For the glm(), the data frame d.AD is constructed but not used. Also,
>>>>>> its 3 components were assigned first in the general workspace, so they
>>>>>> float around dangerously afterwards like in the lm() example.
>>>>>>
>>>>>> Rather than attached improved .Rd files here, they are put at
>>>>>> www.stat.auckland.ac.nz/~yee/Rdfiles
>>>>>> You are welcome to use them!
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Thomas
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-devel using r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
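
Thomas Yee's remark in the thread above, that predict() is more likely to work properly when the variables live in a data frame, can be illustrated with a short sketch. The data and the names d, fit_df, fit_vec and new_d are made up for illustration and do not come from the lm() or glm() help pages.

```r
set.seed(1)
d <- data.frame(x = 1:10)
d$y <- 2 + 3 * d$x + rnorm(10)

## fit with the data argument: the predictor term is called 'x'
fit_df <- lm(y ~ x, data = d)

## fit on extracted columns: the predictor term is called 'd$x'
fit_vec <- lm(d$y ~ d$x)

new_d <- data.frame(x = c(11, 12))

## two predictions, one per row of new_d
p_df <- predict(fit_df, newdata = new_d)

## new_d has no column matching 'd$x', so predict() falls back to the
## original d$x and returns 10 fitted values, with a warning
p_vec <- suppressWarnings(predict(fit_vec, newdata = new_d))

length(p_df)
length(p_vec)
```

Without suppressWarnings(), the second call warns that 'newdata' had 2 rows but the variables found have 10 rows. That warning is easy to miss in a longer script.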


