[Rd] Documentation examples for lm and glm
Fox, John
jfox at mcmaster.ca
Mon Dec 17 16:23:07 CET 2018
Dear Heinz,
----------------------------------------------
> On Dec 17, 2018, at 10:19 AM, Heinz Tuechler <tuechler using gmx.at> wrote:
>
> Dear All,
>
> do you think that use of a data argument is best practice in the example below?
No, but it is *normally* or *usually* the best option, in my opinion.
Best,
John
>
> regards,
>
> Heinz
>
> ### trivial example
> plotwithline <- function(x, y) {
> plot(x, y)
> abline(lm(y~x)) ## data argument?
> }
>
> set.seed(25)
> df0 <- data.frame(x=rnorm(20), y=rnorm(20))
>
> plotwithline(df0[['x']], df0[['y']])
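
For comparison, here is a minimal sketch of how the trivial example above might look with a data argument. The name plotwithline_df and the requirement that the data frame has columns called x and y are assumptions of this sketch, not part of the original function.

```r
# Hypothetical variant of plotwithline(): takes a data frame that is
# assumed to contain columns named x and y.
plotwithline_df <- function(data) {
  fit <- lm(y ~ x, data = data)  # variables are looked up in `data` first
  plot(y ~ x, data = data)       # the formula method of plot() also takes `data`
  abline(fit)
  invisible(fit)                 # return the fit so it can be inspected
}

set.seed(25)
df0 <- data.frame(x = rnorm(20), y = rnorm(20))
fit_df <- plotwithline_df(df0)

# The vector-based call from the original function gives the same coefficients
fit_vec <- lm(df0[["y"]] ~ df0[["x"]])
all.equal(unname(coef(fit_df)), unname(coef(fit_vec)))
```

Whether this is an improvement depends on the caller: the vector interface is more flexible, while the data-frame interface keeps the variable names in the fitted model's formula.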
>
>
>
> Fox, John wrote on 17.12.2018 15:21:
>> Dear Martin,
>>
>> I think that everyone agrees that it’s generally preferable to use the data argument to lm() and I have nothing significant to add to the substance of the discussion, but I think that it’s a mistake not to add to the current examples, for the following reasons:
>>
>> (1) Relegating examples using the data argument to “see also” doesn’t suggest that using the argument is a best practice. Most users won’t bother to click the links.
>>
>> (2) In my opinion, a new initial example using the data argument would more clearly suggest that this is normally the best option.
>>
>> (3) I think that it would also be desirable to add a remark to the explanation of the data argument, something like, “Although the argument is optional, it's generally preferable to specify it explicitly.” And similarly on the help page for glm().
>>
>> My two (or three) cents.
>>
>> John
>>
>> -------------------------------------------------
>> John Fox, Professor Emeritus
>> McMaster University
>> Hamilton, Ontario, Canada
>> Web: http://socserv.mcmaster.ca/jfox
>>
>>> On Dec 17, 2018, at 3:05 AM, Martin Maechler <maechler using stat.math.ethz.ch> wrote:
>>>
>>>>>>>> David Hugh-Jones
>>>>>>>> on Sat, 15 Dec 2018 08:47:28 +0100 writes:
>>>
>>>> I would argue examples should encourage good
>>>> practice. Beginners ought to learn to keep data in data
>>>> frames and not to overuse attach().
>>>
>>> Note there's no attach() there in any of these examples!
>>>
>>>> otherwise at their own risk, but they have less need of
>>>> explicit examples.
>>>
>>> The glm examples are nice insofar as they show both uses.
>>>
>>> I agree the lm() example(s) are "didactically misleading" by
>>> not using data frames at all.
>>>
>>> I disagree that only data frame examples should be shown.
>>> If lm() is one of the first R functions a beginneR must use --
>>> because they are in a basic stats class, say -- it may be
>>> *better* didactically to focus on lm() in the very first
>>> example, and use data frames in a next one ...
>>> .... and instead of next one, we have the pretty clear comment
>>>
>>> ### less simple examples in "See Also" above
>>>
>>> I'm not convinced (but you can try more) we should change those
>>> examples or add more there.
>>>
>>> Martin
>>>
>>>> On Fri, 14 Dec 2018 at 14:51, S Ellison
>>>> <S.Ellison using lgcgroup.com> wrote:
>>>
>>>>> FWIW, before all the examples are changed to data frame
>>>>> variants, I think there's fairly good reason to have at
>>>>> least _one_ example that does _not_ place variables in a
>>>>> data frame.
>>>>>
>>>>> The data argument in lm() is optional. And there is more
>>>>> than one way to manage data in a project. I personally
>>>>> don't much like lots of stray variables lurking about,
>>>>> but if those are the only variables out there and we can
>>>>> be sure they aren't affected by other code, it's hardly
>>>>> essential to create a data frame to hold something you
>>>>> already have. Also, attach() is still part of R, for
>>>>> those folk who have a data frame but want to reference
>>>>> the contents across a wider range of functions without
>>>>> using with() a lot. lm() can reasonably omit the data
>>>>> argument there, too.
>>>>>
>>>>> So while there are good reasons to use data frames, there
>>>>> are also good reasons to provide examples that don't.
>>>>>
>>>>> Steve Ellison
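
The three ways of supplying variables mentioned above (free-standing workspace vectors, with(), and the data argument) can be sketched side by side. The data here are simulated purely for illustration, and the object names are hypothetical.

```r
set.seed(1)
d <- data.frame(x = rnorm(10))
d$y <- 1 + 2 * d$x + rnorm(10)

# (a) explicit data argument
fit_data <- lm(y ~ x, data = d)

# (b) no data argument; with() makes the columns of d visible to lm()
fit_with <- with(d, lm(y ~ x))

# (c) no data argument; variables live in the workspace
x <- d$x
y <- d$y
fit_free <- lm(y ~ x)
rm(x, y)  # tidy up the stray vectors afterwards

# All three fits have the same coefficients
all.equal(coef(fit_data), coef(fit_with))
all.equal(coef(fit_data), coef(fit_free))
```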
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: R-devel [mailto:r-devel-bounces using r-project.org] On Behalf Of Ben Bolker
>>>>>> Sent: 13 December 2018 20:36
>>>>>> To: r-devel using r-project.org
>>>>>> Subject: Re: [Rd] Documentation examples for lm and glm
>>>>>>
>>>>>>
>>>>>> Agree. Or just create the data frame with those variables in it
>>>>>> directly ...
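
A minimal sketch of creating the data frame directly, using the Dobson counts from the glm() help page example. Building d.AD in a single data.frame() call is the assumption here; the help page example assigns counts, outcome and treatment in the workspace first.

```r
# Build the data frame in one step, so that no free-standing copies of
# counts, outcome and treatment are left in the workspace.
d.AD <- data.frame(
  treatment = gl(3, 3),
  outcome   = gl(3, 1, 9),
  counts    = c(18, 17, 15, 20, 10, 20, 25, 13, 12)
)

# Fit the Poisson model, taking all variables from d.AD
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson(), data = d.AD)
coef(glm.D93)
```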
>>>>>>
>>>>>> On 2018-12-13 3:26 p.m., Thomas Yee wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> something that has been on my mind for a decade or two has
>>>>>>> been the examples for lm() and glm(). They encourage poor style
>>>>>>> because of mismanagement of data frames. Also, having the
>>>>>>> variables in a data frame means that predict()
>>>>>>> is more likely to work properly.
>>>>>>>
>>>>>>> For lm(), the variables should be put into a data frame.
>>>>>>> As 2 vectors are assigned first in the general workspace they
>>>>>>> should be deleted afterwards.
>>>>>>>
>>>>>>> For the glm(), the data frame d.AD is constructed but not used. Also,
>>>>>>> its 3 components were assigned first in the general workspace, so they
>>>>>>> float around dangerously afterwards like in the lm() example.
>>>>>>>
>>>>>>> Rather than attaching improved .Rd files here, they are put at
>>>>>>> www.stat.auckland.ac.nz/~yee/Rdfiles
>>>>>>> You are welcome to use them!
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Thomas
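
The remark above about predict() can be illustrated with a small sketch. The data are simulated and all object names are hypothetical. It relies on the fact that a model fitted with d$y ~ d$x has a predictor literally named `d$x`, which newdata cannot supply.

```r
set.seed(2)
d <- data.frame(x = rnorm(10))
d$y <- 1 + 2 * d$x + rnorm(10)
new <- data.frame(x = c(0, 1, 2))

# With a data argument the predictor is called "x", so newdata is used
fit_good <- lm(y ~ x, data = d)
p_good <- predict(fit_good, newdata = new)
length(p_good)  # 3: one prediction per row of `new`

# Without it the predictor is `d$x`; predict() falls back to the original
# d, ignores `new`, and issues a warning (suppressed here)
fit_bad <- lm(d$y ~ d$x)
p_bad <- suppressWarnings(predict(fit_bad, newdata = new))
length(p_bad)   # 10: the fitted values for the original data
```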
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-devel using r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel