[Rd] Documentation examples for lm and glm

Achim Zeileis Achim.Zeileis at uibk.ac.at
Sat Dec 15 14:15:52 CET 2018


A pragmatic solution could be to create a simple linear regression example 
with variables in the global environment and then another example with a 
data.frame.
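
A minimal sketch of the two styles, assuming the built-in cars data from 
"datasets" as a stand-in. The names stop_dist, fit_global, fit_df and 
same_coef are only illustrative and not a proposal for the actual help page:

```r
## Style 1: variables in the global environment, no data argument.
speed     <- cars$speed
stop_dist <- cars$dist
fit_global <- lm(stop_dist ~ speed)

## Style 2: the same regression with the variables kept in a data.frame.
fit_df <- lm(dist ~ speed, data = cars)

## Both fits yield the same coefficient values.
same_coef <- all.equal(unname(coef(fit_global)), unname(coef(fit_df)))

## Remove the stray global variables again.
rm(speed, stop_dist)
```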

The data.frame example might be somewhat more complex, e.g., with several 
regressors and/or mixed categorical and numeric covariates, to illustrate 
how regression and analysis of (co-)variance can be combined. I like to use 
MASS's whiteside data for this:

data("whiteside", package = "MASS")
m1 <- lm(Gas ~ Temp, data = whiteside)
m2 <- lm(Gas ~ Insul + Temp, data = whiteside)
m3 <- lm(Gas ~ Insul * Temp, data = whiteside)
anova(m1, m2, m3)
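
With the variables in a data frame, prediction for new observations is also 
easy to demonstrate. A small sketch, assuming MASS is installed; the newdata 
values and the names nd and p are made up for illustration:

```r
data("whiteside", package = "MASS")
m3 <- lm(Gas ~ Insul * Temp, data = whiteside)

## Hypothetical new observations: same temperature, before and after insulation.
nd <- data.frame(
  Insul = factor(c("Before", "After"), levels = levels(whiteside$Insul)),
  Temp  = 5
)

## Predicted gas consumption for the two hypothetical cases.
p <- predict(m3, newdata = nd)
```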

Moreover, a binary-response data.frame with a few covariates might be a 
useful addition to "datasets". For example, a more granular version of the 
"Titanic" data (in addition to the 4-way table in ?Titanic). Another 
relatively straightforward data set, popular in econometrics and the social 
sciences, is the "Mroz" data; see, e.g., help("PSID1976", package = "AER").
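
As a rough sketch of the kind of glm() example this would enable, the 
existing 4-way table can already be expanded and fitted with frequency 
weights. This is only one possible workaround, and titanic_df and fit are 
illustrative names:

```r
## Expand the 4-way table into one row per cell with a count column (Freq).
titanic_df <- as.data.frame(Titanic)

## Logistic regression for survival, weighting each cell by its count.
fit <- glm(Survived ~ Class + Sex + Age, family = binomial,
           weights = Freq, data = titanic_df)
```

A data.frame with one row per passenger would make the weights argument 
unnecessary.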

I would be happy to help with these if such additions were considered for 
datasets/stats.


On Sat, 15 Dec 2018, David Hugh-Jones wrote:

> I would argue examples should encourage good practice. Beginners ought to
> learn to keep data in data frames and not to overuse attach(). Experts can
> do otherwise at their own risk, but they have less need of explicit
> examples.
>
> On Fri, 14 Dec 2018 at 14:51, S Ellison <S.Ellison using lgcgroup.com> wrote:
>
>> FWIW, before all the examples are changed to data frame variants, I think
>> there's fairly good reason to have at least _one_ example that does _not_
>> place variables in a data frame.
>>
>> The data argument in lm() is optional. And there is more than one way to
>> manage data in a project. I personally don't much like lots of stray
>> variables lurking about, but if those are the only variables out there and
>> we can be sure they aren't affected by other code, it's hardly essential to
>> create a data frame to hold something you already have.
>> Also, attach() is still part of R, for those folk who have a data frame
>> but want to reference the contents across a wider range of functions
>> without using with() a lot. lm() can reasonably omit the data argument
>> there, too.
>>
>> So while there are good reasons to use data frames, there are also good
>> reasons to provide examples that don't.
>>
>> Steve Ellison
>>
>>
>>> -----Original Message-----
>>> From: R-devel [mailto:r-devel-bounces using r-project.org] On Behalf Of Ben
>>> Bolker
>>> Sent: 13 December 2018 20:36
>>> To: r-devel using r-project.org
>>> Subject: Re: [Rd] Documentation examples for lm and glm
>>>
>>>
>>>   Agree.  Or just create the data frame with those variables in it
>>> directly ...
>>>
>>> On 2018-12-13 3:26 p.m., Thomas Yee wrote:
>>>> Hello,
>>>>
>>>> something that has been on my mind for a decade or two has
>>>> been the examples for lm() and glm(). They encourage poor style
>>>> because of mismanagement of data frames. Also, having the
>>>> variables in a data frame means that predict()
>>>> is more likely to work properly.
>>>>
>>>> For lm(), the variables should be put into a data frame.
>>>> As 2 vectors are assigned first in the general workspace they
>>>> should be deleted afterwards.
>>>>
>>>> For the glm(), the data frame d.AD is constructed but not used. Also,
>>>> its 3 components were assigned first in the general workspace, so they
>>>> float around dangerously afterwards like in the lm() example.
>>>>
>>>> Rather than attaching improved .Rd files here, I have put them at
>>>> www.stat.auckland.ac.nz/~yee/Rdfiles
>>>> You are welcome to use them!
>>>>
>>>> Best,
>>>>
>>>> Thomas
>>>>
>>>> ______________________________________________
>>>> R-devel using r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>>
>
>


