[R] What's data() for?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri May 14 13:59:42 CEST 2010
On Fri, 14 May 2010, Duncan Murdoch wrote:
> On 14/05/2010 5:35 AM, (Ted Harding) wrote:
>> On 13-May-10 23:43:58, yjmha69 wrote:
>>
>>> Hi there,
>>>
>>>
>>> > library(faraway)
>>> > pima
>>>   pregnant glucose diastolic triceps insulin  bmi diabetes age test
>>> 1        6     148        72      35       0 33.6    0.627  50    1
>>> 2        1      85        66      29       0 26.6    0.351  31    0
>>>
>>>
>>> > data(pima)
>>> > pima
>>>   pregnant glucose diastolic triceps insulin  bmi diabetes age test
>>> 1        6     148        72      35       0 33.6    0.627  50    1
>>> 2        1      85        66      29       0 26.6    0.351  31    0
>>>
>>> As you can see, I can already use pima without running data(pima),
>>> after running data(pima), it looks the same. So what's the reason to
>>> use data(pima) ?
>>>
>>> Thanks
>>> YJM
>>>
>>
>> The difference is that data(pima) will load the dataset pima
>> (which can be found in the package "faraway") without the use
>> of library(faraway). It won't load anything else from faraway.
>>
>
> That won't work. Unless you attach faraway, R won't know what "pima" refers
> to, and will just give an error.

But

    data("pima", package="faraway")

will. And if you do that, you can run rm(pima); gc() to remove the object
from the session completely, something you cannot do with lazy-loaded
data.

That is, I think, the main attraction of not using lazy-loading for
datasets that will be used for only a small part of a session.
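
A minimal sketch of that load-then-remove sequence follows. It is an
illustration, not something tested against faraway: it uses "cars" from
the always-installed datasets package as a stand-in, on the assumption
that faraway may not be installed. With faraway available,
data("pima", package = "faraway") should follow the same pattern. The
envir argument is spelled out only to make explicit where the copy lands,
and the names loaded and removed are hypothetical helpers.

```r
# Stand-in for data("pima", package = "faraway"): load one named dataset
# into the workspace by naming its package explicitly.
data("cars", package = "datasets", envir = globalenv())

# A copy of the dataset now sits in the workspace itself.
loaded <- exists("cars", envir = globalenv(), inherits = FALSE)

# Drop that copy and let R reclaim the memory.
rm("cars", envir = globalenv())
invisible(gc())
removed <- !exists("cars", envir = globalenv(), inherits = FALSE)

print(c(loaded = loaded, removed = removed))
```
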
> The difference between data(pima) and pima is that, in this case, there isn't
> really much of one, but in other cases there might be. Prior to the
> introduction of lazy loading of data, it always made a difference: the pima
> object wouldn't be loaded into memory until requested by data(pima). With
> lazy loading, a stub for the object is always in memory, with the main part
> of the object only loaded on first use. Many packages (including faraway)
> use lazy loading of data so data() is to some extent unnecessary: but there
> are some circumstances under which lazy loading won't work, so a few packages
> don't use it, and I believe it is not the default.
>
> Duncan Murdoch
>> When you use library(faraway) you will load everything in the
>> package faraway, including of course the dataset pima (which is
>> why you see no difference, since that dataset is the same whichever
>> way you load it).
>>
>> So with data() you put less load on your system, and also avoid
>> possible conflicts between what you already have in your environment
>> and what would be brought in when you do library(faraway).
>>
>> Ted.
>>
>> --------------------------------------------------------------------
>> E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
>> Fax-to-email: +44 (0)870 094 0861
>> Date: 14-May-10 Time: 10:35:15
>> ------------------------------ XFMail ------------------------------
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595