[R-sig-Geo] Combining polygons and calculating their area (i.e. number of cells)

Roger Bivand Roger.Bivand at nhh.no
Fri Dec 20 20:10:43 CET 2013


On Fri, 20 Dec 2013, Josh O'Brien wrote:

> On Fri, Dec 20, 2013 at 5:38 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
>
>> On Thu, 19 Dec 2013, Josh O'Brien wrote:
>>
>>  On Thu, Dec 19, 2013 at 4:23 AM, Roger Bivand <Roger.Bivand at nhh.no>
>>> wrote:
>>>
>>>  On Wed, 18 Dec 2013, Josh O'Brien wrote:
>>>>
>>>>
>>> <...snip...>
>>>
>>>> By the way, always avoid accessing S4 objects directly using @; do use
>>>> slot(obj, "slotname") - the sapply should read:
>>>>
>>>> area = sapply(slot(SPclus, "polygons"), slot, "area")
>>>>
>>>> for the SO version with possibly incorrect areas, and
>>>>
>>>> area = gArea(SPclus, byid=TRUE)
>>>>
>>>> for correct ones.
>>>>
>>>
>>>
>>> Would you mind explaining why the functional form, slot(obj, "slotname"),
>>> should always be used instead of obj@slotname? I've seen this admonition
>>> repeatedly -- I think just from you -- and don't know whether it's a
>>> purely stylistic preference on your part, or whether there is some other
>>> rationale for preferring that form.
>>>
>>
>> When sp was written (2003-5), we chose to use S4 (new style) classes. We
>> used Chambers (1998), referring to Ch. 7, and on slots pp. 290-292. There
>> the distinction between S3 (old style) "$" and "$<-" access and replacement
>> methods, and S4 "@" and particularly "@<-" was made more forcefully than in
>> Chambers (2008). Contemporary uses described in Venables and Ripley (2000)
>> also distinguish between the two.
>>
>> All of these point to the formal use of S4 class definitions, not least to
>> ensure that storage mode checking when using .C() and .Call() ceases to be
>> so time-consuming. This is an issue with S3 classes, because there is
>> nothing to stop the user modifying the storage mode of list components,
>> with potentially bad consequences in compiled code. Defensive changes in
>> the underlying R engine to detect mode mismatch were introduced very much
>> later, I believe, so mode mismatch could crash the engine until then.
>>
>> For both S3 and S4 classes, the user is encouraged to use access functions
>> where provided. If the classes and methods are sufficiently well written,
>> there should only be a few occasions in which the user might want to access
>> components (S3) or slots (S4) that are not exposed via methods. If scripts
>> consistently contain @, and no access or replacement methods are provided,
>> consider asking the package maintainer to add the missing functionality.
>> slot() is a little less ugly, but the user shouldn't really need it either,
>> unless something inside an object has to be shown or manipulated.
>>
>> In this case, the "area" slot is documented, but precisely because it is
>> not intended to be used as a measure of area, there is no access method.
>>
>> The danger is that "@<-" and "$<-" are used to insert values into
>> components/slots without sufficient care being taken; access is perhaps
>> less of a problem.
>>
>> I particularly react to usages such as:
>>
>> sdf@data$var
>>
>> for sdf a Spatial*DataFrame object, as "$" and "$<-" methods *are*
>> provided to let these objects appear to be data.frame objects. This usage
>> is redundant, and displays ignorance about the class/method systems in S
>> and R. Of course, all are free to write what they like, so my preferences
>> may be just a matter of taste, but at least they are based on the books
>> written to establish the structure of the language.
>>
>> Hope this clarifies,
>>
>> Roger
>>
>
> None of that seems like just a matter of taste _except_ perhaps for the
> preference for slot(obj, "slotname") over obj@slotname (which, by the way,
> is used extensively in the sp package's code base).

Thanks for responding - use of "@" is said in the books to be OK inside 
functions, especially in the package defining the classes. I think one 
should expect package authors to maintain consistency in slot names and 
types themselves!

However, users should prefer access and replacement functions, and ask for 
more if they aren't there when needed. The same may well apply to other 
package authors importing classes - using access functions protects 
against road bumps if class definitions get changed (we get told by CRAN 
if our changes break downstream packages).
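
A minimal sketch of the point about access and replacement functions,
using only the methods package. The class "Parcel" and the functions
parcelArea() and "parcelArea<-"() are hypothetical names invented for
this illustration; they are not part of sp. The sketch assumes the class
author wants the validity check to run on every replacement:

```r
library(methods)

# Hypothetical toy class (not an sp class) with a validity check.
setClass("Parcel",
         representation(id = "character", area = "numeric"),
         validity = function(object) {
           if (length(object@area) != 1L || is.na(object@area) ||
               object@area < 0)
             "area must be a single non-negative number"
           else TRUE
         })

# Access function: callers do not need to know the slot name.
setGeneric("parcelArea", function(obj) standardGeneric("parcelArea"))
setMethod("parcelArea", "Parcel", function(obj) slot(obj, "area"))

# Replacement function: re-validates the object after the change.
setGeneric("parcelArea<-",
           function(obj, value) standardGeneric("parcelArea<-"))
setReplaceMethod("parcelArea", "Parcel", function(obj, value) {
  slot(obj, "area") <- value
  validObject(obj)
  obj
})

p <- new("Parcel", id = "a", area = 12.5)
parcelArea(p) <- 20          # passes the validity check

# Direct slot assignment checks only the class of the value, so a
# negative area is accepted without complaint.
q <- p
q@area <- -1

# The replacement function refuses the same value; p is left unchanged.
bad <- tryCatch({ parcelArea(p) <- -1; FALSE },
                error = function(e) TRUE)
```

If the slot were later renamed, only the two methods would need to
change; scripts calling parcelArea() would keep working.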

Roger


>
> Thanks for your thoughtful and enlightening reply,
>
> - Josh

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


