[R-sig-Geo] API documentation?

Roger Bivand Roger.Bivand at nhh.no
Tue Jul 1 08:51:41 CEST 2008


On Mon, 30 Jun 2008, tom sgouros wrote:

>
> Roger Bivand <Roger.Bivand at nhh.no> wrote:
>
>
>>> I don't mean to rant, but believe me, I've spent plenty of time with
>>> the documentation and it's really not helping.
>>>
>>> Partly this is a problem of R's doc format which treats package
>>> documentation as an alphabetical list of functions - which gives me no
>>> idea where to start.
>
> I would tend to second this.  I've been lurking on the list for a few
> months, hoping to learn a bit, but so far without much success, since
> the conversation and the documentation are so far above where I am and
> what I need.  I am almost familiar with R, using it for time-series
> statistics, and learned early on that the Dalgaard book was a better
> intro than any of the real R documents.  It doesn't cover nearly
> everything, but it seems to cover what I needed, so I use what is
> probably a baby-level set of R functions, but it's adequate.  Without a
> professor lurking over my shoulder to explain stuff, I am perpetually
> slightly lost, because all the documentation assumes I know stuff that I
> don't.
>
> I still use R because I know that with enough poking around it will
> eventually provide a solution.  But if the alternatives were not very
> expensive, I would have given up a while ago.
>
> I joined this group when I wanted to expand into making maps of
> geographical economic data, and after a month of working on the problem,
> I essentially had to give up for the time being.  I wish there were an
> introduction that showed me how to use R with a GIS program, but to my
> knowledge, there is not.  I did run across a GRASS book that claimed it
> would help, but as I recall, it cost upward of US$100.

I would be happy to link to such a guide from the Spatial Task View, for 
example on the R-Geo site. There are other nice resources, for instance 
Dylan Beaudette's site - one page is:

http://casoilresource.lawr.ucdavis.edu/drupal/node/100

Seen from the developer side, it is hard to know what users see as the 
most useful advice. The courses that have been provided, such as:

http://www.bias-project.org.uk/ASDARcourse/

are rather "developer-view", as indeed the forthcoming "useR" series book 
will be. By "developer-view", I mean that they try both to inform beginning 
users and to move them along the useR-developeR continuum where experience 
has shown that this may be advisable, even though it is neither desired nor 
immediately applauded.

A typical immediate response to the courses has been that "all that class 
and coordinate reference system stuff is unnecessary". This seems to hold 
until the participants actually get to do work with their own data, at 
which point having a reference to what is going on is handy. The specific 
difficulty, as teachers often find, is that the user's initial expectation 
is often not the most fruitful starting point for helping that user to 
become more self-reliant.

One clear reason for this difficulty is that many different disciplines 
use spatial data, and all of them seem to feel that they know enough for 
their internal purposes, so get frustrated when they encounter barriers 
which are inherent in their perception(s) of spatial data. So listening to 
other disciplines and learning from them can be helpful.

As far as sp classes are concerned, is the R-News note of 2005 too 
outdated to be helpful? Should it be placed more prominently on the Task 
View page? I would acknowledge that ease of use is not what it could be; 
we are still where time series (and time representation) were in R a 
couple of years ago. However, the sp classes ought to work for many who do 
not need to manipulate the actual coordinates. For working statisticians, 
simple mapping of model residuals is no more than:

library(sp)
library(rgdal)
mydata <- readOGR(dsn="directory", layer="shpfile")
# or:
# library(maptools)
# mydata <- readShapeSpatial("shpfile.shp")
# (readShapeSpatial() is available from maptools release 0.7-14, already
# submitted to CRAN; with earlier releases you have to know whether your
# shapefile holds points, lines or polygons)
myobj <- lm(response ~ x1 + x2, data=mydata)
mydata$residuals <- residuals(myobj)
spplot(mydata, "residuals")

where the mydata Spatial*DataFrame behaves as a data.frame. The two-faced 
nature of the Spatial*DataFrame classes is intentional, looking like GIS 
data models for GIS people, and data frames for statisticians. But 
manipulating coordinates is just a good deal more complicated - unless 
you just need subsetting with the "[" methods.
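
As a minimal sketch of that two-faced behaviour, assuming the meuse data 
set shipped with sp stands in for your own data (the model formula and the 
0.1 distance cut-off are only illustrative choices, not recommendations):

```r
library(sp)

data(meuse)                      # plain data.frame shipped with sp
coordinates(meuse) <- ~ x + y    # promote it to a SpatialPointsDataFrame

# data.frame face: the formula interface and $ assignment work as usual
fit <- lm(log(zinc) ~ sqrt(dist), data = meuse)
meuse$residuals <- residuals(fit)

# "[" subsetting keeps coordinates and attributes together
near_river <- meuse[meuse$dist < 0.1, ]

# GIS face: the subset is still a spatial object
class(near_river)
head(coordinates(near_river))    # matrix of x and y values
bbox(near_river)
```

The subset can then be passed to spplot() in the same way as the full 
object.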

To summarise, contributions of user tips and examples, and links to those 
examples, would be very welcome.
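
One such tip, for the question quoted further down about getting x and y 
coordinates out of a SpatialPolygons object: a minimal sketch using made-up 
toy polygons (the names sq, tri and sp_polys are hypothetical). It walks 
the nested lists with slot() and lapply() instead of indexing a single 
ring by hand, and assumes only the polygons, Polygons and coords slots:

```r
library(sp)

# Hypothetical toy data: a unit square and a triangle, both closed rings
# given in clockwise order
sq  <- Polygon(cbind(c(0, 0, 1, 1, 0), c(0, 1, 1, 0, 0)))
tri <- Polygon(cbind(c(2, 2, 3, 2), c(0, 1, 0, 0)))
sp_polys <- SpatialPolygons(list(Polygons(list(sq),  ID = "square"),
                                 Polygons(list(tri), ID = "triangle")))

# Outer list: one Polygons object per feature.
# Inner list: one Polygon ring per part (or hole) of that feature.
ring_coords <- lapply(slot(sp_polys, "polygons"), function(p) {
  lapply(slot(p, "Polygons"), function(ring) slot(ring, "coords"))
})
names(ring_coords) <- sapply(slot(sp_polys, "polygons"), slot, "ID")

# Stack every ring into one two-column matrix of vertices
all_xy <- do.call(rbind, lapply(ring_coords,
                                function(rings) do.call(rbind, rings)))
```

Here ring_coords keeps the feature and ring structure, while all_xy 
flattens it, so which one is useful depends on whether the ring boundaries 
matter for the task.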

Roger

>
> To make this more useful than just a rant, I would second David's point.
> What is missing is only what David misses: an introduction that says
> where to start to deal with simple geographic data, maybe providing a
> few examples of common techniques and frequent problems, and pushing
> data back and forth to some GIS.  I was not able to find that, and
> without it, found the R documentation pretty much useless.  I'd be happy
> to know of some source I hadn't found before, so if you have one to
> recommend, please do.
>
> -tom
>
>
>
>>
>> This is an inherent (and perhaps ugly) characteristic of the S4
>> object/class structure as you suggest below. New style classes are not
>> as well integrated into the documentation as straight functions
>> are. Here, coerce is as(), but the issue of how to improve
>> documentation is not resolved.
>>
>>>
>>> This then interacts badly with the OO structure. For example, look at
>>> the 20+ pages on "coerce". Hmm, what does "coerce" actually do? In
>>> fact that's in a whole different library. But I didn't know that, so I
>>> click on a page at random, say
>>>
>>> coerce,SpatialGrid,data.frame-method
>>>
>>> and this takes me to SpatialGrid class - which doesn't mention coerce
>>> at all. (Nor does it tell me what SpatialGrid is, or what it is used
>>> for.)
>>>
>>> On the other hand, maybe I might guess that to get a list of
>>> coordinates, I'd use "coordinates". So I click on that method, and it
>>> tells me yes, this "retrieves spatial coordinates". But unfortunately
>>> it retrieves them hidden inside another object ("an object of class
>>> SpatialPointsDataFrame"). OK, but how do I get the _actual_data_?
>>> Maybe the SpatialPointsDataFrame class page will tell me. Nope. Et
>>> cetera.
>>>
>>> Rick: yes, I agree that using the internal data structures is how to
>>> do things, but this is broken isn't it? The whole point of having OO
>>> is to be able to use it _without_ understanding the internal data
>>> structures. The ideal, in other words, would be to have a "thin.lines"
>>> method that I could just run on any polygon or set of polygons.
>>> Failing that, then I should be able to get at the internal data
>>> without hours of head scratching.
>>>
>>
>> No, because the underlying understanding of dp and other methods for
>> thinning is that the objects implement an arc-node topological model,
>> so that each arc can be thinned without different thinning happening
>> on otherwise identical boundaries of neighbouring polygons. But we do
>> not have an arc-node representation, so there cannot be line thinning
>> for polygon boundaries in a spaghetti world.
>>
>>> Right now, it's like, everything is hidden behind a layer of classes
>>> and slots and methods, but I still need to go behind that layer to get
>>> at the actual raw data, and this is so complicated and confusing that
>>> it would be easier just to work with the raw data.
>>>
>>
>> You need to build topology first, so if need be take the data out to a
>> GIS that does topology properly, do the arc line thinning there, and
>> bring it back in. Building topology from a stream of straight line
>> segments is a serious challenge, especially if you want to retain the
>> association with attribute data.
>>
>> Roger
>>
>>> OK, I'll stop venting. If there's anything I could do to improve this
>>> situation, I would gladly try.
>>>
>>> David Hugh-Jones
>>> PhD Candidate
>>> Essex University Department of Government
>>> http://davidhughjones.googlepages.com
>>>
>>>
>>> 2008/6/30 Virgilio Gomez-Rubio <v.gomezrubio at imperial.ac.uk>:
>>>> Dear David,
>>>>
>>>> Probably the best way to start is by checking the HTML documentation. It
>>>> should be installed locally but it is also accessible, for example, here:
>>>>
>>>> http://finzi.psych.upenn.edu/R/library/sp/html/00Index.html
>>>>
>>>> Hope this helps.
>>>>
>>>> Virgilio
>>>>
>>>> On Mon, 2008-06-30 at 18:48 +0200, David Hugh-Jones wrote:
>>>>> Thanks David for his comment about dp.
>>>>>
>>>>> Quick question: is there any reasonably comprehensible API
>>>>> documentation for the "sp" package? I have just spent about an hour
>>>>> trying to get a list of points from a SpatialPolygons object. I
>>>>> eventually just printed everything out and found the data by hand, so
>>>>> now I am doing:
>>>>>
>>>>> coords <- myobject@polygons[[1]]@Polygons[[1]]@coords
>>>>>
>>>>> but I don't assume that is right. Surely there must be some simple way
>>>>> to get a list of x and y coords out of any object?
>>>>>
>>>>> in frustration,
>>>>> David Hugh-Jones
>>>>> PhD Candidate
>>>>> Essex University Department of Government
>>>>> http://davidhughjones.googlepages.com
>>>>>
>>>>>
>>>>> 2008/6/30 David PINAUD <pinaud at cebc.cnrs.fr>:
>>>>>> maybe you can try the function dp() in the package "shapefiles", which is an
>>>>>> implementation of the Douglas-Peucker polyLine simplification algorithm.
>>>>>> Hope it helps
>>>>>> David
>>>>>>
>>>>>> David Hugh-Jones wrote:
>>>>>>>
>>>>>>> Hi all
>>>>>>>
>>>>>>> I have a big dataset of points and am doing stuff on them that takes a
>>>>>>> lot of time. To speed it up, I would like to use "thinlines" from
>>>>>>> RArcinfo, which basically makes the maps "rougher" by throwing away
>>>>>>> points. Is there an equivalent function for SpatialPolygon type
>>>>>>> objects? (I assume that there's no way to convert _to_ Arcinfo, though
>>>>>>> I know it's possible to read from it).
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> David Hugh-Jones
>>>>>>> PhD Candidate
>>>>>>> Essex University Department of Government
>>>>>>> http://davidhughjones.googlepages.com
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> R-sig-Geo mailing list
>>>>>>> R-sig-Geo at stat.math.ethz.ch
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> ***************************************************
>>>>>> David PINAUD
>>>>>> Ingénieur de Recherche "Analyses spatiales"
>>>>>>
>>>>>> Centre d'Etudes Biologiques de Chizé - CNRS UPR1934
>>>>>> 79360 Villiers-en-Bois, France poste 485
>>>>>> Tel: +33 (0)5.49.09.35.58
>>>>>> Fax: +33 (0)5.49.09.65.26
>>>>>> http://www.cebc.cnrs.fr/
>>>>>>
>>>>>> ***************************************************
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>> --
>> Roger Bivand
>> Economic Geography Section, Department of Economics, Norwegian School of
>> Economics and Business Administration, Helleveien 30, N-5045 Bergen,
>> Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
>> e-mail: Roger.Bivand at nhh.no
>
>
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no

