[R-sig-Geo] API documentation?

Roger Bivand Roger.Bivand at nhh.no
Tue Jul 1 11:38:42 CEST 2008


On Tue, 1 Jul 2008, David Hugh-Jones wrote:

> I thought about this some more. One solution would be some wiki-like
> documentation which people could easily edit as they learnt. The
> obvious place is the R wiki ( http://wiki.r-project.org/rwiki/ ). At
> the moment there's no section on spatial data and the spatial
> statistics section just points you at the spatial view on CRAN.
>
> Things I would like to know
> - a list of external data types and how to get them into and out of R
> - a list of internal R data representations (RArcInfo, spatstat etc.)
> and how to convert between them
> - a list of things to do to data (subset, thin, measure distances,
> graphing etc.) and what packages do them
>
> I am sure other people have their own thoughts - e.g. I haven't even
> mentioned data analysis or statistics yet - so I am just going to
> start hacking at
> http://wiki.r-project.org/rwiki/doku.php?id=tips:spatial-data .
> Helpers would be welcome - people who know a lot can provide answers,
> and if (like me) you know barely anything, then you'll know what
> questions need answering.

Excellent. Please do support David's initiative - I'll add a link to the 
task view very soon.
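
As a possible first entry for such a page, here is a minimal sketch of
getting boundary coordinates out of a SpatialPolygons object without
hard-coding the [[1]] indices, and of treating a SpatialPolygonsDataFrame
as a data frame. It assumes only that sp is installed; the object names
(sp_obj, spdf, all_xy) and the toy squares are made up for illustration.

```r
library(sp)

# Two unit squares built by hand so the example is self-contained.
# Rings are given clockwise, which sp treats as "not a hole".
sq1 <- Polygon(cbind(c(0, 0, 1, 1, 0), c(0, 1, 1, 0, 0)))
sq2 <- Polygon(cbind(c(1, 1, 2, 2, 1), c(0, 1, 1, 0, 0)))
sp_obj <- SpatialPolygons(list(Polygons(list(sq1), ID = "a"),
                               Polygons(list(sq2), ID = "b")))

# Walk every feature and every ring instead of indexing with [[1]].
ring_coords <- lapply(sp_obj@polygons, function(feature) {
  lapply(feature@Polygons, function(ring) ring@coords)
})

# Stack all rings into one two-column matrix of x and y values.
all_xy <- do.call(rbind, unlist(ring_coords, recursive = FALSE))

# coordinates() on SpatialPolygons gives one label point per feature,
# not the boundary vertices.
centres <- coordinates(sp_obj)

# Attach attributes (row names must match the polygon IDs) and use the
# result like a data frame.
attrs <- data.frame(response = c(1, 3), row.names = c("a", "b"))
spdf <- SpatialPolygonsDataFrame(sp_obj, attrs)
spdf$doubled <- 2 * spdf$response   # add a column as on a data.frame
first <- spdf[1, ]                  # "[" keeps geometry and attributes together
```

The slot access is the same as in David's one-liner; the lapply() calls
only make it cover all features and rings.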

Roger

>
> David Hugh-Jones
> PhD Candidate
> Essex University Department of Government
> http://davidhughjones.googlepages.com
>
>
> 2008/7/1 Roger Bivand <Roger.Bivand at nhh.no>:
>> On Mon, 30 Jun 2008, tom sgouros wrote:
>>
>>>
>>> Roger Bivand <Roger.Bivand at nhh.no> wrote:
>>>
>>>
>>>>> I don't mean to rant, but believe me, I've spent plenty of time with
>>>>> the documentation and it's really not helping.
>>>>>
>>>>> Partly this is a problem of R's doc format which treats package
>>>>> documentation as an alphabetical list of functions - which gives me no
>>>>> idea where to start.
>>>
>>> I would tend to second this.  I've been lurking on the list for a few
>>> months, hoping to learn a bit, but so far without much success, since
>>> the conversation and the documentation are so far above where I am and
>>> what I need.  I am almost familiar with R, using it for time-series
>>> statistics, and learned early on that the Dalgaard book was a better
>>> intro than any of the real R documents.  It doesn't cover nearly
>>> everything, but it seems to cover what I needed, so I use what is
>>> probably a baby-level set of R functions, but it's adequate.  Without a
>>> professor lurking over my shoulder to explain stuff, I am perpetually
>>> slightly lost, because all the documentation assumes I know stuff that I
>>> don't.
>>>
>>> I still use R because I know that with enough poking around it will
>>> eventually provide a solution.  But if the alternatives were not very
>>> expensive, I would have given up a while ago.
>>>
>>> I joined this group when I wanted to expand into making maps of
>>> geographical economic data, and after a month of working on the problem,
>>> I essentially had to give up for the time being.  I wish there were an
>>> introduction that showed me how to use R with a GIS program, but to my
>>> knowledge, there is not.  I did run across a GRASS book that claimed it
>>> would help, but as I recall, it cost upward of US$100.
>>
>> I would be happy to link to such a guide from the Spatial Task View, for
>> example on the R-Geo site. There are other nice resources, for instance
>> Dylan Beaudette's site - one page is:
>>
>> http://casoilresource.lawr.ucdavis.edu/drupal/node/100
>>
>> Seen from the developer side, it is hard to know what users see as the most
>> useful advice. The courses that have been provided - say like:
>>
>> http://www.bias-project.org.uk/ASDARcourse/
>>
>> are rather "developer-view", as indeed the forthcoming "useR" series book
>> will be. By "developer-view", I mean attempting to provide information both
>> for beginning users and trying to advance along the useR-developeR continuum
>> where experience has shown that this may be advisable, even though neither
>> desired nor immediately applauded.
>>
>> A typical immediate response to the courses has been that "all that class
>> and coordinate reference system stuff is unnecessary". This seems to hold
>> until the participants actually get to do work with their own data, at which
>> point having a reference to what is going on is handy. The specific
>> difficulty, as teachers often find, is that the initial expectation from the
>> user is often not the most fruitful question for helping the user to become
>> more self-reliant going forward.
>>
>> One clear reason for this difficulty is that many different disciplines use
>> spatial data, and all of them seem to feel that they know enough for their
>> internal purposes, so get frustrated when they encounter barriers which are
>> inherent in their perception(s) of spatial data. So listening to other
>> disciplines and learning from them can be helpful.
>>
>> As far as sp classes are concerned, is the R-News note of 2005 too outdated
>> to be helpful? Should it be placed more prominently on the Task View page?
>> I would acknowledge that ease of use is not what it could be; we are still
>> where time series (and time representation) were in R a couple of years ago.
>> However, the sp classes ought to work for many who do not need to manipulate
>> the actual coordinates. For working statisticians, simple mapping of model
>> residuals is no more than:
>>
>> library(sp)
>> library(rgdal)
>> mydata <- readOGR(dsn="directory", layer="shpfile")
>> # or:
>> # library(maptools)
>> # mydata <- readShapeSpatial("shpfile.shp")
>> # readShapeSpatial() is available from maptools release 0.7-14, already
>> # submitted to CRAN; until then you have to know whether your shapefile
>> # holds points, lines or polygons
>> myobj <- lm(response ~ x1 + x2, data=mydata)
>> mydata$residuals <- residuals(myobj)
>> spplot(mydata, "residuals")
>>
>> where the mydata Spatial*DataFrame behaves as a data.frame. The two-faced
>> nature of the Spatial*DataFrame classes is intentional, looking like GIS
>> data models for GIS people, and data frames for statisticians. But
>> manipulating coordinates is just a good deal more complicated - unless you
>> just need subsetting with the "[" methods.
>>
>> To summarise, contributions of user tips and examples, and links to those
>> examples, would be very welcome.
>>
>> Roger
>>
>>>
>>> To make this more useful than just a rant, I would second David's point.
>>> What is missing is only what David misses: an introduction that says
>>> where to start to deal with simple geographic data, maybe providing a
>>> few examples of common techniques and frequent problems, and pushing
>>> data back and forth to some GIS.  I was not able to find that, and
>>> without it, found the R documentation pretty much useless.  I'd be happy
>>> to know of some source I hadn't found before, so if you have one to
>>> recommend, please do.
>>>
>>> -tom
>>>
>>>
>>>
>>>>
>>>> This is an inherent (and perhaps ugly) characteristic of the S4
>>>> object/class structure as you suggest below. New style classes are not
>>>> as well integrated into the documentation as straight functions
>>>> are. Here, coerce is as(), but the issue of how to improve
>>>> documentation is not resolved.
>>>>
>>>>>
>>>>> This then interacts badly with the OO structure. For example, look at
>>>>> the 20+ pages on "coerce". Hmm, what does "coerce" actually do? In
>>>>> fact that's in a whole different library. But I didn't know that, so I
>>>>> click on a page at random, say
>>>>>
>>>>> coerce,SpatialGrid,data.frame-method
>>>>>
>>>>> and this takes me to SpatialGrid class - which doesn't mention coerce
>>>>> at all. (Nor does it tell me what SpatialGrid is, or what it is used
>>>>> for.)
>>>>>
>>>>> On the other hand, maybe I might guess that to get a list of
>>>>> coordinates, I'd use "coordinates". So I click on that method, and it
>>>>> tells me yes, this "retrieves spatial coordinates". But unfortunately
>>>>> it retrieves them hidden inside another object ("an object of class
>>>>> SpatialPointsDataFrame"). OK, but how do I get the _actual_data_?
>>>>> Maybe the SpatialPointsDataFrame class page will tell me. Nope. Et
>>>>> cetera.
>>>>>
>>>>> Rick: yes, I agree that using the internal data structures is how to
>>>>> do things, but this is broken isn't it? The whole point of having OO
>>>>> is to be able to use it _without_ understanding the internal data
>>>>> structures. The ideal, in other words, would be to have a "thin.lines"
>>>>> method that I could just run on any polygon or set of polygons.
>>>>> Failing that, then I should be able to get at the internal data
>>>>> without hours of head scratching.
>>>>>
>>>>
>>>> No, because the underlying understanding of dp and other methods for
>>>> thinning is that the objects implement an arc-node topological model,
>>>> so that each arc can be thinned without different thinning happening
>>>> on otherwise identical boundaries of neighbouring polygons. But we do
>>>> not have an arc-node representation, so there cannot be line thinning
>>>> for polygon boundaries in a spaghetti world.
>>>>
>>>>> Right now, it's like, everything is hidden behind a layer of classes
>>>>> and slots and methods, but I still need to go behind that layer to get
>>>>> at the actual raw data, and this is so complicated and confusing that
>>>>> it would be easier just to work with the raw data.
>>>>>
>>>>
>>>> You need to build topology first, so if need be take the data out to a
>>>> GIS that does topology properly, do the arc line thinning there, and
>>>> bring it back in. Building topology from a stream of straight line
>>>> segments is a serious challenge, especially if you want to retain the
>>>> association with attribute data.
>>>>
>>>> Roger
>>>>
>>>>> OK, I'll stop venting. If there's anything I could do to improve this
>>>>> situation, I would gladly try.
>>>>>
>>>>> David Hugh-Jones
>>>>> PhD Candidate
>>>>> Essex University Department of Government
>>>>> http://davidhughjones.googlepages.com
>>>>>
>>>>>
>>>>> 2008/6/30 Virgilio Gomez-Rubio <v.gomezrubio at imperial.ac.uk>:
>>>>>>
>>>>>> Dear David,
>>>>>>
>>>>>> Probably the best way to start is by checking the HTML documentation. It
>>>>>> should be installed locally but it is also accessible, for example, here:
>>>>>>
>>>>>> http://finzi.psych.upenn.edu/R/library/sp/html/00Index.html
>>>>>>
>>>>>> Hope this helps.
>>>>>>
>>>>>> Virgilio
>>>>>>
>>>>>> On Mon, 2008-06-30 at 18:48 +0200, David Hugh-Jones wrote:
>>>>>>>
>>>>>>> Thanks David for his comment about dp.
>>>>>>>
>>>>>>> Quick question: is there any reasonably comprehensible API
>>>>>>> documentation for the "sp" package? I have just spent about an hour
>>>>>>> trying to get a list of points from a SpatialPolygons object. I
>>>>>>> eventually just printed everything out and found the data by hand, so
>>>>>>> now I am doing:
>>>>>>>
>>>>>>> coords <- myobject@polygons[[1]]@Polygons[[1]]@coords
>>>>>>>
>>>>>>> but I don't assume that is right. Surely there must be some simple way
>>>>>>> to get a list of x and y coords out of any object?
>>>>>>>
>>>>>>> in frustration,
>>>>>>> David Hugh-Jones
>>>>>>> PhD Candidate
>>>>>>> Essex University Department of Government
>>>>>>> http://davidhughjones.googlepages.com
>>>>>>>
>>>>>>>
>>>>>>> 2008/6/30 David PINAUD <pinaud at cebc.cnrs.fr>:
>>>>>>>>
>>>>>>>> maybe you can try the function dp() in the package "shapefiles", which is
>>>>>>>> an implementation of the Douglas-Peucker polyLine simplification algorithm.
>>>>>>>> Hope it helps
>>>>>>>> David
>>>>>>>>
>>>>>>>> David Hugh-Jones a écrit :
>>>>>>>>>
>>>>>>>>> Hi all
>>>>>>>>>
>>>>>>>>> I have a big dataset of points and am doing stuff on them that takes a
>>>>>>>>> lot of time. To speed it up, I would like to use "thinlines" from
>>>>>>>>> RArcInfo, which basically makes the maps "rougher" by throwing away
>>>>>>>>> points. Is there an equivalent function for SpatialPolygon type
>>>>>>>>> objects? (I assume that there's no way to convert _to_ Arcinfo, though
>>>>>>>>> I know it's possible to read from it).
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>>
>>>>>>>>> David Hugh-Jones
>>>>>>>>> PhD Candidate
>>>>>>>>> Essex University Department of Government
>>>>>>>>> http://davidhughjones.googlepages.com
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> R-sig-Geo mailing list
>>>>>>>>> R-sig-Geo at stat.math.ethz.ch
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ***************************************************
>>>>>>>> David PINAUD
>>>>>>>> Ingénieur de Recherche "Analyses spatiales"
>>>>>>>>
>>>>>>>> Centre d'Etudes Biologiques de Chizé - CNRS UPR1934
>>>>>>>> 79360 Villiers-en-Bois, France poste 485
>>>>>>>> Tel: +33 (0)5.49.09.35.58
>>>>>>>> Fax: +33 (0)5.49.09.65.26
>>>>>>>> http://www.cebc.cnrs.fr/
>>>>>>>>
>>>>>>>> ***************************************************
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Roger Bivand
>>>> Economic Geography Section, Department of Economics, Norwegian School of
>>>> Economics and Business Administration, Helleveien 30, N-5045 Bergen,
>>>> Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
>>>> e-mail: Roger.Bivand at nhh.no
>>>
>>>
>>>
>>
>>
>>
>
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no

