[R-sig-Geo] merging tables by columns AND row names (coordinates)

Roger Bivand Roger.Bivand at nhh.no
Fri Sep 8 09:04:46 CEST 2006


On Fri, 8 Sep 2006, Michael Sumner wrote:

> Hello, I can think of a couple of simple-minded approaches that would 
> take some time - either relying on direct string-matching for the unique 
> coordinates, or by some contrived overlay.
> 
> However, there are probably far better approaches - a couple of questions:
> 
> Can you predefine the set of all unique coordinates without reading all 
> the tables from file? 
>  - if so you might simplify the identification of each individual 
> coordinate, for matching the records
> 
> Are the coordinates (intended to be) on a regular grid?  (This seems 
> unlikely, although it is nearly true given your X coordinates).

The key question is what the data are. To me they look like a global 
regular grid with some slippage in the print() output - the underlying 
diff() of the unique x's and y's is almost certainly constant. I'm not 
sure why they are in text files either (model output?). But some cells of 
the grid may be missing, the question being whether the pattern of missing 
cells is regular. If, as an earlier response indicated, different data 
sets have different grid cells missing, then we need the overall grid to 
start with, then grab the row and column indices (and/or grid index) of 
each cell, and attach these to the data rows, so that the tables can be 
matched on the index rather than on the printed coordinates.
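
As a rough illustration of the grid-index idea, here is a minimal base R 
sketch. It assumes the pooled coordinates really do lie on a regular grid 
whose spacing can be estimated as the median diff() of the unique values, 
and it uses two tiny made-up tables in place of the real files. The names 
tabs, grid_index and merged are hypothetical, not taken from any package.

```r
# Two tiny stand-in tables (made-up values); the real ones would come from
# read.csv() on each of the text files.
t1 <- data.frame(XCOORD = c(27.475, 27.500, 27.525, 27.475),
                 YCOORD = c(42.50, 42.50, 42.50, 42.55),
                 OBSERVATION = c(177, 179, 180, 181))
t2 <- data.frame(XCOORD = c(27.475, 27.525, 27.500),
                 YCOORD = c(42.50, 42.50, 42.55),
                 OBSERVATION = c(233, 178, 123))
tabs <- list(t1, t2)

# Overall grid: pool the unique coordinates over all tables.
# Rounding to 5 decimals is an assumption based on the printed precision.
xs <- sort(unique(round(unlist(lapply(tabs, `[[`, "XCOORD")), 5)))
ys <- sort(unique(round(unlist(lapply(tabs, `[[`, "YCOORD")), 5)))
x0 <- min(xs); y0 <- min(ys)
dx <- median(diff(xs))   # assumed constant grid spacing in x
dy <- median(diff(ys))   # assumed constant grid spacing in y
ncols <- round((max(xs) - x0) / dx) + 1
nrows <- round((max(ys) - y0) / dy) + 1

# Map coordinates to a single cell index (row-major, counting from 1).
# round() absorbs small slippage in the printed coordinates.
grid_index <- function(x, y) {
  col <- round((x - x0) / dx) + 1
  row <- round((y - y0) / dy) + 1
  (row - 1) * ncols + col
}

# One column per table, one row per grid cell; absent cells stay NA.
merged <- matrix(NA_real_, nrow = nrows * ncols, ncol = length(tabs))
colnames(merged) <- paste0("OBSERVATION", seq_along(tabs))
for (i in seq_along(tabs)) {
  idx <- grid_index(tabs[[i]]$XCOORD, tabs[[i]]$YCOORD)
  merged[idx, i] <- tabs[[i]]$OBSERVATION
}

# Rebuild the coordinates from the cell index and drop cells that are
# missing from every table.
cell <- seq_len(nrows * ncols)
result <- data.frame(XCOORD = x0 + ((cell - 1) %% ncols) * dx,
                     YCOORD = y0 + ((cell - 1) %/% ncols) * dy,
                     merged)
result <- result[rowSums(!is.na(merged)) > 0, ]
```

Matching on an integer index avoids comparing floating-point coordinates 
for equality, and a cell missing from one table simply stays NA in that 
table's column.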

If the solution needs to be robust, and to have longer-term utility, I 
would go for using MySQL, Terralib, and aRT. The data representation is 
that of the Terralib Cell object, so the question would be how to upload 
the text files to the database.

aRT is at:

http://www.est.ufpr.br/aRT/

By the way, 1M rows by 100 columns by 8 bytes is about 800 MB, which is 
pushing 32-bit R - but handing off a lot of the data storage to a database 
relieves this greatly.
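
The size estimate can be worked through in R as below. This is only a 
back-of-the-envelope sketch: the row and table counts are the approximate 
figures from the original post, and storing the observations as integers 
is an assumption based on the sample values shown there.

```r
n_cells  <- 1e6   # approximate rows per table (from the original post)
n_tables <- 100   # one observation column per table

bytes_double  <- n_cells * n_tables * 8   # R doubles take 8 bytes each
bytes_integer <- n_cells * n_tables * 4   # R integers take 4 bytes each

bytes_double / 1e6    # size in megabytes if stored as doubles
bytes_integer / 1e6   # size in megabytes if stored as integers
```

Either figure covers only one copy of the merged matrix; intermediate 
copies made while reading and merging come on top of it.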

Roger

> 
> Cheers, Mike.
> 
> 
> isidora k wrote:
> > Hi everyone!
> > I have 100 tables of the form:
> > XCOORD,YCOORD,OBSERVATION
> > 27.47500,42.52641,177
> > 27.48788,42.52641,177
> > 27.50075,42.52641,179
> > 27.51362,42.52641,178
> > 27.52650,42.52641,180
> > 27.53937,42.52641,178
> > 27.55225,42.52641,181
> > 27.56512,42.52641,177
> > 27.57800,42.52641,181
> > 27.59087,42.52641,181
> > 27.60375,42.52641,180
> > 27.61662,42.52641,181
> > ..., ..., ...
> > with approximately 1000000 observations for each. All
> > these tables have the same xcoord and ycoord and I
> > would like to get a table of the form
> > XCOORD,YCOORD,OBSERVATION1,OBSERVATION2,... 
> > 27.47500,42.52641,177,233,...
> > 27.48788,42.52641,177,345,...
> > 27.50075,42.52641,179,233,...
> > 27.51362,42.52641,178,123,...
> > 27.52650,42.52641,180,178,...
> > 27.53937,42.52641,178,...,...
> > 27.55225,42.52641,181,...
> > 27.56512,42.52641,177,...
> > 27.57800,42.52641,181,...
> > 27.59087,42.52641,181,...
> > 27.60375,42.52641,180,...
> > 27.61662,42.52641,181,...
> > In other words I would like to merge all the tables
> > taking into account the common row names of their
> > xcoords AND ycoords.
> > Not all tables have the same number of observations
> > which means that not all pairs of x and y coords
> > match.
> > Is there a way to do this in R?
> > I would be grateful for any advice.
> > Many thanks
> > Isidora
> >
> > _______________________________________________
> > R-sig-Geo mailing list
> > R-sig-Geo at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >
> >
> >
> 
> 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no



